Page cover

Collaborative Filtering

Consider a list of movies as "items". Each movie has an associated rating which we will call "score". With this, we write the following denotion:

s(j)=scoreforitemjwhere,j=1...M,ifthereareMitems.s(j) = score \hspace{0.1cm} for \hspace{0.1cm} item \hspace{0.1cm} j \\ where, \hspace{0.1cm} j = 1 \hspace{0.1cm} ... \hspace{0.1cm} M, \hspace{0.1cm} if \hspace{0.1cm} there \hspace{0.1cm} are \hspace{0.1cm} M \hspace{0.1cm} items.

​Basic algorithm is to make average rating for item jj​:

s(j)=∑i∈Ωjri,j∣Ωj∣where,Ωj≡setofalluserswhorateditemjri,j≡rating,userigaveitemjs(j) = \frac{\sum_{i\isin\Omega_j} r_{i,j}}{|\Omega_j|} \\ where, \Omega_j \hspace{0.1cm} \equiv set \hspace{0.1cm} of \hspace{0.1cm} all \hspace{0.1cm} users \hspace{0.1cm} who \hspace{0.1cm} rated \hspace{0.1cm} item \hspace{0.1cm} j \\ r_{i,j} \equiv rating, \hspace{0.1cm} user \hspace{0.1cm} i \hspace{0.1cm} gave \hspace{0.1cm} item \hspace{0.1cm} j

​Score Personalisation

Score s(i,j)s(i, j) can depend on both user ii​ and item (e.g. movie) jj.

s(i,j)=∑i‘∈Ωjri‘j∣Ωj∣wherei=1...N,N≡No.ofuserss(i,j) = \frac{\sum_{i^` \isin \Omega_j} r_{i^`j}}{|\Omega_j|} \\ where \hspace{0.1cm} i = 1...N, \hspace{0.5cm} N \equiv No. \hspace{0.1cm} of \hspace{0.1cm} users

The above equation does not change anything from the equation to calculate the average rating as the score still does not depend on user ii​. So, every user still sees the same score for each item.

The above equation is actually meant to introduce a new convention and symbols which we would modify later on to introduce personalisation. We would observe this in an upcoming blog article.

Ratings Matrix

The central object for any recommender systems is the ratings matrix. It is defined as follows:

ri,j=ratinguserIgaveitemjwherei=1...N,j=1...MRN×M≡user−itemratingsmatrixofsizeN×Mr_{i,j} = rating \hspace{0.1cm} user \hspace{0.1cm} I \hspace{0.1cm} gave \hspace{0.1cm} item \hspace{0.1cm} j \\ where \hspace{0.1cm} i = 1...N, \hspace{0.1cm} j = 1...M \\ R_{N \times M} \equiv user-item \hspace{0.1cm} ratings \hspace{0.1cm} matrix \hspace{0.1cm} of \hspace{0.1cm} size \hspace{0.1cm} N \times M

Sparsity

User-item ratings matrix is sparse because most entries are generally empty. Below is an example of a dummy ratings matrix. Each row represents a user and each column represents an item (e.g. movie).

[∗4.52.0∗4.0∗3.5∗∗5.0∗2.0∗3.54.01.0]\begin{bmatrix} * & 4.5 & 2.0 & *\\ 4.0 & * & 3.5 & *\\ * & 5.0 & * & 2.0\\ * & 3.5 & 4.0 & 1.0 \end{bmatrix}

∗* marked entries in the above matrix mean, there are no ratings (score) for the specific user-item pair.

Collaborative Filtering Goal

We want to make recommendations. Thus, its a good thing that the ratings matrix generally is sparse, in order to actually have items to recommend.

We want to guess what a user might rate a movie, the user has not seen yet. Thus,

s(i,j)≡rˆ(i,j)≡guesswhatuserimightrateitemjs(i,j) \equiv \^{r}(i,j) \equiv guess \hspace{0.1cm} what \hspace{0.1cm} user \hspace{0.1cm} i \hspace{0.1cm} might \hspace{0.1cm} rate \hspace{0.1cm} item \hspace{0.1cm} j

​We want to predict a real number, so objective of this task would be "mean squared error" (MSE).

MSE=1∣Ω∣∑i,j∈Ω(rij−rˆij)2where,Ω≡Setofpairs(i,j)whereuserihasrateditemj.MSE = \frac{1}{|\Omega|} \sum_{i,j \isin \Omega} (r_{ij} - \^{r}_{ij})^2 \\ where, \Omega \equiv Set \hspace{0.1cm} of \hspace{0.1cm} pairs \hspace{0.1cm} (i,j) \hspace{0.1cm} where \hspace{0.1cm} user \hspace{0.1cm} i \hspace{0.1cm} has \hspace{0.1cm} rated \hspace{0.1cm} item \hspace{0.1cm} j.

Collborative Filtering Types

There are 2 different ways to apply collaborative filtering for making recommendations:

  • User-User Collaborative Filtering

  • Item-Item Collaborative Filtering

Last updated