Consider a list of movies as "items". Each movie has an associated rating, which we will call its "score". With this, we write the following notation:
s(j) = \text{score for item } j
\\
\text{where } j = 1 \dots M, \text{ if there are } M \text{ items}

The most basic algorithm is to use the average rating for item j as its score:
s(j) = \frac{\sum_{i \in \Omega_j} r_{i,j}}{|\Omega_j|}
\\
\text{where } \Omega_j \equiv \text{set of all users who rated item } j
\\
r_{i,j} \equiv \text{rating user } i \text{ gave item } j
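As a quick illustration, here is a minimal sketch of this computation in Python. The `ratings` dictionary (mapping `(user, item)` pairs to rating values) and the function name `average_score` are illustrative assumptions, not part of any particular library.

```python
def average_score(ratings, j):
    """Average rating for item j over all users who rated it."""
    # Ratings r_{i,j} for users i in Omega_j (everyone who rated item j)
    omega_j = [r for (i, item), r in ratings.items() if item == j]
    return sum(omega_j) / len(omega_j)  # divide by |Omega_j|

# Example: three users rated item 0, one user rated item 1
ratings = {(0, 0): 4.0, (1, 0): 5.0, (2, 0): 3.0, (1, 1): 2.0}
print(average_score(ratings, 0))  # (4.0 + 5.0 + 3.0) / 3 = 4.0
```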
Score Personalisation
The score s(i, j) can depend on both the user i and the item (e.g. movie) j.
s(i,j) = \frac{\sum_{i' \in \Omega_j} r_{i',j}}{|\Omega_j|}
\\
\text{where } i = 1 \dots N, \quad N \equiv \text{number of users}

The above equation does not change anything relative to the average-rating equation, as the score still does not depend on user i. So, every user still sees the same score for each item.
The above equation is actually meant to introduce a new convention and symbols, which we will modify later on to introduce personalisation; we will see this in an upcoming blog article.
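To make this concrete, here is a small sketch of the new signature, assuming the same dictionary representation as above: `score` now accepts a user index i but ignores it, so every user receives the same value.

```python
def score(ratings, i, j):
    """s(i, j): accepts the user index i but ignores it for now,
    so every user sees the same average score for item j."""
    omega_j = [r for (u, item), r in ratings.items() if item == j]
    return sum(omega_j) / len(omega_j)

ratings = {(0, 0): 4.0, (1, 0): 5.0, (2, 0): 3.0}
# Two different users receive an identical score for item 0
assert score(ratings, i=0, j=0) == score(ratings, i=7, j=0)
```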
Ratings Matrix
The central object for any recommender system is the ratings matrix. It is defined as follows:
r_{i,j} = \text{rating user } i \text{ gave item } j
\\
\text{where } i = 1 \dots N, \; j = 1 \dots M
\\
R_{N \times M} \equiv \text{user-item ratings matrix of size } N \times M

Sparsity
The user-item ratings matrix is sparse because most of its entries are empty. Below is an example of a dummy ratings matrix, where each row represents a user and each column represents an item (e.g. movie).
\begin{bmatrix}
* & 4.5 & 2.0 & *\\
4.0 & * & 3.5 & *\\
* & 5.0 & * & 2.0\\
* & 3.5 & 4.0 & 1.0
\end{bmatrix}

Entries marked * mean there is no rating (score) for that specific user-item pair.
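Since most entries are missing, storing the full dense matrix wastes memory, so in practice such a matrix is usually kept in a sparse format. Below is a minimal sketch of the dummy matrix above using scipy's `csr_matrix`; the variable names are illustrative.

```python
from scipy.sparse import csr_matrix

# Observed entries of the 4x4 dummy matrix above;
# the missing (*) entries are simply not stored.
rows = [0, 0, 1, 1, 2, 2, 3, 3, 3]
cols = [1, 2, 0, 2, 1, 3, 1, 2, 3]
vals = [4.5, 2.0, 4.0, 3.5, 5.0, 2.0, 3.5, 4.0, 1.0]

R = csr_matrix((vals, (rows, cols)), shape=(4, 4))
print(R.toarray())  # dense view; zeros stand in for missing entries
print(R.nnz)        # 9 observed ratings out of 16 cells
```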
Collaborative Filtering Goal
We want to make recommendations. Thus, it's a good thing that the ratings matrix is generally sparse, so that there are actually items left to recommend.
We want to guess what rating a user might give to a movie they have not yet seen. Thus,
s(i,j) \equiv \hat{r}(i,j) \equiv \text{guess what user } i \text{ might rate item } j

We want to predict a real number, so a suitable objective for this task is the mean squared error (MSE):
MSE = \frac{1}{|\Omega|} \sum_{(i,j) \in \Omega} (r_{ij} - \hat{r}_{ij})^2
\\
\text{where } \Omega \equiv \text{set of pairs } (i,j) \text{ where user } i \text{ has rated item } j
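As a minimal sketch, assuming the observed ratings and the model's predictions are both stored as dictionaries keyed by (i, j) pairs (an illustrative representation, not a fixed API), the MSE can be computed as:

```python
def mse(ratings, predictions):
    """Mean squared error over Omega, the set of observed (i, j) pairs.

    ratings:     dict mapping (i, j) -> true rating r_ij
    predictions: dict mapping (i, j) -> predicted rating r_hat_ij
    """
    errors = [(r - predictions[(i, j)]) ** 2 for (i, j), r in ratings.items()]
    return sum(errors) / len(errors)  # divide by |Omega|

ratings = {(0, 1): 4.5, (1, 0): 4.0}
predictions = {(0, 1): 4.0, (1, 0): 3.0}
print(mse(ratings, predictions))  # (0.25 + 1.0) / 2 = 0.625
```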
Collaborative Filtering Types
There are two different ways to apply collaborative filtering for making recommendations:
User-User Collaborative Filtering
Item-Item Collaborative Filtering