RecSys: An Introduction
Last updated
Last updated
Recommender System is a sytem that predicts ratings or preferences a user might give to an item. Often these are sorted and presented as “top-N” recommendations. There are many flavors of recommenders:
Recommending things
Recommending content
Recommending music
Recommending people
Recommending search results
In general, the goal recommender systems is understanding the end customer of a website or a network of websites that share data with each other. This can be done via implict or explicit ratings or feedback from the customer.
Explicit feedback or ratings can be asking users to rate content (e.g. movies, music) on a scale of 1 to 5 stars or rating content with a thumbs up or a thumbs down. In these cases, we are explicitly asking users about their taste for a particulat content or thing.
Another way to understand user taste is through implicit feedback. For example, if a user clicks on a webpage, that can be interpreted as implicit positive rating for that webpage link. Things or content a user purchases are also a much better indications of positive implicit rating. Finally, another source of implicit data is things a user consumes. For example, Youtube can look up how many minutes a user ends up watching a video as an indication to how much the user liked it.
All the flavors of recommender systems mentioned in the above section are top-N recommenders. Their job is to produce a finite list of the best things to present to a given user.
A lot of recommender system research tends to focus on the problem of predicting a user's ratings for everything they haven't rated already, good or bad, but that's very different from what recommender systems need to do in the real world. Usually, users don't care about your ability to predict how they'll rate some new item. Customers don't want to see your ability to predict their rating for an item. They just want to see things they're likely to love. Thus, ultimately our goal is to put the best content we can find in front of users in the form of a top-N list.
Below is one way how a top-N recommender system might work. Do note though there are many other ways to build a top-N recommender system.
Generally, you start with some data store representing the individual interests of each user. For example, their ratings for movies that they've seen, or implicit ratings such as the stuff they've bought in the past.
In practice this is usually a big, distributed NoSQL data store like Cassandra or MongoDB or memcached etc, because it has to vend lots of data but with very simple queries. Ideally, this interest data is normalized using techniques such as mean centering or Z scores to ensure that the data is comparable between users. But, in the real world your data is often too sparse to normalize it effectively.
The first step is to generate recommendation candidates, items we think might be interesting to the user based on their past behavior.
So, the candidate generation phase might take all of the items a user indicated interest in before, and consult another data store of items that are similar to those items based on aggregate behavior.
For example, let's say you're making recommendations for a user. One might consult the user's database of individual interests and see that the user has liked Star Trek stuff in the past. Based on everyone else's behavior, we come to know that people who like Star Trek also like Star Wars. So, based on the user's interest in Star Trek, the user might get some recommendation candidates that include Star Wars stuff.
In the process of building up those recommendations, one might assign scores to each candidate based on how the user rated the items they came from and how strong the similarities are between the item and the candidates that came from them. One might even filter out candidates at this stage if the score isn't high enough.
Next, we move to candidate ranking. Many candidates will appear more than once and need to be combined together in some way, maybe boosting their score in the process since they keep coming up repeatedly. After that, it can just be a matter of sorting the resulting recommendation candidates by score to get our first cut at a top-N list of recommendations. Although much more complicated approaches exist, such as learning to rank where machine learning is employed to find the optimal ranking of candidates at this stage.
This ranking stage might also have access to more information about the recommendation candidates that it can use, such as average review scores, that can be used to boost results for highly-rated or popular items for example.
Some filtering will be required before presenting the final, sorted list of recommendation candidates to the user. This filtering stage is where we might eliminate recommendations for items the user has already rated, since we don't want to recommend things the user has already seen.
We might also apply a stop-list here to remove items that are potentially offensive to the user, or remove items that are below some minimum quality score or minimum rating threshold.
It's also where we apply the N in top-N recommenders, and cut things off if we have more results than we need. The output of the filtering stage is then handed off to the display layer where a pretty widget of product recommendations is presented to the user.
Generally speaking, the candidate generation, ranking, and filtering will live inside some distributed recommendation web service that the web front-end talks to in the process of rendering a page for a specific user.
This diagram above is a simplified version of what is called item-based collaborative filtering.
Another architecture that's popular with researchers, is to build up a database ahead of time of predicted ratings of every item by every user.
The candidate generation phase is then just retrieving all of the rating predictions for a given user for every item, and ranking is just a matter of sorting them. This requires though to look at every single item in the catalog for every single user. Thus, this isn't very efficient at runtime.
If one has a small catalog of items to recommend, however, this approach isn't entirely unreasonable.