9 Collaborative Filtering

Collaborative filtering (CF) is a method for building recommender systems on big data. Common applications include Amazon product recommendations, Netflix movie and show recommendations, and iTunes music recommendations. Figures 9.1, 9.2, and 9.3 show examples of recommendations I received from these three services.

Figure 9.1: Amazon Recommendations

Figure 9.2: Netflix Recommendations

CF uses a user-item rating matrix, which contains the ratings that users have given to items. Beyond this, CF does not require other user information such as demographics (e.g., age, sex) or item information (e.g., movie genre, type of music). This is a key strength of CF: it relies on minimal information and doesn’t raise major privacy concerns, because it does not need any personal information about users beyond their previous behavior. CF also doesn’t rely on content analysis of the items, which makes it easy to build recommenders for applications that might otherwise require complex machine learning models. For instance, CF can be used to recommend jokes without actually knowing what those jokes are!
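To make the idea of a user-item rating matrix concrete, here is a minimal sketch in Python. The user names, joke identifiers, and the 1 to 5 rating scale are all made up for illustration; nothing about the items themselves is stored beyond an identifier.

```python
# Hypothetical ratings on a 1-5 scale: (user, item, rating).
# All names and values are invented for illustration.
ratings = [
    ("ana", "joke_1", 5),
    ("ana", "joke_2", 2),
    ("ben", "joke_1", 4),
    ("ben", "joke_3", 1),
    ("cho", "joke_2", 3),
    ("cho", "joke_3", 5),
]

users = sorted({user for user, _, _ in ratings})
items = sorted({item for _, item, _ in ratings})

# Rows are users, columns are items.
# None marks an item that the user has not rated.
matrix = {user: {item: None for item in items} for user in users}
for user, item, rating in ratings:
    matrix[user][item] = rating

# Print one row per user, with columns in the order of `items`.
for user in users:
    print(user, [matrix[user][item] for item in items])
```

Most entries of a real rating matrix are missing, because each user rates only a small fraction of the items. Those missing entries are the ones a recommender tries to predict.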

Figure 9.3: iTunes Recommendations

As the name suggests, the “collaborative” part means that the method relies on the behavior of other users with similar tastes. The user-item rating matrix does not explicitly show which users are similar or which items are similar. The objective of CF is then to identify these similarities and use them to make recommendations.
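One way to see how similarities between users can be recovered from ratings alone is the sketch below. It uses cosine similarity computed over the items both users have rated. This is only one of several possible similarity measures, and the choice here is an assumption for illustration, not necessarily the measure used later in this chapter. The users, dishes, and ratings are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two users over the items both have rated.

    `a` and `b` map item names to numeric ratings.
    Returns 0.0 when the users share no rated items.
    """
    shared = [item for item in a if item in b]
    if not shared:
        return 0.0
    dot = sum(a[item] * b[item] for item in shared)
    norm_a = math.sqrt(sum(a[item] ** 2 for item in shared))
    norm_b = math.sqrt(sum(b[item] ** 2 for item in shared))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical ratings on a 1-5 scale.
ratings = {
    "you": {"pizza": 5, "sushi": 1, "salad": 4},
    "friend_a": {"pizza": 5, "sushi": 2, "salad": 4, "burger": 5},
    "friend_b": {"pizza": 1, "sushi": 5, "salad": 2, "burger": 1},
}

sim_a = cosine_similarity(ratings["you"], ratings["friend_a"])
sim_b = cosine_similarity(ratings["you"], ratings["friend_b"])
print(round(sim_a, 3), round(sim_b, 3))
```

In this toy data, `friend_a` rates the shared dishes almost the same way as `you`, so `sim_a` comes out higher than `sim_b`. Note that the burger rating plays no role in the similarity, because `you` has not rated it.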

9.0.1 An example

Imagine that you and your friend A decide to meet for lunch. The two of you go to a restaurant where you have never eaten before, but A visits it often because it’s close to her office. You are seated at a table and the server leaves you with menus. You are unsure what to order, so you turn to your friend for suggestions. A recommends a new vegetarian burger called “Beyond burger”. How likely are you to try this burger if…

  1. You and your friend A have very similar preferences for food

  2. You and your friend A have not very similar preferences for food

I guess that you are more likely to order the Beyond burger in the first case, that is, when you and A have very similar food preferences. This is, in essence, what CF entails. However, rather than recommending an item based on only one other user’s experience, CF draws on many users to make a more reliable prediction. Think of it as an app that holds the food preferences of all your friends. This app can recommend food items to you by taking into account the evaluations of all the friends whose tastes are similar to yours.
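The app in this analogy can be sketched as a similarity-weighted average: each friend's rating of the burger counts in proportion to how similar that friend's taste is to yours. The function name `predict_rating`, the similarity scores, and the ratings below are hypothetical, and a weighted average is just one simple way to combine the evaluations.

```python
def predict_rating(neighbors):
    """Similarity-weighted average of friends' ratings for one item.

    `neighbors` is a list of (similarity, rating) pairs, one per friend
    who has rated the item. Friends with non-positive similarity are
    ignored. Returns None when no usable friend remains.
    """
    usable = [(sim, rating) for sim, rating in neighbors if sim > 0]
    total_weight = sum(sim for sim, _ in usable)
    if total_weight == 0:
        return None
    return sum(sim * rating for sim, rating in usable) / total_weight


# Hypothetical (similarity to you, rating of the burger on a 1-5 scale).
friends = [
    (0.9, 5),  # very similar taste, loved the burger
    (0.8, 4),  # similar taste, liked it
    (0.1, 1),  # quite different taste, disliked it
]

prediction = predict_rating(friends)
plain_mean = sum(rating for _, rating in friends) / len(friends)
print(round(prediction, 2), round(plain_mean, 2))
```

Because the two friends with similar taste liked the burger, the weighted prediction is higher than the plain average of the three ratings, which treats the dissimilar friend's opinion as equally informative.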