
MrMineev

🔍 Recommendation Systems

§

Collaborative filtering

You can describe the basic concept of collaborative filtering in one sentence:

Past agreement leads to future alignment.

This means that if two people liked the same items several times in the past, they will probably like the same items in the future. The algorithm stores the rating each user gave to each item (0 if the user didn't rate the item). To recommend items to user A, we find other users, such as B and C, who are similar to A, and recommend to A what B and C liked. By "similar" I mean that there is some function simil(A, B) that measures the similarity between users A and B. The predicted rating of user u for the i-th item is then an aggregation of the ratings that similar users gave to that item:

`r_(u, i) = aggr_{u' ∈ U} (r_(u', i))`

There are many possible choices for the similarity function simil. The Pearson correlation and the cosine similarity of the rating vectors are the most popular, so I will only define these two:

1) Cosine Similarity

`S_(C)(A, B) = (A * B) / (||A|| * ||B||) = (sum_(i=1)^n A_(i)B_(i))/sqrt((sum_(i=1)^n A_(i)^2) * (sum_(i=1)^n B_(i)^2))`

2) Pearson correlation

`ρ_(A, B) = (sum_(i=1)^n (A_(i) - bar A)(B_(i) - bar B)) / (sqrt(sum_(i=1)^n (A_(i) - bar A)^2) sqrt(sum_(i=1)^n (B_(i) - bar B)^2))`
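The two formulas above can be sketched in plain Python as follows. This is only a minimal illustration, not the code from the linked repository: the function names are my own, and returning 0 for a vector with zero length or zero variance is an assumption made to avoid dividing by zero.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an all-zero vector is similar to nothing
    return dot / (norm_a * norm_b)

def pearson_correlation(a, b):
    """Pearson correlation coefficient of two equal-length rating vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    dev_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    dev_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    if dev_a == 0 or dev_b == 0:
        return 0.0  # assumption: a constant vector has no defined correlation
    return cov / (dev_a * dev_b)

print(cosine_similarity([1, 2, 3], [2, 4, 6]))    # parallel vectors, close to 1
print(pearson_correlation([1, 2, 3], [3, 2, 1]))  # opposite trends, close to -1
```

Note that the Pearson correlation is the cosine similarity of the two vectors after subtracting each vector's mean, which is why the two functions look so alike.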

Here A and B are the rating vectors `r_x` and `r_y` of two users x and y. To keep everything simple, I will use the cosine similarity. Let's go through an example to see how collaborative filtering works. In this example, ratings range from 1 to 10, and 0 marks an item the user has not rated.


User 1 User 2 User 3 User 4 User 5 User 6 User 7 User 8 User 9 User 10
Item 1 5 2 0 0 9 5 2 0 0 9
Item 2 10 0 7 2 1 10 0 7 2 1
Item 3 0 0 5 8 0 0 0 5 8 0
Item 4 1 9 8 0 0 1 9 8 0 0
Item 5 0 6 0 0 10 0 6 0 0 10
Item 6 10 10 4 0 8 10 10 4 0 8
Item 7 0 3 2 5 1 0 3 2 5 1
Item 8 0 3 2 5 1 0 3 2 5 1
Item 9 0 3 2 5 1 0 3 2 5 1
Item 10 0 3 2 5 1 0 3 2 5 1
Item 11 0 3 2 5 1 0 3 2 5 1
Item 12 0 3 2 5 1 0 3 2 5 1

The goal is to recommend some items to user 1. First, calculate the cosine similarity between user 1 and every user.
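This step could look roughly like the sketch below, which copies the ratings from the table above and compares every user's column with user 1's. The names `RATINGS_BY_ITEM`, `users` and `similarities` are my own, and I am assuming that the 0 entries are kept in the vectors as ordinary values, which is what the cosine formula does if applied literally.

```python
import math

# Ratings from the example table: one row per item, one column per user (0 = not rated).
RATINGS_BY_ITEM = [
    [5, 2, 0, 0, 9, 5, 2, 0, 0, 9],      # Item 1
    [10, 0, 7, 2, 1, 10, 0, 7, 2, 1],    # Item 2
    [0, 0, 5, 8, 0, 0, 0, 5, 8, 0],      # Item 3
    [1, 9, 8, 0, 0, 1, 9, 8, 0, 0],      # Item 4
    [0, 6, 0, 0, 10, 0, 6, 0, 0, 10],    # Item 5
    [10, 10, 4, 0, 8, 10, 10, 4, 0, 8],  # Item 6
] + [[0, 3, 2, 5, 1, 0, 3, 2, 5, 1] for _ in range(6)]  # Items 7-12 are identical

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an all-zero vector is similar to nothing
    return dot / (norm_a * norm_b)

# Transpose the table so that each entry of `users` is one user's rating vector.
users = [list(column) for column in zip(*RATINGS_BY_ITEM)]

# Similarity of every user (including user 1 itself) to user 1.
similarities = [cosine_similarity(users[0], u) for u in users]

for number, sim in enumerate(similarities, start=1):
    print(f"User {number}: {sim:.3f}")
```

Since user 6 has exactly the same ratings as user 1 in this data set, their similarity comes out as 1 (up to floating-point rounding), just like the similarity of user 1 to itself.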

| | Similarity to User 1 |
|---|---|
| User 1 | UNKNOWN |
| User 2 | UNKNOWN |
| User 3 | UNKNOWN |
| User 4 | UNKNOWN |
| User 5 | UNKNOWN |
| User 6 | UNKNOWN |
| User 7 | UNKNOWN |
| User 8 | UNKNOWN |
| User 9 | UNKNOWN |
| User 10 | UNKNOWN |


Here 1 represents complete similarity and 0 no similarity. How can we use this information to predict `r_(u, i)`? One popular function for doing so is:

`r_(u, i) = 1 / N sum_(u' ∈ U) r_(u', i)`

where U denotes the set of the N users most similar to u.
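The averaging formula above might be implemented along these lines. Several things here are assumptions on my part rather than something the formula fixes: the helper name `predict_ratings`, the choice N = 3, leaving user u out of their own neighbourhood, and averaging the 0 entries as if they were real ratings.

```python
import math

# Ratings from the example table: one row per item, one column per user (0 = not rated).
RATINGS_BY_ITEM = [
    [5, 2, 0, 0, 9, 5, 2, 0, 0, 9],      # Item 1
    [10, 0, 7, 2, 1, 10, 0, 7, 2, 1],    # Item 2
    [0, 0, 5, 8, 0, 0, 0, 5, 8, 0],      # Item 3
    [1, 9, 8, 0, 0, 1, 9, 8, 0, 0],      # Item 4
    [0, 6, 0, 0, 10, 0, 6, 0, 0, 10],    # Item 5
    [10, 10, 4, 0, 8, 10, 10, 4, 0, 8],  # Item 6
] + [[0, 3, 2, 5, 1, 0, 3, 2, 5, 1] for _ in range(6)]  # Items 7-12 are identical

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an all-zero vector is similar to nothing
    return dot / (norm_a * norm_b)

def predict_ratings(ratings_by_item, target, n):
    """Average the ratings of the n users most similar to `target` (0-based index)."""
    users = [list(column) for column in zip(*ratings_by_item)]
    sims = [cosine_similarity(users[target], u) for u in users]
    # Assumption: the target user is not counted among their own neighbours.
    others = [j for j in range(len(users)) if j != target]
    top_users = sorted(others, key=lambda j: sims[j], reverse=True)[:n]
    # Assumption: a 0 (not rated) is averaged like any other rating.
    predictions = [
        sum(row[j] for j in top_users) / len(top_users) for row in ratings_by_item
    ]
    return top_users, predictions

top_users, predictions = predict_ratings(RATINGS_BY_ITEM, target=0, n=3)
print("Neighbours of user 1:", [j + 1 for j in top_users])
for number, value in enumerate(predictions, start=1):
    print(f"Item {number}: {value:.2f}")
```

Because unrated items count as 0 here, they pull the average down. A variant that averages only over neighbours who actually rated the item would give different numbers, so treat the output as one possible reading of the formula.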






| | Predictions for User 1 |
|---|---|
| Item 1 | UNKNOWN |
| Item 2 | UNKNOWN |
| Item 3 | UNKNOWN |
| Item 4 | UNKNOWN |
| Item 5 | UNKNOWN |
| Item 6 | UNKNOWN |
| Item 7 | UNKNOWN |
| Item 8 | UNKNOWN |
| Item 9 | UNKNOWN |
| Item 10 | UNKNOWN |
| Item 11 | UNKNOWN |
| Item 12 | UNKNOWN |


Now, to recommend items, pick the items with the highest predicted ratings among those the user has not rated yet.
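As a small sketch of this last step: the function below keeps only the items the user rated 0 and returns the k with the highest predictions. The name `recommend`, the parameter `k`, and the prediction values in the example are all hypothetical. The predictions in particular are made-up numbers chosen only to exercise the function, not results from this article.

```python
def recommend(user_ratings, predictions, k=3):
    """Return the 1-based numbers of the k best-predicted items the user has not rated."""
    unrated = [i for i, rating in enumerate(user_ratings) if rating == 0]
    ranked = sorted(unrated, key=lambda i: predictions[i], reverse=True)
    return [i + 1 for i in ranked[:k]]

# User 1's column from the ratings table (0 = not rated).
user_1 = [5, 10, 0, 1, 0, 10, 0, 0, 0, 0, 0, 0]

# Hypothetical predicted ratings, invented purely for this example.
example_predictions = [4.0, 9.0, 6.5, 3.0, 7.2, 8.0, 1.0, 2.5, 0.5, 5.0, 1.5, 3.5]

print(recommend(user_1, example_predictions))  # [5, 3, 10]
```

Items 2 and 6 have the highest example predictions, but user 1 has already rated them, so they are skipped.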



§

Links


  https://github.com/MrMineev/Different-Recommendation-Systems

  https://en.wikipedia.org/wiki/Cosine_similarity

  https://en.wikipedia.org/wiki/Pearson_correlation_coefficient

