Article | MrMineev

§

Collaborative filtering

You can describe the basic concept of collaborative filtering in one sentence:

Past agreement leads to future alignment.

This means that if two people liked the same items in the past, they will probably like the same items in the future. The algorithm stores each user's rating of each item (0 if the user hasn't rated the item). To recommend items to user A, we find other users, such as B and C, who are similar to A, and recommend to A what B and C liked. By "similar" I mean that there is some function simil(A, B) that measures the similarity between users A and B. The predicted rating of user u for the i-th item is then an aggregation of the ratings that similar users gave to that item:

`r_(u, i) = aggr_{u' ∈ U} (r_(u', i))`

There are many possible choices for both the aggr and the simil functions. For similarity, the Pearson correlation and the vector cosine-based similarity are among the most popular, so I will only define these two:

1) Cosine Similarity

`S_(C)(A, B) = (A * B) / (||A|| * ||B||) = (sum_(i=1)^n A_(i)B_(i))/sqrt((sum_(i=1)^n A_(i)^2) * (sum_(i=1)^n B_(i)^2))`

2) Pearson correlation

`ρ_(A, B) = (sum_(i=1)^n (A_(i) - bar A)(B_(i) - bar B)) / (sqrt(sum_(i=1)^n (A_(i) - bar A)^2) sqrt(sum_(i=1)^n (B_(i) - bar B)^2))`
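The two formulas above can be sketched in plain Python as follows. This is only a minimal illustration, not the code from the linked repository: the function names `cosine_similarity` and `pearson` are my own, and returning 0.0 when a vector has zero norm or zero variance is an assumption made to avoid dividing by zero.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an all-zero vector is similar to nothing
    return dot / (norm_a * norm_b)

def pearson(a, b):
    """Pearson correlation of two equal-length rating vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # assumption: a constant vector has no defined correlation
    return cov / (math.sqrt(var_a) * math.sqrt(var_b))

print(cosine_similarity([1, 0], [0, 1]))  # orthogonal vectors -> 0.0
print(pearson([1, 2, 3], [3, 2, 1]))      # perfectly opposed, close to -1
```

Note that the Pearson correlation is the cosine similarity of the two vectors after subtracting each vector's mean, which is why it ranges from -1 to 1 while the cosine similarity of non-negative ratings stays between 0 and 1.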

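Here A and B are the rating vectors r_x and r_y of two users x and y, n is the number of items, and bar A and bar B are the means of their entries. To keep everything simple, I will be using cosine similarity. Let's go through an example to see how collaborative filtering works. In this example, ratings range from 1 to 10, and 0 means that the user hasn't rated the item.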

	User 1	User 2	User 3	User 4	User 5	User 6	User 7	User 8	User 9	User 10
Item 1	5	2	0	0	9	5	2	0	0	9
Item 2	10	0	7	2	1	10	0	7	2	1
Item 3	0	0	5	8	0	0	0	5	8	0
Item 4	1	9	8	0	0	1	9	8	0	0
Item 5	0	6	0	0	10	0	6	0	0	10
Item 6	10	10	4	0	8	10	10	4	0	8
Item 7	0	3	2	5	1	0	3	2	5	1
Item 8	0	3	2	5	1	0	3	2	5	1
Item 9	0	3	2	5	1	0	3	2	5	1
Item 10	0	3	2	5	1	0	3	2	5	1
Item 11	0	3	2	5	1	0	3	2	5	1
Item 12	0	3	2	5	1	0	3	2	5	1

The goal is to recommend some items to User 1. First, calculate the cosine similarity between User 1 and every user.
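As a rough sketch of this step, the snippet below copies the ratings table into a dictionary (one list per user, Item 1 to Item 12) and computes each user's cosine similarity to User 1. The names `ratings`, `cosine_similarity` and `sims` are my own choices for illustration.

```python
import math

# Ratings from the table above, one list per user (Item 1 .. Item 12).
# 0 means "not rated".
ratings = {
    "User 1": [5, 10, 0, 1, 0, 10, 0, 0, 0, 0, 0, 0],
    "User 2": [2, 0, 0, 9, 6, 10, 3, 3, 3, 3, 3, 3],
    "User 3": [0, 7, 5, 8, 0, 4, 2, 2, 2, 2, 2, 2],
    "User 4": [0, 2, 8, 0, 0, 0, 5, 5, 5, 5, 5, 5],
    "User 5": [9, 1, 0, 0, 10, 8, 1, 1, 1, 1, 1, 1],
    "User 6": [5, 10, 0, 1, 0, 10, 0, 0, 0, 0, 0, 0],
    "User 7": [2, 0, 0, 9, 6, 10, 3, 3, 3, 3, 3, 3],
    "User 8": [0, 7, 5, 8, 0, 4, 2, 2, 2, 2, 2, 2],
    "User 9": [0, 2, 8, 0, 0, 0, 5, 5, 5, 5, 5, 5],
    "User 10": [9, 1, 0, 0, 10, 8, 1, 1, 1, 1, 1, 1],
}

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Similarity of every user (including User 1 itself) to User 1.
sims = {user: cosine_similarity(ratings["User 1"], vec)
        for user, vec in ratings.items()}

for user, s in sims.items():
    print(f"{user}\t{s:.3f}")
```

Because unrated items are stored as 0, they contribute nothing to the dot product but are still part of each vector; that follows the storage convention described earlier rather than any special handling of missing ratings.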

	Similarity to User 1 (cosine, rounded to 3 decimals)
User 1	1.000
User 2	0.477
User 3	0.588
User 4	0.090
User 5	0.566
User 6	1.000
User 7	0.477
User 8	0.588
User 9	0.090
User 10	0.566

Here 1 means complete similarity and 0 means no similarity. How can we use this information to predict r_(u, i)? One common function for doing so is:

`r_(u, i) = 1 / N sum_(u' ∈ U) r_(u', i)`

Here U denotes the set of the N users most similar to u.
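The averaging formula can be sketched like this. It is a minimal illustration under a few assumptions of mine: the target user is excluded from its own neighbourhood, ties in similarity are broken by the order of the users in the table, and unrated items count as 0 in the average, exactly as the formula is written. The helper name `predict` is hypothetical.

```python
import math

# Ratings of users 1-5 from the example table (Item 1 .. Item 12).
ratings = {
    "User 1": [5, 10, 0, 1, 0, 10, 0, 0, 0, 0, 0, 0],
    "User 2": [2, 0, 0, 9, 6, 10, 3, 3, 3, 3, 3, 3],
    "User 3": [0, 7, 5, 8, 0, 4, 2, 2, 2, 2, 2, 2],
    "User 4": [0, 2, 8, 0, 0, 0, 5, 5, 5, 5, 5, 5],
    "User 5": [9, 1, 0, 0, 10, 8, 1, 1, 1, 1, 1, 1],
}
# In the table, users 6-10 repeat the ratings of users 1-5.
for k in range(1, 6):
    ratings[f"User {k + 5}"] = list(ratings[f"User {k}"])

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def predict(ratings, target, item_index, n):
    """Average rating of item `item_index` over the n users most similar to `target`."""
    # Assumption: the target user is not its own neighbour; n must be >= 1.
    others = [u for u in ratings if u != target]
    others.sort(key=lambda u: cosine_similarity(ratings[target], ratings[u]),
                reverse=True)
    top = others[:n]
    return sum(ratings[u][item_index] for u in top) / len(top)

# Predicted rating of Item 3 (index 2) for User 1 with N = 3.
print(predict(ratings, "User 1", 2, 3))
```

With N = 3 the neighbours of User 1 are User 6, User 3 and User 8, so the prediction for Item 3 is (0 + 5 + 5) / 3, roughly 3.33. A similarity-weighted average is a common alternative, but the plain mean is what the formula above describes.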

The predictions for User 1, one per item from Item 1 to Item 12, depend on the chosen value of N, which can be anywhere from 1 to 10 in this example.

Now, to recommend items, just pick the items with the highest predicted ratings that the user hasn't rated yet.
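This last step could look roughly like the sketch below. The function name `recommend` and the prediction values are hypothetical and only there to show the selection logic; the ratings in `own` are User 1's ratings of the first five items from the table.

```python
def recommend(predictions, own_ratings, k=3):
    """Return up to k items the user hasn't rated, highest prediction first."""
    unseen = [item for item, rating in own_ratings.items() if rating == 0]
    unseen.sort(key=lambda item: predictions[item], reverse=True)
    return unseen[:k]

# User 1's own ratings for the first five items (0 means "not rated").
own = {"Item 1": 5, "Item 2": 10, "Item 3": 0, "Item 4": 1, "Item 5": 0}

# Hypothetical predicted ratings, for illustration only.
pred = {"Item 1": 1.7, "Item 2": 8.0, "Item 3": 3.3, "Item 4": 5.7, "Item 5": 0.0}

print(recommend(pred, own, k=1))  # ['Item 3']
```

Items the user has already rated are filtered out first, so a high prediction for Item 2 does not crowd out items the user has not seen.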

§

Links

https://github.com/MrMineev/Different-Recommendation-Systems

https://en.wikipedia.org/wiki/Cosine_similarity

https://en.wikipedia.org/wiki/Pearson_correlation_coefficient
