OPTIMA Seminar: Alain Hertz, Thursday, March 13th 2019 – How does the web know our preferences?

Date: March, 13th, 11h00

Place: Inria Lille-Nord Europe

Abstract: We have all experimented recommender systems on the web, whether it’s for choosing a movie, a book, a restaurant, or a holiday destination. Roughly speaking, a recommender system is a platform that seeks to predict the rating or preference that a user would give to an item.
More formally, consider a set 𝐼 of items and a set 𝐴 of Boolean attributes. A Boolean vector 𝒙𝑖 with |𝐴| components can be associated with every item 𝑖 in 𝐼 so that the 𝑗-th component of 𝒙𝑖 equals 1 if and only if the 𝑗-th attribute in 𝐴 is true for item 𝑖. For example, if 𝐼 is a set of restaurants and the first attribute is β€œvegetarian”, then the first component of vector 𝒙𝑖 associated with restaurant 𝑖 in 𝐼 equals 1 if and only if 𝑖 is a restaurant offering vegetarian food.
Consider now a set π‘ˆ of users. A Boolean vector π’šπ‘’ with |A| components can be associated with every user 𝑒 in π‘ˆ so that the 𝑗-th component of π’šπ‘’ equals 1 if and only if user 𝑒 has interest for the 𝑗-th attribute. In our example, the first component of vector π’šπ‘’ associated with a user 𝑒 in π‘ˆ equals 1 if and only if user 𝑒 has interest for vegetarian restaurants. While these vectors π’šπ‘’ are not known, we show how resolving sets can predict them, which makes it to possible to recommend to user 𝑒 all items 𝑖 with a vector 𝒙𝑖 closest to π’šπ‘’.
We say that a vector 𝒂 resolves two vectors b and c if the Hamming distance between 𝒂 and b is different from that between 𝒂 and c. A subset 𝐼′ of items is a resolving set if every two distinct vectors π’šπ‘’ and π’šπ‘’β€² are resolved by at least one vector 𝒙𝑖 with 𝑖 in 𝐼′.
Given a resolving set 𝐼′={𝑖1,…, π‘–π‘Ÿ} with π‘Ÿ items, we can associate a vector 𝒛𝑒 with π‘Ÿ integer components to every user 𝑒 so that the π‘˜-th component of 𝒛𝑒 is the Hamming distance between π’™π‘–π‘˜ and π’šπ‘’. As a consequence, given two users 𝑒 and 𝑒′, 𝒛𝑒 not equal to 𝒛𝑒′ if and only if π’šπ‘’ not equal to π’šπ‘’β€². In other words, it is possible to differentiate between two users with distinct preferences on the basis of their distances to the vectors 𝒙𝑖 with 𝑖 in 𝐼′.
We show in this talk that it is easy to determine resolving sets of small size. For example, for 20 attributes (which allows the classification of the items in more than one million categories), resolving sets with 12 items can be determined. In other words, it is sufficient to determine the 12 components of 𝒛𝑒 for every user 𝑒 to predict the preference vector π’šπ‘’ of user 𝑒 among more than one million item types.

Comments are closed.