Date: March, 13th, 11h00

Place: Inria Lille-Nord Europe

Abstract: We have all experimented recommender systems on the web, whether itβs for choosing a movie, a book, a restaurant, or a holiday destination. Roughly speaking, a recommender system is a platform that seeks to predict the rating or preference that a user would give to an item.

More formally, consider a set πΌ of items and a set π΄ of Boolean attributes. A Boolean vector ππ with |π΄| components can be associated with every item π in πΌ so that the π-th component of ππ equals 1 if and only if the π-th attribute in π΄ is true for item π. For example, if πΌ is a set of restaurants and the first attribute is βvegetarianβ, then the first component of vector ππ associated with restaurant π in πΌ equals 1 if and only if π is a restaurant offering vegetarian food.

Consider now a set π of users. A Boolean vector ππ’ with |A| components can be associated with every user π’ in π so that the π-th component of ππ’ equals 1 if and only if user π’ has interest for the π-th attribute. In our example, the first component of vector ππ’ associated with a user π’ in π equals 1 if and only if user π’ has interest for vegetarian restaurants. While these vectors ππ’ are not known, we show how resolving sets can predict them, which makes it to possible to recommend to user π’ all items π with a vector ππ closest to ππ’.

We say that a vector π resolves two vectors b and c if the Hamming distance between π and b is different from that between π and c. A subset πΌβ² of items is a resolving set if every two distinct vectors ππ’ and ππ’β² are resolved by at least one vector ππ with π in πΌβ².

Given a resolving set πΌβ²={π1,β¦, ππ} with π items, we can associate a vector ππ’ with π integer components to every user π’ so that the π-th component of ππ’ is the Hamming distance between πππ and ππ’. As a consequence, given two users π’ and π’β², ππ’ not equal to ππ’β² if and only if ππ’ not equal to ππ’β². In other words, it is possible to differentiate between two users with distinct preferences on the basis of their distances to the vectors ππ with π in πΌβ².

We show in this talk that it is easy to determine resolving sets of small size. For example, for 20 attributes (which allows the classification of the items in more than one million categories), resolving sets with 12 items can be determined. In other words, it is sufficient to determine the 12 components of ππ’ for every user π’ to predict the preference vector ππ’ of user π’ among more than one million item types.