Date: March, 13th, 11h00

Place: Inria Lille-Nord Europe

Abstract: We have all experimented recommender systems on the web, whether it’s for choosing a movie, a book, a restaurant, or a holiday destination. Roughly speaking, a recommender system is a platform that seeks to predict the rating or preference that a user would give to an item.

More formally, consider a set ? of items and a set ? of Boolean attributes. A Boolean vector ?? with |?| components can be associated with every item ? in ? so that the ?-th component of ?? equals 1 if and only if the ?-th attribute in ? is true for item ?. For example, if ? is a set of restaurants and the first attribute is “vegetarian”, then the first component of vector ?? associated with restaurant ? in ? equals 1 if and only if ? is a restaurant offering vegetarian food.

Consider now a set ? of users. A Boolean vector ?? with |A| components can be associated with every user ? in ? so that the ?-th component of ?? equals 1 if and only if user ? has interest for the ?-th attribute. In our example, the first component of vector ?? associated with a user ? in ? equals 1 if and only if user ? has interest for vegetarian restaurants. While these vectors ?? are not known, we show how resolving sets can predict them, which makes it to possible to recommend to user ? all items ? with a vector ?? closest to ??.

We say that a vector ? resolves two vectors b and c if the Hamming distance between ? and b is different from that between ? and c. A subset ?′ of items is a resolving set if every two distinct vectors ?? and ??′ are resolved by at least one vector ?? with ? in ?′.

Given a resolving set ?′={?1,…, ??} with ? items, we can associate a vector ?? with ? integer components to every user ? so that the ?-th component of ?? is the Hamming distance between ??? and ??. As a consequence, given two users ? and ?′, ?? not equal to ??′ if and only if ?? not equal to ??′. In other words, it is possible to differentiate between two users with distinct preferences on the basis of their distances to the vectors ?? with ? in ?′.

We show in this talk that it is easy to determine resolving sets of small size. For example, for 20 attributes (which allows the classification of the items in more than one million categories), resolving sets with 12 items can be determined. In other words, it is sufficient to determine the 12 components of ?? for every user ? to predict the preference vector ?? of user ? among more than one million item types.