Xiao Bai: Toward Distributed Search

14.00, Room 455 at PCRI

The rapidly increasing amount of data on the Web provides a huge source of information but makes efficient search more challenging. Distributing the search is an appealing way to improve both efficiency and scalability. In this talk, we first present, in the context of social tagging systems, two gossip-based approaches that personalize query processing in a peer-to-peer manner. The off-line approach relies on the user’s past behavior to personalize the search, while the on-line approach relies on both the user’s past behavior and the current query to further improve result quality for queries that reflect the user’s emerging interests. We then present, for a multi-site search engine, two approaches that invalidate entries in the result cache to guarantee the freshness of the results served to users. The on-line invalidation approach invalidates an entry upon a cache hit, according to index changes at the local search site. The threshold-based approach invalidates entries when the index changes at remote search sites. Used jointly, the two approaches offer a promising way to handle the index updates that may arise in future multi-site search engines.

