Saturday, April 19, 2014

Week 14 Reading Notes: New fronts in information retrieval

Traditional adaptive filtering systems learn the user's interests in a rather simple way: words from relevant documents are favored in the query model, while words from irrelevant documents are down-weighted. This biases the query model towards specific words seen in the past, causing the system to favor documents containing relevant but redundant information over documents that use previously unseen words to denote new facts about the same news event. This paper proposes new ways of generalizing from relevance feedback by augmenting the traditional bag-of-words query model with named entity wildcards that are anchored in context. The use of wildcards allows generalization beyond specific words, while contextual restrictions limit the wildcard-matching to entities related to the user's query. We test our new approach in a nugget-level adaptive filtering system and evaluate it in terms of both relevance and novelty of the presented information. Our results indicate that higher recall is obtained when lexical terms are generalized using wildcards. However, such wildcards must be anchored to their context to maintain good precision. How the context of a wildcard is represented and matched against a given document also plays a crucial role in the performance of the retrieval system.
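
To make the two ideas in the abstract concrete for myself, here is a rough Python sketch. It is not the paper's method: the abstract does not say how the wildcards or their contexts are represented, so `update_query`, `Wildcard`, `wildcard_matches`, the fixed feedback `rate`, the token `window`, and the "at least one context word nearby" rule are all my own assumptions. Entity types are tagged by hand here instead of coming from a named entity recognizer.

```python
from dataclasses import dataclass


def update_query(query, doc_tokens, relevant, rate=0.5):
    """Word-level feedback: boost words from a relevant document,
    down-weight words from an irrelevant one (assumed fixed step `rate`)."""
    sign = 1.0 if relevant else -1.0
    updated = dict(query)
    for word in set(doc_tokens):
        updated[word] = updated.get(word, 0.0) + sign * rate
    return updated


@dataclass(frozen=True)
class Wildcard:
    """Hypothetical wildcard: any entity of `entity_type` that appears
    within `window` tokens of at least one word in `context`."""
    entity_type: str
    context: frozenset
    window: int = 3


def wildcard_matches(wildcard, tokens):
    """`tokens` is a list of (word, entity_type_or_None) pairs."""
    for i, (_, etype) in enumerate(tokens):
        if etype != wildcard.entity_type:
            continue
        lo = max(0, i - wildcard.window)
        hi = i + wildcard.window + 1
        # Words surrounding the entity, excluding the entity itself.
        nearby = {
            word.lower()
            for j, (word, _) in enumerate(tokens[lo:hi], start=lo)
            if j != i
        }
        if wildcard.context & nearby:
            return True
    return False


# Word-level feedback only remembers the literal words it has seen.
query = update_query({"arrested": 1.0}, ["arrested", "Smith"], relevant=True)

# A context-anchored wildcard generalizes over the entity name.
person_arrested = Wildcard("PERSON", frozenset({"arrested"}))
new_fact = [("officers", None), ("arrested", None),
            ("Jones", "PERSON"), ("yesterday", None)]
unrelated = [("Jones", "PERSON"), ("won", None),
             ("the", None), ("match", None)]

matches_new_fact = wildcard_matches(person_arrested, new_fact)
matches_unrelated = wildcard_matches(person_arrested, unrelated)
```

In this toy setup the word-level query has only learned "Smith", so a story about "Jones" gets no credit for the new name, while the wildcard matches it because a PERSON appears next to "arrested". The unrelated sentence also mentions a PERSON but lacks the context word, so the wildcard does not fire, which is the precision argument the abstract makes for anchoring wildcards in context.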
