Friday, March 21, 2014

Week 10 Reading Notes: Web information retrieval

1. Web search basics
(1) Background and history of the forces that make the Web chaotic, fast-changing, and very different from “traditional” document collections.

(2) Estimating the number of documents indexed by web search engines, and eliminating duplicate documents from web indexes.

2. Link analysis
The use of hyperlinks for ranking web search results:
(1)     The use of the web graph
(2)     PageRank: the PageRank of a node depends on the link structure of the web graph. Given a query, a web search engine computes a composite score for each web page that combines hundreds of features, such as cosine similarity and term proximity, together with the PageRank score.

(3)     Hyperlink-Induced Topic Search (HITS)
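
To make item (2) concrete, here is a minimal sketch of how a PageRank score can be computed from link structure alone, using power iteration with random teleportation. The function name `pagerank`, the damping value 0.85, the uniform handling of pages with no out-links, and the toy graph are all assumptions for illustration; they are not taken from the reading, and a real search engine would combine this score with many other features.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank on a dict {page: [pages it links to]}.

    Assumptions (not from the reading): damping = 0.85, and the rank
    held by pages without out-links is spread evenly over all pages.
    """
    # Collect every page that appears as a source or as a link target.
    nodes = sorted(set(links) | {v for targets in links.values() for v in targets})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}  # start from the uniform distribution

    for _ in range(max_iter):
        # Rank sitting on dangling pages (no out-links) is redistributed uniformly.
        dangling = sum(rank[u] for u in nodes if not links.get(u))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {u: base for u in nodes}

        # Each page passes an equal share of its rank along each out-link.
        for u in nodes:
            targets = links.get(u, [])
            for v in targets:
                new_rank[v] += damping * rank[u] / len(targets)

        # Stop once the scores have essentially stopped changing.
        delta = sum(abs(new_rank[u] - rank[u]) for u in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical four-page web graph: every other page links to "c".
example_links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(example_links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

In this toy graph, page "c" ends up with the highest score because every other page links to it, while "d", which has no in-links, keeps only the teleportation share. The score is independent of the query, which is why it is combined with query-dependent features such as cosine similarity.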
