Friday, January 31, 2014

Week 4 Reading Notes: Matching Models: Boolean and Vector Space

1.       Boolean retrieval model
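(1)     Posting list intersection: merging sorted posting lists; query optimization (processing the terms in order of increasing document frequency).
(2)     Extended Boolean model: adds term proximity operators. Additional capabilities needed beyond plain Boolean retrieval: tolerance of spelling mistakes, compound terms or phrases, term frequency, and ranking (document scores).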
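
The merge step in (1) can be sketched as follows. This is a minimal illustration, not code from the reading: the function names `intersect` and `intersect_many` and the toy posting lists are my own assumptions, and posting lists are assumed to be Python lists of document IDs sorted in increasing order.

```python
def intersect(p1, p2):
    """Merge-intersect two posting lists (sorted lists of document IDs)."""
    answer = []
    i = j = 0
    # Walk both lists in step; advance whichever pointer sits on the smaller ID.
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            answer.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1
        else:
            j += 1
    return answer


def intersect_many(postings):
    """AND several posting lists together, shortest list first.

    Starting from the rarest term keeps intermediate results small,
    which is the query optimization heuristic mentioned in the notes.
    """
    if not postings:
        return []
    ordered = sorted(postings, key=len)
    result = list(ordered[0])
    for plist in ordered[1:]:
        if not result:  # nothing left to match, stop early
            break
        result = intersect(result, plist)
    return result


# Toy posting lists (illustrative data only).
brutus = [1, 2, 4, 11, 31, 45, 173, 174]
caesar = [1, 2, 4, 5, 6, 16, 57, 132]
calpurnia = [2, 31, 54, 101]

print(intersect(brutus, calpurnia))
print(intersect_many([brutus, caesar, calpurnia]))
```

Each pairwise merge takes time linear in the combined length of the two lists, which is why the lists must be kept sorted by document ID.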

2.       Vector Space Model
(1)     Parametric and zone indexes: index and retrieve documents by metadata; they also provide a simple means of scoring documents in response to a query.
(2)     Term frequency and weighting: each term in a document is assigned a weight based on the statistics of its occurrence.
(3)     The vector space model for scoring: by viewing each document as a vector of term weights, we can compute a similarity score between a query and each document.
(4)     Variant tf-idf functions: several alternative term-weighting schemes.
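
To make (2) through (4) concrete, here is a small sketch of tf-idf weighting and cosine scoring. It rests on several assumptions of mine rather than on the reading: documents are already tokenized into lists of strings, the tf weight is the logarithmic variant 1 + log10(tf), idf is log10(N / df), and all function names and toy documents are hypothetical. Other tf-idf variants would change only `tfidf_vector`.

```python
import math
from collections import Counter


def build_idf(docs):
    """Inverse document frequency, idf_t = log10(N / df_t), per term."""
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each term once per document
    return {term: math.log10(n / count) for term, count in df.items()}


def tfidf_vector(tokens, idf):
    """Sparse vector of (1 + log10 tf) * idf weights; unseen terms are dropped."""
    tf = Counter(tokens)
    return {t: (1 + math.log10(c)) * idf[t] for t, c in tf.items() if t in idf}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query_tokens, docs):
    """Return (score, doc_index) pairs, best score first."""
    idf = build_idf(docs)
    q_vec = tfidf_vector(query_tokens, idf)
    scores = [(cosine(q_vec, tfidf_vector(d, idf)), i) for i, d in enumerate(docs)]
    return sorted(scores, key=lambda s: (-s[0], s[1]))


# Toy collection (illustrative data only).
docs = [
    ["car", "insurance", "auto", "insurance"],
    ["best", "car", "best", "auto"],
    ["cheap", "flight", "deal"],
]
for score, doc_id in rank(["car", "insurance"], docs):
    print(doc_id, round(score, 3))
```

In this toy collection the first document ranks highest because it contains both query terms, including the rarer one, while the third document shares no terms with the query and scores zero.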
