smoothSupportFile, see below) is needed for retrieval using the smoothed unigram language model. Each entry in this support file corresponds to one document and records two pieces of information: (a) the count of unique terms in the document; (b) the sum of collection language model probabilities for the words in the document. The other file (with an extra suffix "<tt>.mc</tt>") is needed if you run feedback based on the Markov chain query model. Each line in this file contains a term and the sum of the probabilities of that term given each document in the collection (i.e., a sum of
p(w|d) over all possible
To run the application, follow the general steps for running a Lemur application and set the following variables in the parameter file:
index: the table-of-content (TOC) record file of the index.
smoothSupportFile: file path for the support file (e.g.,