
lemur::retrieval::SimpleKLQueryModel Class Reference

Query model representation for the simple KL divergence model.

#include <SimpleKLRetMethod.hpp>

Inheritance diagram for lemur::retrieval::SimpleKLQueryModel:

lemur::retrieval::ArrayQueryRep, lemur::api::TextQueryRep, lemur::api::QueryRep

Public Member Functions

 SimpleKLQueryModel (const lemur::api::TermQuery &qry, const lemur::api::Index &dbIndex)
 construct a query model based on query text

 SimpleKLQueryModel (const lemur::api::Index &dbIndex)
 construct an empty query model

virtual ~SimpleKLQueryModel ()
virtual void interpolateWith (const lemur::langmod::UnigramLM &qModel, double origModCoeff, int howManyWord, double prSumThresh=1, double prThresh=0)
 interpolate the model with any (truncated) unigram LM; by default, the truncation is controlled by the number of words

virtual double scoreConstant () const
 Any query-specific constant term in the scoring formula.

virtual void load (istream &is)
 load a query model/rep from input stream is

virtual void save (ostream &os)
 save a query model/rep to output stream os

virtual void clarity (ostream &os)
 save the query clarity to output stream os

virtual double clarity () const
 compute query clarity score

double colDivergence () const
 get the query-collection KL divergence, computing it first if necessary (useful for recovering the true divergence value from a score)

double KLDivergence (const lemur::langmod::UnigramLM &refMod)
 compute the KL divergence of the query model and any unigram LM, i.e., D(Mq||Mref)

double colQueryLikelihood () const

Protected Attributes

double colQLikelihood
double colKL
bool colKLComputed
lemur::api::IndexedRealVector * qm
const lemur::api::Index & ind

Detailed Description

Query model representation for the simple KL divergence model.


Constructor & Destructor Documentation

lemur::retrieval::SimpleKLQueryModel::SimpleKLQueryModel (const lemur::api::TermQuery & qry, const lemur::api::Index & dbIndex) [inline]
 

construct a query model based on query text

lemur::retrieval::SimpleKLQueryModel::SimpleKLQueryModel (const lemur::api::Index & dbIndex) [inline]
 

construct an empty query model

virtual lemur::retrieval::SimpleKLQueryModel::~SimpleKLQueryModel () [inline, virtual]
 


Member Function Documentation

double lemur::retrieval::SimpleKLQueryModel::clarity () const [virtual]
 

compute query clarity score

void lemur::retrieval::SimpleKLQueryModel::clarity (ostream & os) [virtual]
 

save the query clarity to output stream os

double lemur::retrieval::SimpleKLQueryModel::colDivergence () const [inline]
 

get the query-collection KL divergence, computing it first if necessary (useful for recovering the true divergence value from a score)
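
The "compute if necessary" behaviour of a const accessor, together with the mutable members colKL and colKLComputed listed for this class, suggests a lazily computed, cached value. The following is a minimal sketch of that pattern only, not the Lemur implementation: the class name CachedQueryModel, the Model alias, the computations() counter and the use of std::map in place of lemur::api::IndexedRealVector are all assumptions made for illustration.

```cpp
#include <cmath>
#include <map>
#include <utility>

// Hypothetical stand-in for a sparse term-probability vector (term id -> probability).
using Model = std::map<int, double>;

// Hypothetical class illustrating a lazily computed, cached divergence.
class CachedQueryModel {
public:
  CachedQueryModel(Model query, Model collection)
      : qm(std::move(query)), col(std::move(collection)) {}

  // Const accessor: computes the divergence on first use, then reuses it.
  double colDivergence() const {
    if (!colKLComputed) {
      colKL = computeColKL();
      colKLComputed = true;
    }
    return colKL;
  }

  // Number of times the divergence was actually computed (illustration only).
  int computations() const { return nComputed; }

private:
  double computeColKL() const {
    ++nComputed;
    double div = 0.0;
    for (const auto &entry : qm) {
      const auto it = col.find(entry.first);
      // Assumption: words with zero query or collection probability are skipped.
      if (entry.second <= 0.0 || it == col.end() || it->second <= 0.0) continue;
      div += entry.second * std::log(entry.second / it->second);
    }
    return div;
  }

  Model qm;   // query model
  Model col;  // collection model
  // Mutable so that the const accessor can fill the cache.
  mutable double colKL = 0.0;
  mutable bool colKLComputed = false;
  mutable int nComputed = 0;
};
```

Calling colDivergence() twice on the same object runs the summation once; the second call returns the cached value.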

double lemur::retrieval::SimpleKLQueryModel::colQueryLikelihood () const [inline]
 

void lemur::retrieval::SimpleKLQueryModel::interpolateWith (const lemur::langmod::UnigramLM & qModel, double origModCoeff, int howManyWord, double prSumThresh = 1, double prThresh = 0) [virtual]
 

interpolate the model with any (truncated) unigram LM; by default, the truncation is controlled by the number of words

The interpolated model is defined as origModCoeff*p(w|original_model) + (1-origModCoeff)*p(w|new_truncated_model).

The "new truncated model" gives a positive probability to every word that "survives" the truncation process and zero probability to all others, so the word probabilities under the truncated model do not have to sum to 1. The assumption is that a word with an extremely small probability would not affect scoring much if it were added to the query model.

The truncation procedure is as follows. First, the probabilities in the qModel passed in are sorted, and then all the entries are iterated over. For each entry, the stopping conditions are checked, and the entry is added to the existing query model only if none of them is satisfied. As soon as any condition is satisfied, the process terminates. The three stopping conditions are: (1) howManyWord words have already been added. (2) The total sum of the probabilities added exceeds the threshold prSumThresh. (3) The probability of the current word is below prThresh.
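
The truncation and interpolation steps described above can be sketched as follows. This is a minimal, self-contained illustration under stated assumptions, not the Lemur implementation: the Model alias (a std::map standing in for lemur::api::IndexedRealVector and lemur::langmod::UnigramLM) and the function name interpolateTruncated are hypothetical, the entries are assumed to be sorted by descending probability, and the stopping conditions are assumed to be checked before each entry is added.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a sparse term-probability vector (term id -> probability).
using Model = std::map<int, double>;

// Sketch of origModCoeff*p(w|original) + (1-origModCoeff)*p(w|truncated new model).
Model interpolateTruncated(const Model &original, const Model &newModel,
                           double origModCoeff, int howManyWord,
                           double prSumThresh = 1, double prThresh = 0) {
  assert(origModCoeff >= 0.0 && origModCoeff <= 1.0);

  // Scale every entry of the original model by origModCoeff.
  Model result;
  for (const auto &entry : original) {
    result[entry.first] = origModCoeff * entry.second;
  }

  // Assumption: sort the new model's entries by descending probability.
  std::vector<std::pair<int, double>> sorted(newModel.begin(), newModel.end());
  std::stable_sort(sorted.begin(), sorted.end(),
                   [](const std::pair<int, double> &a,
                      const std::pair<int, double> &b) {
                     return a.second > b.second;
                   });

  int added = 0;       // number of words added so far
  double prSum = 0.0;  // total probability mass added so far
  for (const auto &entry : sorted) {
    if (added >= howManyWord) break;     // condition (1)
    if (prSum > prSumThresh) break;      // condition (2)
    if (entry.second < prThresh) break;  // condition (3)
    // Words missing from the original model start at 0 via operator[].
    result[entry.first] += (1.0 - origModCoeff) * entry.second;
    prSum += entry.second;
    ++added;
  }
  return result;
}
```

With the default prSumThresh = 1 and prThresh = 0, only howManyWord limits the truncation in this sketch, which matches the description of the number of words as the default control.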

double lemur::retrieval::SimpleKLQueryModel::KLDivergence (const lemur::langmod::UnigramLM & refMod) [inline]
 

compute the KL divergence of the query model and any unigram LM, i.e., D(Mq||Mref)
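
For reference, the KL divergence of a query model Mq from a reference model Mref is the sum over words w of p(w|Mq) * log(p(w|Mq) / p(w|Mref)). The sketch below computes this quantity for two sparse distributions. It is an illustration only, not the Lemur implementation: the Model alias and the function name klDivergence are hypothetical, the natural logarithm is assumed, and the reference model is assumed to give a positive probability to every word in the query model (as a smoothed model would).

```cpp
#include <cassert>
#include <cmath>
#include <map>

// Hypothetical stand-in for a sparse term-probability vector (term id -> probability).
using Model = std::map<int, double>;

// Sketch of D(Mq || Mref) = sum_w p(w|Mq) * log(p(w|Mq) / p(w|Mref)).
double klDivergence(const Model &query, const Model &ref) {
  double div = 0.0;
  for (const auto &entry : query) {
    const double pq = entry.second;
    if (pq <= 0.0) continue;  // a zero-probability word contributes nothing
    const auto it = ref.find(entry.first);
    // Assumption: the reference model covers every query word.
    assert(it != ref.end() && it->second > 0.0);
    div += pq * std::log(pq / it->second);
  }
  return div;
}
```

The divergence is zero when the two models agree on every query word, and it grows as the query model concentrates mass on words that the reference model considers unlikely.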

void lemur::retrieval::SimpleKLQueryModel::load (istream & is) [virtual]
 

load a query model/rep from input stream is

void lemur::retrieval::SimpleKLQueryModel::save (ostream & os) [virtual]
 

save a query model/rep to output stream os

virtual double lemur::retrieval::SimpleKLQueryModel::scoreConstant () const [inline, virtual]
 

Any query-specific constant term in the scoring formula.

Reimplemented from lemur::retrieval::ArrayQueryRep.


Member Data Documentation

double lemur::retrieval::SimpleKLQueryModel::colKL [mutable, protected]
 

bool lemur::retrieval::SimpleKLQueryModel::colKLComputed [mutable, protected]
 

double lemur::retrieval::SimpleKLQueryModel::colQLikelihood [mutable, protected]
 

const lemur::api::Index& lemur::retrieval::SimpleKLQueryModel::ind [protected]
 

lemur::api::IndexedRealVector* lemur::retrieval::SimpleKLQueryModel::qm [protected]
 


The documentation for this class was generated from the following files:
Generated on Tue Jun 15 11:03:06 2010 for Lemur by doxygen 1.3.4