
Cross Lingual Retrieval Evaluation

This application runs cross-lingual retrieval experiments.

Parameters are:

  1. sourceIndex: The complete name of the index for the source language collection. This provides the background model for the source language.
  2. targetIndex: The complete name of the index for the target language collection. This is the collection that is searched.

  3. textQuery: the query text stream, in the source language
  4. XLlambda: The smoothing parameter for mixing P(t|D) and P(s|GS).
  5. XLbeta: The Jelinek-Mercer lambda for estimating P(t|D).

  6. sourceBackgroundModel: One of "term" or "doc". If term, the background model for the source language is estimated as tf(s)/|V|. If doc, the background model for the source language is estimated as df(s)/sum_w_in_V df(w). Default is term.

  7. targetBackgroundModel: One of "term" or "doc". If term, the background model for the target language is estimated as tf(t)/|V|. If doc, the background model for the target language is estimated as df(t)/sum_w_in_V df(w). Default is term.

  8. resultFile: the result file
  9. resultFormat: the format of the result file. String value, either trec for the TREC format (i.e., six-column) or 3col for a simple three-column format <queryID, docID, score>. Default: TREC format.
  10. resultCount: the number of documents to return for each query

  11. feedbackDocCount: the number of docs to use for pseudo-feedback (0 means no feedback)
  12. feedbackTermCount: the number of terms to add to a query when doing feedback.
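
The background model options and the two mixing weights above can be illustrated with a minimal Python sketch. It is not the Lemur implementation. It assumes that |V| in tf/|V| stands for the total number of term occurrences in the collection, that XLbeta weights the document's maximum-likelihood estimate, and that XLlambda weights the document side of the mixture; this page confirms none of these. All function names are hypothetical.

```python
from collections import Counter


def term_background(docs):
    """'term' option: tf(w) over the total token count (assumed reading of |V|)."""
    tf = Counter(w for doc in docs for w in doc)
    total = sum(tf.values())
    return {w: c / total for w, c in tf.items()}


def doc_background(docs):
    """'doc' option: df(w) / sum over the vocabulary of df(w')."""
    df = Counter(w for doc in docs for w in set(doc))
    total = sum(df.values())
    return {w: c / total for w, c in df.items()}


def p_t_given_d(term, doc, background, xl_beta):
    """Jelinek-Mercer estimate of P(t|D).

    Assumption: xl_beta is the weight on the document's maximum-likelihood
    estimate; the actual direction of the weight is not stated on this page.
    """
    p_ml = doc.count(term) / len(doc) if doc else 0.0
    return xl_beta * p_ml + (1.0 - xl_beta) * background.get(term, 0.0)


def mix_with_source_background(p_doc_side, p_s_gs, xl_lambda):
    """Mix a document-side probability with P(s|GS).

    Assumption: xl_lambda weights the document side and (1 - xl_lambda)
    weights the source-language background model.
    """
    return xl_lambda * p_doc_side + (1.0 - xl_lambda) * p_s_gs


docs = [["a", "b", "a"], ["b", "c"]]
term_bg = term_background(docs)
doc_bg = doc_background(docs)
print(term_bg["a"], doc_bg["a"])
print(p_t_given_d("a", docs[0], term_bg, xl_beta=0.5))
```

The two background estimates differ most for terms that repeat within a few documents: such terms get a higher probability under term than under doc.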

Simple KL parameters:

  1. smoothSupportFile: The name of the smoothing support file
  2. smoothMethod: One of the four:

  3. smoothStrategy: Either interpolate or backoff.

  4. adjustedScoreMethod: Which type of score to output, one of:

  5. JelinekMercerLambda: The collection model weight in the JM interpolation method. Default: 0.5

  6. DirichletPrior: The prior parameter in the Dirichlet prior smoothing method. Default: 1000

  7. discountDelta: The delta (discounting constant) in the absolute discounting method. Default: 0.7
  8. queryUpdateMethod: feedback method, one of:
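
As a rough illustration of what JelinekMercerLambda, DirichletPrior, and discountDelta control, the sketch below gives the usual textbook interpolated forms of the three smoothing methods. It is a sketch under stated assumptions, not the Lemur code: the function names and arguments are hypothetical, and the backoff strategy is not shown.

```python
def jelinek_mercer(count, doc_len, p_coll, lam=0.5):
    """Linear interpolation; lam is the weight of the collection model."""
    return (1.0 - lam) * count / doc_len + lam * p_coll


def dirichlet(count, doc_len, p_coll, mu=1000.0):
    """Dirichlet prior smoothing; mu is the prior parameter."""
    return (count + mu * p_coll) / (doc_len + mu)


def absolute_discount(count, doc_len, unique_terms, p_coll, delta=0.7):
    """Subtract delta from each seen count and give the freed mass to the collection model."""
    discounted = max(count - delta, 0.0) / doc_len
    reserved = delta * unique_terms / doc_len
    return discounted + reserved * p_coll


# A term seen twice in a 10-token document with 5 distinct terms,
# with collection probability 0.01 (illustrative numbers only).
print(jelinek_mercer(2, 10, 0.01))
print(dirichlet(2, 10, 0.01))
print(absolute_discount(2, 10, 5, 0.01))
```

With the default prior of 1000, the Dirichlet estimate for a short document stays close to the collection probability, whereas Jelinek-Mercer with lambda 0.5 always keeps half of the weight on the document.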
  1. feedbackCoefficient: the coefficient of the feedback model for interpolation. The value is in [0,1], with 0 meaning using only the original model (thus no updating/feedback) and 1 meaning using only the feedback model (thus ignoring the original model).

  2. feedbackTermCount: Truncate the feedback model to no more than a given number of words/terms.

  3. feedbackProbThresh: Truncate the feedback model to include only words with a probability higher than this threshold. Default value: 0.001.

  4. feedbackProbSumThresh: Truncate the feedback model until the sum of the probability of the included words reaches this threshold. Default value: 1.
Parameters feedbackTermCount, feedbackProbThresh, and feedbackProbSumThresh work conjunctively to control the truncation, i.e., the truncated model must satisfy all three constraints.
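
The conjunctive truncation and the interpolation with feedbackCoefficient can be sketched as follows. This is a minimal illustration, not the Lemur implementation: the function names are hypothetical, models are plain term-to-probability dictionaries, and the sketch assumes terms are considered in decreasing order of probability and that no renormalization is applied after truncation.

```python
def truncate_feedback_model(model, term_count, prob_thresh=0.001, prob_sum_thresh=1.0):
    """Keep the highest-probability terms while all three constraints hold."""
    kept = {}
    total = 0.0
    for term, p in sorted(model.items(), key=lambda kv: kv[1], reverse=True):
        if len(kept) >= term_count:      # feedbackTermCount reached
            break
        if p <= prob_thresh:             # feedbackProbThresh: keep only p above the threshold
            break
        if total >= prob_sum_thresh:     # feedbackProbSumThresh: mass already reached
            break
        kept[term] = p
        total += p
    return kept


def interpolate(original, feedback, coeff):
    """coeff = 0 keeps only the original model, coeff = 1 only the feedback model."""
    terms = set(original) | set(feedback)
    return {
        t: (1.0 - coeff) * original.get(t, 0.0) + coeff * feedback.get(t, 0.0)
        for t in terms
    }


feedback = {"a": 0.5, "b": 0.3, "c": 0.1999, "d": 0.0001}
truncated = truncate_feedback_model(feedback, term_count=10)
updated = interpolate({"a": 1.0}, truncated, coeff=0.5)
print(sorted(truncated))
print(updated["a"])
```

Because the loop stops at the first violated constraint, whichever of the three limits is tightest for a given feedback model determines how many terms survive.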
Generated on Tue Jun 15 11:02:58 2010 for Lemur by doxygen 1.3.4