
lemurproject::indri::IndexEnvironment Class Reference


Public Member Functions

synchronized void delete ()
 IndexEnvironment ()
void setDocumentRoot (String documentRoot) throws java.lang.Exception
void setAnchorTextPath (String anchorTextRoot) throws java.lang.Exception
void setOffsetMetadataPath (String offsetMetadataRoot) throws java.lang.Exception
void setOffsetAnnotationsPath (String offsetAnnotationsRoot) throws java.lang.Exception
void addFileClass (String name, String iterator, String parser, String tokenizer, String startDocTag, String endDocTag, String endMetadataTag, String[] include, String[] exclude, String[] index, String[] metadata, Map conflations) throws java.lang.Exception
Specification getFileClassSpec (String name) throws java.lang.Exception
void addFileClass (Specification spec) throws java.lang.Exception
void deleteDocument (int documentID) throws java.lang.Exception
void setIndexedFields (String[] fieldNames) throws java.lang.Exception
void setNumericField (String fieldName, boolean isNumeric, String parserName) throws java.lang.Exception
void setNumericField (String fieldName, boolean isNumeric) throws java.lang.Exception
void setOrdinalField (String fieldName, boolean isOrdinal) throws java.lang.Exception
void setParentalField (String fieldName, boolean isParental) throws java.lang.Exception
void setMetadataIndexedFields (String[] forward, String[] backward) throws java.lang.Exception
void setStopwords (String[] stopwords) throws java.lang.Exception
void setStemmer (String stemmer) throws java.lang.Exception
void setMemory (long memory) throws java.lang.Exception
void setNormalization (boolean normalize) throws java.lang.Exception
void setStoreDocs (boolean flag) throws java.lang.Exception
void create (String repositoryPath, IndexStatus callback) throws java.lang.Exception
void create (String repositoryPath) throws java.lang.Exception
void open (String repositoryPath, IndexStatus callback) throws java.lang.Exception
void open (String repositoryPath) throws java.lang.Exception
void close () throws java.lang.Exception
void addFile (String fileName) throws java.lang.Exception
void addFile (String fileName, String fileClass) throws java.lang.Exception
int addString (String fileName, String fileClass, Map metadata) throws java.lang.Exception
int addString (String documentString, String fileClass, Map metadata, TagExtent[] tags) throws java.lang.Exception
int addParsedDocument (ParsedDocument document) throws java.lang.Exception
int documentsIndexed () throws java.lang.Exception
int documentsSeen () throws java.lang.Exception

Protected Member Functions

 IndexEnvironment (long cPtr, boolean cMemoryOwn)
void finalize ()

Static Protected Member Functions

long getCPtr (IndexEnvironment obj)

Protected Attributes

boolean swigCMemOwn

Private Attributes

long swigCPtr

Constructor & Destructor Documentation

lemurproject::indri::IndexEnvironment::IndexEnvironment (long cPtr, boolean cMemoryOwn) [inline, protected]
 

lemurproject::indri::IndexEnvironment::IndexEnvironment () [inline]
 


Member Function Documentation

void lemurproject::indri::IndexEnvironment::addFile (String fileName, String fileClass) throws java.lang.Exception [inline]

Add a file of the specified file class to the index and repository.

Parameters:
fileName the file to add
fileClass the file class of the file being added (e.g. trecweb).
Exceptions:
Exception if a lemur::api::Exception was thrown by the JNI library.

void lemurproject::indri::IndexEnvironment::addFile (String fileName) throws java.lang.Exception [inline]
 

Add the text in a file to the index and repository. The fileClass of this file will be chosen based on the file extension. If the file has no extension, it will be skipped. Information about indexing progress will be passed to the callback.

Parameters:
fileName the file to add
Exceptions:
Exception if a lemur::api::Exception was thrown by the JNI library.
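
Because files without an extension are skipped by this overload, it can help to check file names before handing them to addFile. The following is a minimal sketch of such a check. The class FileExtensions and the method extensionOf are hypothetical helpers, not part of the Indri API, and the exact rule Indri applies to edge cases (for example a trailing dot) is not stated on this page, so treat the result only as an approximation.

```java
import java.util.Optional;

// Hypothetical helper, not part of Indri: approximates "does this file
// name have an extension?" so that skipped files can be spotted early.
final class FileExtensions {
    private FileExtensions() {}

    /**
     * Returns the text after the last '.' in the final path component,
     * or an empty Optional when that component has no extension.
     */
    static Optional<String> extensionOf(String fileName) {
        int lastSeparator = Math.max(fileName.lastIndexOf('/'), fileName.lastIndexOf('\\'));
        int lastDot = fileName.lastIndexOf('.');
        // A dot inside a directory name does not count, nor does a trailing dot.
        if (lastDot <= lastSeparator || lastDot == fileName.length() - 1) {
            return Optional.empty();
        }
        return Optional.of(fileName.substring(lastDot + 1));
    }

    public static void main(String[] args) {
        String[] names = {"corpus/page.html", "corpus.v1/README", "notes."};
        for (String name : names) {
            String shown = extensionOf(name).orElse("(no extension, likely skipped)");
            System.out.println(name + " -> " + shown);
        }
    }
}
```

Files for which extensionOf returns an empty Optional are candidates for the two-argument addFile overload, where the file class is given explicitly.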

void lemurproject::indri::IndexEnvironment::addFileClass (Specification spec) throws java.lang.Exception [inline]
 

Add a file class.

Parameters:
spec The file class to add.
Exceptions:
Exception if a lemur::api::Exception was thrown by the JNI library.

void lemurproject::indri::IndexEnvironment::addFileClass (String name, String iterator, String parser, String tokenizer, String startDocTag, String endDocTag, String endMetadataTag, String[] include, String[] exclude, String[] index, String[] metadata, Map conflations) throws java.lang.Exception [inline]
 

Add parsing information for a file class. Data for these parameters is passed into the FileClassEnvironmentFactory.

Parameters:
name name of this file class, e.g. trecweb
iterator document iterator for this file class
parser document parser for this file class
tokenizer document tokenizer for this file class
startDocTag tag indicating start of a document
endDocTag tag indicating the end of a document
endMetadataTag tag indicating the end of the metadata fields
include default tags whose contents should be included in the index
exclude tags whose contents should be excluded from the index
index tags that should be forwarded to the index for tag extents
metadata tags whose contents should be indexed as metadata
conflations tags that should be conflated
Exceptions:
Exception if a lemur::api::Exception was thrown by the JNI library.

int lemurproject::indri::IndexEnvironment::addParsedDocument (ParsedDocument document) throws java.lang.Exception [inline]

Add an already parsed document to the index and repository.

Parameters:
document the document to add
Exceptions:
Exception if a lemur::api::Exception was thrown by the JNI library.

int lemurproject::indri::IndexEnvironment::addString (String documentString, String fileClass, Map metadata, TagExtent[] tags) throws java.lang.Exception [inline]
 

Adds a string to the index and repository. The documentString is assumed to contain the kind of text that would be found in a file of type fileClass.

Parameters:
documentString the document string to add.
fileClass the file class, signaling which parser to use while processing the document string.
metadata a map of metadata String name to String value.
tags offset annotations to be indexed as field data. The begin and end values of each TagExtent specify byte (not character or token) offsets within the document string. These byte offsets are converted to token offsets after document string parsing.
Exceptions:
Exception if a lemur::api::Exception was thrown by the JNI library.
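
Since the begin and end values are byte offsets, they differ from Java String indices as soon as the text contains non-ASCII characters. The sketch below shows one way to compute byte offsets for a span of text. It assumes the document string is encoded as UTF-8 on the native side, which this page does not state, and the class ByteOffsets and method utf8Span are hypothetical helpers rather than Indri API.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Indri. Assumes UTF-8 encoding.
final class ByteOffsets {
    private ByteOffsets() {}

    /**
     * Returns {begin, end} byte offsets (end exclusive) of the first
     * occurrence of span within text, or null if span does not occur.
     */
    static int[] utf8Span(String text, String span) {
        int charBegin = text.indexOf(span);
        if (charBegin < 0) {
            return null;
        }
        // Count the bytes that precede the span, then the bytes of the span.
        int begin = text.substring(0, charBegin).getBytes(StandardCharsets.UTF_8).length;
        int end = begin + span.getBytes(StandardCharsets.UTF_8).length;
        return new int[] {begin, end};
    }

    public static void main(String[] args) {
        // \u00e9 (e with acute accent) occupies two bytes in UTF-8.
        String doc = "<title>Caf\u00e9 menu</title>";
        int[] span = utf8Span(doc, "menu");
        System.out.println("char index: " + doc.indexOf("menu"));
        System.out.println("byte span:  " + span[0] + ".." + span[1]);
    }
}
```

In this example the character index of "menu" is 12 while its byte offset is 13, which is the kind of mismatch that would misplace a tag if character indices were passed instead.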

int lemurproject::indri::IndexEnvironment::addString (String fileName, String fileClass, Map metadata) throws java.lang.Exception [inline]

Adds a string to the index and repository. Despite its name, the fileName parameter holds the document text itself, which is assumed to contain the kind of text that would be found in a file of type fileClass.

Parameters:
fileName the document string to add.
fileClass the file class, signaling which parser to use while processing the document string (e.g. trecweb).
metadata the metadata pairs associated with the string.
Exceptions:
Exception if a lemur::api::Exception was thrown by the JNI library.

void lemurproject::indri::IndexEnvironment::close () throws java.lang.Exception [inline]

Close the index and repository.

Exceptions:
Exception if a lemur::api::Exception was thrown by the JNI library.

void lemurproject::indri::IndexEnvironment::create (String repositoryPath) throws java.lang.Exception [inline]

Create a new index and repository.

Parameters:
repositoryPath the path to the repository
Exceptions:
Exception if a lemur::api::Exception was thrown by the JNI library.

void lemurproject::indri::IndexEnvironment::create (String repositoryPath, IndexStatus callback) throws java.lang.Exception [inline]

Create a new index and repository.

Parameters:
repositoryPath the path to the repository
callback IndexStatus object to be notified of indexing progress.
Exceptions:
Exception if a lemur::api::Exception was thrown by the JNI library.

synchronized void lemurproject::indri::IndexEnvironment::delete () [inline]
 

void lemurproject::indri::IndexEnvironment::deleteDocument (int documentID) throws java.lang.Exception [inline]
 

Delete an existing document.

Parameters:
documentID The document to delete.
Exceptions:
Exception if a lemur::api::Exception was thrown by the JNI library.

int lemurproject::indri::IndexEnvironment::documentsIndexed () throws java.lang.Exception [inline]
 

Returns the number of documents indexed so far in this session.

Exceptions:
Exception if a lemur::api::Exception was thrown by the JNI library.

int lemurproject::indri::IndexEnvironment::documentsSeen () throws java.lang.Exception [inline]
 

Returns the number of documents considered for indexing, which is the sum of the documents indexed and the documents skipped.

Exceptions:
Exception if a lemur::api::Exception was thrown by the JNI library.
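
The relationship above means the number of skipped documents can be derived as documentsSeen() minus documentsIndexed(). The following sketch shows a create, addFile, close sequence that reports that difference. To stay runnable without the native library it is written against a hypothetical interface, IndexSession, whose method signatures are copied from this page, plus an in-memory FakeSession. Neither exists in Indri, and the real IndexEnvironment would need a small adapter to be used this way.

```java
import java.util.Arrays;
import java.util.List;

final class IndexingWorkflowSketch {
    private IndexingWorkflowSketch() {}

    /** Hypothetical stand-in for the subset of IndexEnvironment used here. */
    interface IndexSession {
        void create(String repositoryPath) throws Exception;
        void addFile(String fileName, String fileClass) throws Exception;
        int documentsIndexed() throws Exception;
        int documentsSeen() throws Exception;
        void close() throws Exception;
    }

    /** Indexes the files, always closes the session, returns the skipped count. */
    static int indexAll(IndexSession env, String repositoryPath,
                        List<String> files, String fileClass) throws Exception {
        env.create(repositoryPath);
        try {
            for (String file : files) {
                env.addFile(file, fileClass);
            }
            // seen = indexed + skipped, so skipped = seen - indexed
            return env.documentsSeen() - env.documentsIndexed();
        } finally {
            env.close();
        }
    }

    /** Demo-only fake: one document per file; an empty name counts as skipped. */
    static final class FakeSession implements IndexSession {
        private int seen;
        private int indexed;

        public void create(String repositoryPath) { seen = 0; indexed = 0; }
        public void addFile(String fileName, String fileClass) {
            seen++;
            if (!fileName.isEmpty()) {
                indexed++;
            }
        }
        public int documentsIndexed() { return indexed; }
        public int documentsSeen() { return seen; }
        public void close() { }
    }

    /** Runs the workflow against the fake and returns the skipped count. */
    static int demo() {
        try {
            List<String> files = Arrays.asList("a.html", "", "b.html");
            return indexAll(new FakeSession(), "/tmp/example-repo", files, "trecweb");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("skipped documents: " + demo());
    }
}
```

Closing in a finally block is a design choice of this sketch, made so that the repository is closed even when addFile throws.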

void lemurproject::indri::IndexEnvironment::finalize () [inline, protected]
 

long lemurproject::indri::IndexEnvironment::getCPtr (IndexEnvironment obj) [inline, static, protected]
 

Specification lemurproject::indri::IndexEnvironment::getFileClassSpec (String name) throws java.lang.Exception [inline]
 

Get a named file class.

Parameters:
name The name of the file class to retrieve.
Exceptions:
Exception if a lemur::api::Exception was thrown by the JNI library.

void lemurproject::indri::IndexEnvironment::open (String repositoryPath) throws java.lang.Exception [inline]

Open an existing index and repository.

Parameters:
repositoryPath the path to the repository
Exceptions:
Exception if a lemur::api::Exception was thrown by the JNI library.

void lemurproject::indri::IndexEnvironment::open (String repositoryPath, IndexStatus callback) throws java.lang.Exception [inline]

Open an existing index and repository.

Parameters:
repositoryPath the path to the repository
callback IndexStatus object to be notified of indexing progress.
Exceptions:
Exception if a lemur::api::Exception was thrown by the JNI library.

void lemurproject::indri::IndexEnvironment::setAnchorTextPath (String anchorTextRoot) throws java.lang.Exception [inline]

Set the anchor text root path.

Parameters:
anchorTextRoot path to anchor text root.
Exceptions:
Exception if a lemur::api::Exception was thrown by the JNI library.

void lemurproject::indri::IndexEnvironment::setDocumentRoot (String documentRoot) throws java.lang.Exception [inline]

Set the document root path.

Parameters:
documentRoot path to document root.
Exceptions:
Exception if a lemur::api::Exception was thrown by the JNI library.

void lemurproject::indri::IndexEnvironment::setIndexedFields (String[] fieldNames) throws java.lang.Exception [inline]
 

Set names of fields to be indexed. This call indicates to the index that information about these fields should be stored in the index so they can be used in queries. This does not affect whether or not the text in a particular field is stored in an index.

Parameters:
fieldNames the list of fields.
Exceptions:
Exception if a lemur::api::Exception was thrown by the JNI library.

void lemurproject::indri::IndexEnvironment::setMemory (long memory) throws java.lang.Exception [inline]

Set the amount of memory to use for internal structures.

Parameters:
memory the number of bytes to use.
Exceptions:
Exception if a lemur::api::Exception was thrown by the JNI library.
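
Because the memory argument is a byte count of type long, sizes of 2 GB or more must be computed in long arithmetic to avoid int overflow. The sketch below shows the pitfall and one way around it. The class MemorySizes and method megabytes are hypothetical helpers for illustration, not part of Indri.

```java
// Hypothetical helper, not part of Indri: converts megabytes to bytes
// using long arithmetic so that large sizes do not overflow.
final class MemorySizes {
    private MemorySizes() {}

    /** Returns mb megabytes expressed in bytes; throws ArithmeticException on overflow. */
    static long megabytes(long mb) {
        return Math.multiplyExact(mb, 1024L * 1024L);
    }

    public static void main(String[] args) {
        // All operands are int here, so the product silently wraps around.
        int wrapped = 4096 * 1024 * 1024;
        long correct = megabytes(4096);
        System.out.println("int arithmetic:  " + wrapped);
        System.out.println("long arithmetic: " + correct);
    }
}
```

The value returned by megabytes is what would be passed as the memory argument, for example megabytes(512) for a 512 MB budget.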

void lemurproject::indri::IndexEnvironment::setMetadataIndexedFields (String[] forward, String[] backward) throws java.lang.Exception [inline]
 

Set names of metadata fields to be indexed for fast retrieval. The forward fields are indexed in a B-Tree mapping (documentID, metadataValue). If a field is not forward indexed, the documentMetadata calls will still work, but they will be slower (the document has to be retrieved, decompressed and parsed to get the metadata back, instead of just a B-Tree lookup). The backward indexed fields store a mapping of (metadataValue, documentID). If a field is not backward indexed, the documentIDsFromMetadata and documentFromMetadata calls will not work.

Parameters:
forward the list of fields to forward index.
backward the list of fields to backward index.
Exceptions:
Exception if a lemur::api::Exception was thrown by the JNI library.
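
The two index directions described above answer different questions: a forward index maps a document ID to its metadata value, and a backward index maps a metadata value to the documents that carry it. The following sketch models both with in-memory maps purely to illustrate the lookups. The class MetadataIndexSketch and its methods are hypothetical, and java.util.TreeMap is used only as an ordered-map stand-in, not as a reproduction of Indri's on-disk B-Tree.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration, not part of Indri.
final class MetadataIndexSketch {
    // Forward index: documentID -> metadataValue.
    private final Map<Integer, String> forward = new TreeMap<>();
    // Backward index: metadataValue -> documentIDs.
    private final Map<String, List<Integer>> backward = new TreeMap<>();

    /** Records one metadata value for a document in both directions. */
    void put(int documentID, String metadataValue) {
        forward.put(documentID, metadataValue);
        backward.computeIfAbsent(metadataValue, key -> new ArrayList<>()).add(documentID);
    }

    /** Forward lookup: which value does this document have? */
    String valueOf(int documentID) {
        return forward.get(documentID);
    }

    /** Backward lookup: which documents have this value? */
    List<Integer> documentIDsFor(String metadataValue) {
        return backward.getOrDefault(metadataValue, Collections.emptyList());
    }

    public static void main(String[] args) {
        MetadataIndexSketch docno = new MetadataIndexSketch();
        docno.put(1, "WTX001");
        docno.put(2, "WTX002");
        docno.put(3, "WTX001");
        System.out.println("value of document 2: " + docno.valueOf(2));
        System.out.println("documents with WTX001: " + docno.documentIDsFor("WTX001"));
    }
}
```

A field that needs only the first kind of lookup would go in the forward list, one that needs the second kind in the backward list, and one that needs both in both lists.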

void lemurproject::indri::IndexEnvironment::setNormalization (boolean normalize) throws java.lang.Exception [inline]

Set normalization of case and some punctuation. The default is true (normalize during indexing and at query time).

Parameters:
normalize true if text should be normalized, false otherwise.
Exceptions:
Exception if a lemur::api::Exception was thrown by the JNI library.

void lemurproject::indri::IndexEnvironment::setNumericField (String fieldName, boolean isNumeric) throws java.lang.Exception [inline]

Set the numeric property of a field.

Parameters:
fieldName the field.
isNumeric true if the field is a numeric field, false if not.
Exceptions:
Exception if a lemur::api::Exception was thrown by the JNI library.

void lemurproject::indri::IndexEnvironment::setNumericField (String fieldName, boolean isNumeric, String parserName) throws java.lang.Exception [inline]
 

Set the numeric property of a field.

Parameters:
fieldName the field.
isNumeric true if the field is a numeric field, false if not.
parserName The name of the Transformation to use to compute the numeric value of the field. Repository currently recognizes the name NumericFieldAnnotator.
Exceptions:
Exception if a lemur::api::Exception was thrown by the JNI library.

void lemurproject::indri::IndexEnvironment::setOffsetAnnotationsPath (String offsetAnnotationsRoot) throws java.lang.Exception [inline]
 

Set offset annotations root path.

Parameters:
offsetAnnotationsRoot path to offset annotations root.
Exceptions:
Exception if a lemur::api::Exception was thrown by the JNI library.

void lemurproject::indri::IndexEnvironment::setOffsetMetadataPath (String offsetMetadataRoot) throws java.lang.Exception [inline]
 

Set offset metadata root path.

Parameters:
offsetMetadataRoot path to offset metadata root.
Exceptions:
Exception if a lemur::api::Exception was thrown by the JNI library.

void lemurproject::indri::IndexEnvironment::setOrdinalField (String fieldName, boolean isOrdinal) throws java.lang.Exception [inline]
 

Set the ordinal property of a field. If child, parent, or ancestor field queries are slow, you may want to be certain to index the specified fields explicitly as an ordinal. This speeds things up at the cost of a minimal amount of disk space.

Parameters:
fieldName the field.
isOrdinal true if the field is an ordinal field, false if not.
Exceptions:
Exception if a lemur::api::Exception was thrown by the JNI library.

void lemurproject::indri::IndexEnvironment::setParentalField (String fieldName, boolean isParental) throws java.lang.Exception [inline]

Set the parental property of a field. If child, parent, or ancestor field queries are slow, you may want to be certain to index the specified fields explicitly as parental. This speeds things up at the cost of a minimal amount of disk space.

Parameters:
fieldName the field.
isParental true if the field is a parental field, false if not.
Exceptions:
Exception if a lemur::api::Exception was thrown by the JNI library.

void lemurproject::indri::IndexEnvironment::setStemmer (String stemmer) throws java.lang.Exception [inline]

Set the stemmer to use.

Parameters:
stemmer the stemmer to use. One of: krovetz, porter.
Exceptions:
Exception if a lemur::api::Exception was thrown by the JNI library.

void lemurproject::indri::IndexEnvironment::setStopwords (String[] stopwords) throws java.lang.Exception [inline]

Set the list of stopwords.

Parameters:
stopwords the list of stopwords
Exceptions:
Exception if a lemur::api::Exception was thrown by the JNI library.

void lemurproject::indri::IndexEnvironment::setStoreDocs (boolean flag) throws java.lang.Exception [inline]

Set the storeDocs flag.

Parameters:
flag true to store documents in the compressed collection (default), false to not store them.
Exceptions:
Exception if a lemur::api::Exception was thrown by the JNI library.


Member Data Documentation

boolean lemurproject::indri::IndexEnvironment::swigCMemOwn [protected]
 

long lemurproject::indri::IndexEnvironment::swigCPtr [private]
 


The documentation for this class was generated from the following file:
Generated on Tue Jun 15 11:03:07 2010 for Lemur by doxygen 1.3.4