
lemur::parse::KeyfileDocMgr Class Reference

#include <KeyfileDocMgr.hpp>

Inherits lemur::api::DocumentManager and lemur::api::TextHandler.

Inherited by lemur::parse::ElemDocMgr.

Public Member Functions

 KeyfileDocMgr ()
 default constructor

 KeyfileDocMgr (const string &name, bool readOnly=true)
 KeyfileDocMgr (string name, string mode, string source)
virtual ~KeyfileDocMgr ()
char * getDoc (const string &docID) const
 return the document associated with this ID

virtual char * handleDoc (char *docno)
 add entry for new doc

virtual void handleEndDoc ()
 finish entry for current doc

virtual char * handleWord (char *word)
 Add start and end byte offsets for this term to the list of offsets.

virtual void setParser (lemur::api::Parser *p)
 set myParser to p

virtual lemur::api::Parser * getParser () const
 returns a handle to a Parser object that can handle parsing the raw format of these documents

virtual void buildMgr ()
virtual const string & getMyID () const
 return name of this document manager, with the file extension (.bdm).

vector< Match > getOffsets (const string &docID) const
virtual bool open (const string &manname)
 Open and load the toc file manname.


Protected Member Functions

virtual void writeTOC ()
virtual bool loadTOC ()
bool loadFTFiles (const string &fn, int num)

Protected Attributes

lemur::api::Parser * myparser
vector< Match > offsets
int numdocs
string pm
lemur::file::Keyfile poslookup
lemur::file::Keyfile doclookup
int dbcache
btl docEntry
char * myDoc
int doclen
string IDname
string IDnameext
vector< string > sources
int numOldSources
 how many sources already processed?

int fileid
bool ignoreDoc
 are we ignoring this document?

bool _readOnly
 are we read only.


Detailed Description

Document manager using Keyfile for data storage. In addition to providing access to raw document text, it stores the start and end byte offsets of each token within the document. These offsets are useful for building passage windows or for highlighting query term matches. Implements the TextHandler interface for building the manager.
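The highlighting use case can be illustrated with a minimal, self-contained sketch. It does not use the Lemur headers: TokenSpan is a hypothetical stand-in for a stored token offset pair, and the sketch assumes the end offset is exclusive, which may differ from how the library's Match records it. The function name highlight and the bracket markers are likewise illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one token's byte span (not the library's Match).
// Assumption: `end` is exclusive, i.e. one past the token's last byte.
struct TokenSpan {
    std::size_t start;
    std::size_t end;
};

// Wrap every span listed in `hits` with the given markers.
// `hits` must be sorted by start offset and must not overlap.
std::string highlight(const std::string &text,
                      const std::vector<TokenSpan> &hits,
                      const std::string &open = "[",
                      const std::string &close = "]") {
    std::string out;
    std::size_t cursor = 0;  // first byte of `text` not yet copied
    for (const TokenSpan &h : hits) {
        assert(cursor <= h.start && h.start <= h.end && h.end <= text.size());
        out.append(text, cursor, h.start - cursor);  // text before the hit
        out += open;
        out.append(text, h.start, h.end - h.start);  // the matched token
        out += close;
        cursor = h.end;
    }
    out.append(text, cursor, std::string::npos);     // remaining tail
    return out;
}
```

In real use, the text would come from something like getDoc and the spans from the offsets of the matched query terms; here both are supplied by the caller.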


Constructor & Destructor Documentation

lemur::parse::KeyfileDocMgr::KeyfileDocMgr ( ) [inline]
 

default constructor

lemur::parse::KeyfileDocMgr::KeyfileDocMgr (const string & name, bool readOnly = true)
 

Constructor for opening an existing manager. name is the toc file for this manager (same as getMyID).

lemur::parse::KeyfileDocMgr::KeyfileDocMgr (string name, string mode, string source)
 

Constructor for building a new manager.

Parameters:
 name    what to name this manager
 mode    type of parser to use
 source  file with the list of files this manager will manage

lemur::parse::KeyfileDocMgr::~KeyfileDocMgr ( ) [virtual]
 


Member Function Documentation

void lemur::parse::KeyfileDocMgr::buildMgr ( ) [virtual]
 

Build the document manager tables from the files previously provided in the constructor.

Implements lemur::api::DocumentManager.

char * lemur::parse::KeyfileDocMgr::getDoc (const string & docID) const [virtual]
 

return the document associated with this ID

Implements lemur::api::DocumentManager.

virtual const string& lemur::parse::KeyfileDocMgr::getMyID ( ) const [inline, virtual]
 

return name of this document manager, with the file extension (.bdm).

Implements lemur::api::DocumentManager.

vector< lemur::parse::Match > lemur::parse::KeyfileDocMgr::getOffsets (const string & docID) const
 

Get the array of Match entries for the tokens in the document named docID. The entries are indexed by token position (as recorded in a TermInfoList object).
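Because the entries are indexed by token position, a passage window around a matched token can be mapped back to a byte range in the raw text. The following is a sketch under stated assumptions rather than library code: TokenSpan is a hypothetical stand-in for a Match entry with an exclusive end offset, positions are treated as zero-based vector indices, and passageAround is an illustrative name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one entry of the offsets array.
// Assumption: `end` is exclusive and entries are in token order.
struct TokenSpan {
    std::size_t start;
    std::size_t end;
};

// Return the raw text covering the token at position `hitTok` plus up to
// `radius` tokens on each side, clamped to the document boundaries.
std::string passageAround(const std::string &text,
                          const std::vector<TokenSpan> &spans,
                          std::size_t hitTok,
                          std::size_t radius) {
    assert(hitTok < spans.size());
    const std::size_t lastTok = spans.size() - 1;
    // Clamp the window without underflow or overflow of the unsigned indices.
    const std::size_t first = hitTok >= radius ? hitTok - radius : 0;
    const std::size_t last =
        radius < lastTok - hitTok ? hitTok + radius : lastTok;

    const std::size_t from = spans[first].start;
    const std::size_t to = spans[last].end;
    assert(from <= to && to <= text.size());
    return text.substr(from, to - from);
}
```

Working in token positions and converting to bytes only at the end keeps the window size meaningful (a number of terms) while still returning the original, unnormalized text.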

virtual lemur::api::Parser* lemur::parse::KeyfileDocMgr::getParser ( ) const [inline, virtual]
 

returns a handle to a Parser object that can handle parsing the raw format of these documents

Implements lemur::api::DocumentManager.

char * lemur::parse::KeyfileDocMgr::handleDoc (char * docno) [virtual]
 

add entry for new doc

Reimplemented from lemur::api::TextHandler.

void lemur::parse::KeyfileDocMgr::handleEndDoc ( ) [virtual]
 

finish entry for current doc

Reimplemented from lemur::api::TextHandler.

virtual char* lemur::parse::KeyfileDocMgr::handleWord (char * word) [inline, virtual]
 

Add start and end byte offsets for this term to the list of offsets.

Reimplemented from lemur::api::TextHandler.
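The three handlers above form a callback sequence: a new document starts an entry, each word appends its offsets, and the end of the document finishes the entry. The sketch below mimics that sequence with stand-in types only. OffsetRecorder, TokenSpan, offsetsFor and the whitespace tokenizer parseInto are all hypothetical, and unlike the documented handleWord (char *word), this stand-in receives the offsets directly, because the way the real class obtains them from its parser is not described here.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical token span; `end` is exclusive in this sketch.
struct TokenSpan {
    std::size_t start;
    std::size_t end;
};

// Stand-in handler that mirrors the documented callback names.
class OffsetRecorder {
public:
    // Start an entry for a new document and discard any pending offsets.
    void handleDoc(const std::string &docno) {
        current_ = docno;
        pending_.clear();
    }
    // Record the byte span of one token of the current document.
    void handleWord(std::size_t start, std::size_t end) {
        assert(start <= end);
        pending_.push_back({start, end});
    }
    // Finish the entry: store the collected spans under the document ID.
    void handleEndDoc() { table_[current_] = pending_; }

    // Look up the spans of a finished document (throws if unknown).
    const std::vector<TokenSpan> &offsetsFor(const std::string &docno) const {
        return table_.at(docno);
    }

private:
    std::string current_;
    std::vector<TokenSpan> pending_;
    std::map<std::string, std::vector<TokenSpan> > table_;
};

// Toy whitespace tokenizer that drives the callbacks for one document.
void parseInto(OffsetRecorder &rec, const std::string &docno,
               const std::string &text) {
    rec.handleDoc(docno);
    std::size_t i = 0;
    while (i < text.size()) {
        while (i < text.size() &&
               std::isspace(static_cast<unsigned char>(text[i]))) ++i;
        const std::size_t start = i;
        while (i < text.size() &&
               !std::isspace(static_cast<unsigned char>(text[i]))) ++i;
        if (i > start) rec.handleWord(start, i);
    }
    rec.handleEndDoc();
}
```

Resetting the pending list in handleDoc keeps offsets from one document from leaking into the next, which is the property the per-document entries depend on.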

bool lemur::parse::KeyfileDocMgr::loadFTFiles (const string & fn, int num) [protected]
 

bool lemur::parse::KeyfileDocMgr::loadTOC ( ) [protected, virtual]
 

Reimplemented in lemur::parse::ElemDocMgr.

virtual bool lemur::parse::KeyfileDocMgr::open (const string & manname) [inline, virtual]
 

Open and load the toc file manname.

Implements lemur::api::DocumentManager.

Reimplemented in lemur::parse::ElemDocMgr.

virtual void lemur::parse::KeyfileDocMgr::setParser (lemur::api::Parser * p) [inline, virtual]
 

set myParser to p

void lemur::parse::KeyfileDocMgr::writeTOC ( ) [protected, virtual]
 

Reimplemented in lemur::parse::ElemDocMgr.


Member Data Documentation

bool lemur::parse::KeyfileDocMgr::_readOnly [protected]
 

are we read only.

int lemur::parse::KeyfileDocMgr::dbcache [protected]
 

btl lemur::parse::KeyfileDocMgr::docEntry [protected]
 

int lemur::parse::KeyfileDocMgr::doclen [protected]
 

lemur::file::Keyfile lemur::parse::KeyfileDocMgr::doclookup [mutable, protected]
 

int lemur::parse::KeyfileDocMgr::fileid [protected]
 

string lemur::parse::KeyfileDocMgr::IDname [protected]
 

string lemur::parse::KeyfileDocMgr::IDnameext [protected]
 

bool lemur::parse::KeyfileDocMgr::ignoreDoc [protected]
 

are we ignoring this document?

char* lemur::parse::KeyfileDocMgr::myDoc [protected]
 

lemur::api::Parser* lemur::parse::KeyfileDocMgr::myparser [protected]
 

int lemur::parse::KeyfileDocMgr::numdocs [protected]
 

int lemur::parse::KeyfileDocMgr::numOldSources [protected]
 

how many sources already processed?

vector<Match> lemur::parse::KeyfileDocMgr::offsets [mutable, protected]
 

string lemur::parse::KeyfileDocMgr::pm [protected]
 

lemur::file::Keyfile lemur::parse::KeyfileDocMgr::poslookup [mutable, protected]
 

vector<string> lemur::parse::KeyfileDocMgr::sources [protected]
 


The documentation for this class was generated from the following files:
Generated on Tue Jun 15 11:03:06 2010 for Lemur by doxygen 1.3.4