lemur::parse::ArabicParser Class Reference

#include <ArabicParser.hpp>

Inheritance diagram for lemur::parse::ArabicParser:

lemur::parse::ArabicParser inherits lemur::api::Parser, which in turn inherits lemur::api::TextHandler.

Public Member Functions

 ArabicParser ()
void parseFile (const string &filename)
 Parse a file.

void parseBuffer (char *buf, int len)
 Parse a buffer of length len.

long fileTell () const

Static Public Attributes

const string identifier = "arabic"

Private Member Functions

void doParse ()
 Actual parsing action flow.

Private Attributes

int state
 The state of the parser.

Detailed Description

Parses Arabic documents in NIST's TREC format, encoded in Windows CP1256.

The following fields are parsed: TEXT, HL, HEAD, HEADLINE, LP, TTL, HEADER, FOOTER.
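To make the field list concrete, the following is a minimal standalone sketch of pulling those fields out of TREC-style markup. It is not the Lemur implementation: the helper name extractFields and the constant kFields are hypothetical, and the sketch assumes that each field is delimited by an ASCII <NAME> ... </NAME> pair and that fields do not nest. Field content is returned as raw bytes, so CP1256 text passes through undecoded.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Field names taken from the class description above.
const std::array<std::string, 8> kFields = {
    "TEXT", "HL", "HEAD", "HEADLINE", "LP", "TTL", "HEADER", "FOOTER"};

// Hypothetical helper (not part of Lemur): returns (field name, raw content)
// pairs in order of appearance. Tags outside kFields are skipped.
std::vector<std::pair<std::string, std::string>>
extractFields(const std::string& doc) {
    std::vector<std::pair<std::string, std::string>> out;
    std::size_t pos = 0;
    while ((pos = doc.find('<', pos)) != std::string::npos) {
        const std::size_t end = doc.find('>', pos);
        if (end == std::string::npos) break;  // unterminated tag: stop
        const std::string name = doc.substr(pos + 1, end - pos - 1);

        bool wanted = false;
        for (const auto& field : kFields) {
            if (field == name) wanted = true;
        }
        if (!wanted) {  // not one of the listed fields: keep scanning
            pos = end + 1;
            continue;
        }

        const std::string closeTag = "</" + name + ">";
        const std::size_t close = doc.find(closeTag, end + 1);
        if (close == std::string::npos) break;  // missing close tag: stop
        out.emplace_back(name, doc.substr(end + 1, close - end - 1));
        pos = close + closeTag.size();
    }
    return out;
}
```

Called on a document such as <DOC><DOCNO>A1</DOCNO><HEADLINE>abc</HEADLINE><TEXT>x y</TEXT></DOC>, this sketch yields the HEADLINE and TEXT contents and ignores DOC and DOCNO.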

Constructor & Destructor Documentation

lemur::parse::ArabicParser::ArabicParser ()

Member Function Documentation

void lemur::parse::ArabicParser::doParse () [private]

Actual parsing action flow.

long lemur::parse::ArabicParser::fileTell () const [virtual]

Returns the current byte offset into the file being parsed. Do not use with parseBuffer.

Implements lemur::api::Parser.

void lemur::parse::ArabicParser::parseBuffer (char * buf, int len) [virtual]

Parse a buffer of length len.

Implements lemur::api::Parser.

void lemur::parse::ArabicParser::parseFile (const string & filename) [virtual]

Parse a file.

Implements lemur::api::Parser.

Member Data Documentation

const string lemur::parse::ArabicParser::identifier = "arabic" [static]

Reimplemented from lemur::api::Parser.

int lemur::parse::ArabicParser::state [private]

The state of the parser.

The documentation for this class was generated from the following files:
