
PageRank application

PageRank parameters (pagerank)

The pathname of the file or directory containing documents to index. Specified as <corpus>/path/to/file_or_directory</corpus> in the parameter file and as -corpus=/path/to/file_or_directory on the command line.
The pathname of the directory containing the sorted link data for the documents specified in corpus, as produced by the harvestlinks program. Specified as <links>/path/to/links</links> in the parameter file and as -links=/path/to/links on the command line.
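
The two forms described above (an element in the parameter file, or a -name=value argument on the command line) can be sketched with a small helper. This is only an illustration: the function names are hypothetical, and the enclosing <parameters> root element is an assumption based on the usual Lemur/Indri parameter file convention, which this page does not state.

```python
import xml.etree.ElementTree as ET


def to_parameter_file(params, root_tag="parameters"):
    """Render parameters as XML elements, e.g. <corpus>...</corpus>.

    The root element name is an assumption, not taken from this page.
    """
    root = ET.Element(root_tag)
    for name, value in params.items():
        ET.SubElement(root, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


def to_command_line(params):
    """Render the same parameters as -name=value arguments."""
    return [f"-{name}={value}" for name, value in params.items()]


# Placeholder paths copied from the parameter descriptions above.
example = {
    "corpus": "/path/to/file_or_directory",
    "links": "/path/to/links",
}
parameter_xml = to_parameter_file(example)
command_args = to_command_line(example)
```

Either form should carry the same information; which one is more convenient likely depends on how many parameters are being set.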

Basename for the output files (the <output> prefix in the file names listed below).
Index to use to get the collection size and internal document IDs. Default is none. When none, the corpus is scanned to count the number of documents, and the string document IDs are used.
Number of documents to process per iteration. Default is 1000. This parameter is ignored if an index parameter is provided; in that case all documents are used for each iteration.

Number of iterations to use when estimating the PageRank scores. Default is 10 if no index parameter is provided, otherwise 100.

Damping factor. Default is 0.5 if no index parameter is provided, otherwise 0.85.
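
To show how the iteration count and the damping factor interact, here is a minimal sketch of the textbook PageRank power iteration. It is not the Lemur implementation, which this page does not describe: the function name, the in-memory dictionary of links, and the uniform handling of documents without outgoing links are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Textbook power iteration (illustrative, not Lemur's code).

    links maps a document id to the list of document ids it links to.
    """
    docs = set(links)
    for targets in links.values():
        docs.update(targets)
    n = len(docs)

    # Start from a uniform distribution over all documents.
    scores = {d: 1.0 / n for d in docs}

    for _ in range(iterations):
        # Score held by documents with no outgoing links is spread uniformly.
        dangling = sum(scores[d] for d in docs if not links.get(d))
        base = (1.0 - damping) / n + damping * dangling / n
        new_scores = {d: base for d in docs}

        # Each document passes a damped, equal share to every link target.
        for src, targets in links.items():
            if targets:
                share = damping * scores[src] / len(targets)
                for t in targets:
                    new_scores[t] += share
        scores = new_scores
    return scores


# Toy graph: "c" is linked from both "a" and "b".
toy_links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
toy_scores = pagerank(toy_links, damping=0.85, iterations=100)
```

In this formulation a lower damping factor pulls every score toward the uniform value 1/n, and more iterations bring the estimate closer to its fixed point.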

Write the raw PageRank scores to <output>.raw.
Write the integer PageRank scores, in the range 1 to 10, to <output>.ranks.
Write the log probability PageRank scores to <output>.priors. This data file is suitable for input to the makeprior application.
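
As a rough illustration of what a log probability score is, the sketch below normalizes a set of raw scores so they sum to 1 and takes the natural logarithm of each. This page does not say how the application derives the values it writes to <output>.priors, so the normalization, the logarithm base, and the function name are all assumptions.

```python
import math


def to_log_priors(raw_scores):
    """Normalize raw scores to probabilities and take natural logs.

    Illustrative only: the actual conversion used by the application
    is not documented on this page.
    """
    total = sum(raw_scores.values())
    return {doc: math.log(score / total) for doc, score in raw_scores.items()}


# Made-up scores for three hypothetical documents.
example_raw = {"doc1": 0.5, "doc2": 0.3, "doc3": 0.2}
example_priors = to_log_priors(example_raw)
```

Log probabilities are never positive, and a document with a higher raw score keeps a higher (less negative) prior.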

Generated on Tue Jun 15 11:02:58 2010 for Lemur by doxygen 1.3.4