
PageRank application

PageRank parameters (pagerank)

corpus
The pathname of the file or directory containing documents to index. Specified as <corpus>/path/to/file_or_directory</corpus> in the parameter file and as -corpus=/path/to/file_or_directory on the command line.
links
The pathname of the directory containing the sorted link data, produced by the harvestlinks program, for the documents specified in corpus. Specified as <links>/path/to/links</links> in the parameter file and as -links=/path/to/links on the command line.

output
Basename for the output files.
index
Index to use to obtain the collection size and internal document ids. Default is none. When no index is given, the corpus is scanned to count the documents and the string document ids are used.
docs
Number of documents to process per iteration. Default is 1000. This parameter is ignored if an index parameter is provided; in that case all documents are used in each iteration.

iters
Number of iterations to use when estimating the PageRank. Default is 10 if no index parameter is provided, otherwise 100.

c
Damping parameter. Default is 0.5 if no index parameter is provided, otherwise 0.85.

writeRaw
Write the raw PageRank scores to <output>.raw.
writeRanks
Write the integer PageRank scores [1..10] to <output>.ranks.
writePriors
Write the log probability PageRank scores to <output>.priors. This data file is suitable for input to the makeprior application.
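
To illustrate how the iters and c parameters interact, and what a log probability score looks like, here is a minimal Python sketch of a textbook PageRank power iteration. It is not the Lemur implementation: the function names pagerank and log_priors, the in-memory links dictionary, and the uniform handling of documents without out-links are all assumptions made for illustration. The mapping to integer scores [1..10] is not covered, since the parameter descriptions above do not specify it.

```python
import math


def pagerank(links, c=0.85, iters=100):
    """Textbook PageRank by power iteration (illustrative, not Lemur's code).

    links: dict mapping a document id to the list of ids it links to.
    c:     damping parameter, playing the role of the c parameter.
    iters: number of iterations, playing the role of the iters parameter.
    """
    # Collect every document that appears as a source or a target.
    docs = set(links)
    for targets in links.values():
        docs.update(targets)
    docs = sorted(docs)
    n = len(docs)

    # Start from a uniform distribution over documents.
    scores = {d: 1.0 / n for d in docs}
    for _ in range(iters):
        # Assumption: score held by documents with no out-links
        # is redistributed uniformly over all documents.
        dangling = sum(scores[d] for d in docs if not links.get(d))
        nxt = {d: (1.0 - c) / n + c * dangling / n for d in docs}
        for src, targets in links.items():
            if targets:
                share = c * scores[src] / len(targets)
                for t in targets:
                    nxt[t] += share
        scores = nxt
    return scores


def log_priors(scores):
    """Normalize raw scores to probabilities and take natural logs."""
    total = sum(scores.values())
    return {d: math.log(s / total) for d, s in scores.items()}
```

For example, pagerank({"a": ["b"], "b": ["a", "c"], "c": []}, c=0.85, iters=100) returns one raw score per document, summing to 1, and log_priors turns those into non-positive log probabilities. A smaller c pulls every score toward the uniform value 1/n, while a larger c lets the link structure dominate and generally needs more iterations to settle.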

Generated on Tue Jun 15 11:02:58 2010 for Lemur by doxygen 1.3.4