Web Graph

This page provides the web graphs for the entire ClueWeb09 dataset and for the TREC 2009 Category B subset (the first 50 million English pages).
If they are not included on the hard disks sent for dataset distribution, they can be downloaded from the links provided here.

The statistics for the web graph are as follows:

Webgraph Format

Full Dataset

The webgraph for the full dataset consists of the following files:

TREC Category B

The webgraph for the TREC Category B subset (first 50 million English pages) consists of the following files: