News
The Lemur Toolkit is transitioning from lemurproject.org to SourceForge.net for various project services. Visit The Lemur Toolkit to see the project's pages. The first item we will transition is the announcements mailing list. Subsequently we will transition the discussion forum, CVS for the Lemur Toolkit and Indri, and the tracking system for bugs, feature requests, and support.
Bug Submissions, Feature Requests and Support
As part of the migration of portions of the Lemur Toolkit to SourceForge, we have recently opened up bug tracking, feature requests, and support requests so that you can directly submit these items to us, the developers.
Browsing of bugs, feature requests, and support requests is open to anyone, but to add a bug, feature request, or support request you need an account on SourceForge. If you do not have an account, you can create one here.
- Submit a bug or browse the list of open bugs
- Submit a new feature request or browse the current list of feature requests
- Submit a new request for support or browse the current support requests
Release Schedule
- 6/23/2008: Lemur Toolkit version 4.7 and Indri version 2.7 released!
Current Release Notes:
- 4.7 corrects various issues in the 4.6 distribution package and adds several new utilities. See the release notes for complete details.
- Applications compiled with the Lemur Toolkit require the following libraries: z, iberty, pthread, and m on Linux, and additionally socket and nsl on Solaris. Applications built in Visual Studio require the additional library wsock32.lib. The Java jar files were built with Java 5 (JDK 1.5.0). The Java UIs require Java 5. We have tested using GCC 3.2 (Solaris), 3.2.2 (Linux), 3.4 (Linux), 3.4.3 (Linux x86_64), 4.0.2 (Linux), VC++ .NET 7.1 (Windows XP), and Visual Studio 2005 (Windows XP).
Known Problems:
This is a list of bugs and known problems with the current versions of Lemur (4.7) and Indri (2.7). Many problems have fixes or workarounds that are posted on the Lemur Forums. There may also be open bug tickets on SourceForge; see https://sourceforge.net/tracker/?group_id=161383&atid=819615 for the complete tickets. Please check there if you do not see something here.
- No known problems currently exist.
Enhancements (4.7):
- ireval: Implemented in Java, ireval provides a command line interface similar to trec_eval. In addition to all of the metrics provided by trec_eval, ireval also computes Normalized Discounted Cumulative Gain (NDCG). Additionally, ireval can be used to compare the performance of a pair of system outputs, providing the paired t-test, Wilcoxon's sign test, and randomization test for statistical significance testing. ireval's output has been validated against trec_eval v8.1. A future version of ireval will include a graphical user interface and report generation.
- PageRank: Computes floating point raw scores, binned integer page ranks, and prior probabilities suitable for installation in an Indri repository via the makeprior application.
- Harvestlinks: Version 4.7 includes an updated version of the Harvestlinks utility. The updated version uses the existing keyfile code that is actively maintained within the Lemur Toolkit and streamlines the link extraction and sorting process. It also provides speed improvements, especially when harvesting links from large data sets. The rewritten code and standardized classes will also make the code easier to maintain for future optimizations.
- Query log toolbar: The Lemur Query Log Toolbar is a Firefox add-on that monitors a variety of user actions, collects the data, and allows the aggregation of these logs through a Query Log Server. On the client side, a set of configuration options enables researchers and users to specify toolbar behavior. Several privacy filters are in place that let users specify information that will not be collected or shared. A version of the toolbar for Internet Explorer is planned for the December 2008 release.
- Query Log Toolbar Server: On the server side, the Lemur transaction database component, a set of Java servlets, aggregates the user log data into a MySQL database. The servlets, hosted in a servlet container such as Apache Tomcat, also include several configuration options for specifying the database connection and the levels of privacy allowed on the client toolbar side. Also included is an application to run a server if you do not have a servlet container.
- Indri SOAP server: Provides a web service that allows clients to add a document to an index, delete a document from an index, retrieve document vectors from an index, and query an index in a language-independent fashion. This enables building user interfaces that access Indri indexes in the developer's language of choice, such as Python, PHP 5, Ruby, C#, or Java.
- Relevance judgment UI: To support creating relevance data for evaluating queries and datasets, we have augmented the Java-based retrieval UI for the Lemur Toolkit to allow a user to specify scores for query results. The queries, scores, and resulting document IDs can then be exported to a qrel file on disk for use with the standard trec_eval utility. Furthermore, previous qrel judgments can be loaded and manipulated by the program before being saved to disk.
- Mac OS X Intel binary image: A Mac OS X Intel binary install disk image has been added for users who just want to run the Lemur Toolkit applications.
- Lemur CGI: Various query speedups, URL normalization, and code tidying for Indri-style searches.
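
For readers unfamiliar with Normalized Discounted Cumulative Gain (NDCG), the metric that ireval adds on top of the trec_eval metrics, the following Python sketch shows one common formulation computed from judgments in the TREC-style qrel layout (query ID, iteration, document ID, relevance). It is an illustration only and is not taken from ireval: the function names `parse_qrels`, `dcg`, and `ndcg`, the sample data, and the choice of gain (the raw relevance grade) and discount (log2 of rank plus one) are all assumptions, and ireval's exact definitions may differ.

```python
import math


def parse_qrels(lines):
    """Parse TREC-style qrel lines: 'query_id iteration doc_id relevance'."""
    judgments = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        query_id, _iteration, doc_id, relevance = parts
        judgments.setdefault(query_id, {})[doc_id] = int(relevance)
    return judgments


def dcg(gains):
    """Discounted cumulative gain: the gain at 1-based rank i is divided by log2(i + 1)."""
    return sum(g / math.log2(i + 1) for i, g in enumerate(gains, start=1))


def ndcg(ranked_doc_ids, doc_relevance, k=None):
    """NDCG of a ranking; unjudged documents are treated as non-relevant (gain 0)."""
    gains = [max(doc_relevance.get(d, 0), 0) for d in ranked_doc_ids]
    # The ideal ranking lists all judged relevant documents by decreasing grade.
    ideal = sorted((r for r in doc_relevance.values() if r > 0), reverse=True)
    if k is not None:
        gains, ideal = gains[:k], ideal[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical judgments for a single query, "51" (assumed sample data).
qrels = parse_qrels([
    "51 0 docA 2",
    "51 0 docB 0",
    "51 0 docC 1",
])

perfect = ndcg(["docA", "docC", "docB"], qrels["51"])  # best possible order
swapped = ndcg(["docB", "docC", "docA"], qrels["51"])  # worst order of the three
print(perfect, swapped)
```

A ranking that places documents in decreasing order of judged relevance scores 1, and any other ordering scores lower, which is what makes NDCG useful for graded (rather than binary) relevance judgments such as those exported by the relevance judgment UI.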
The Lemur Project
Last modified: June 20, 2008. 13:06:10 pm

