Built-In Solr Index Replication with Solr 1.4

Replication has always been one of Solr’s cooler features, but it’s been hampered by the Unix features it employs. Unix scripts mixed in with run-(almost)-anywhere Java are enough to make anyone sigh. Users of Solr on Windows have been somewhat left out in the cold. That’s all changing, though, because Solr 1.4 will bring a new, built-in replication feature that works as a Solr RequestHandler.

The authors of the new RequestHandler have posted…

SF Bay area Lucene/Solr Meetup

Wednesday, 3 June 2009
18:30 to 21:30

There is going to be a meetup in the SF Bay Area on Lucene and Solr on June 3. See http://www.meetup.com/SFBay-Lucene-Solr-Meetup/ for more details.

We’re working on getting a bigger room, so please put your name on the waiting list if you still want to attend.

Filtered query performance increases for Solr 1.4

One of the many performance improvements in the upcoming Solr 1.4 release concerns filtering. Solr 1.4 filters are faster (anywhere from 30% to 80% faster to calculate intersections, depending on configuration), take less memory (40% smaller), and are applied more efficiently to the query during a search.

In previous Solr releases, filters were applied after the main query and thus had little impact on overall query performance. Filters are now checked in…
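
To make the idea of "calculating intersections" concrete, here is a minimal Python sketch of intersecting two sorted lists of document ids, one for the main query and one for a filter. It only illustrates the general technique; it is not Solr's implementation, and the function name and sample ids are made up for this example.

```python
def intersect_sorted(a, b):
    """Return the ids present in both sorted lists, using a two-pointer merge."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            # Same document id on both sides: it survives the filter.
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


# Hypothetical doc ids, purely for illustration.
query_hits = [2, 5, 8, 13, 21]
filter_docs = [1, 2, 3, 5, 13, 34]
print(intersect_sorted(query_hits, filter_docs))
```

The merge visits each list once, so the cost grows with the combined length of the two lists rather than with their product.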

Accessing words around a positional match in Lucene

From time to time, users on the Lucene mailing list ask a variant of the following question:

Given a term match in a document, what’s the best way to get a window of words around that match?

Getting a window of words around a match can be useful for a lot of things, including, to name a few:

  1. Highlighting (although I’d recommend using Lucene’s Highlighter package for that)
  2. Co-occurrence analysis
  3. Sentiment analysis
  4. Question Answering
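
As a rough illustration of what a "window of words around a match" means, here is a small Python sketch that works on an already-tokenized text. It does not use Lucene or PyLucene; the function name, the whitespace tokenization, and the window size are assumptions made for this example only.

```python
def windows_around(tokens, term, size=2):
    """Return (position, window) pairs for every occurrence of term.

    Each window holds up to `size` tokens on either side of the match,
    clipped at the start and end of the token list.
    """
    results = []
    for pos, tok in enumerate(tokens):
        if tok == term:
            start = max(0, pos - size)
            end = min(len(tokens), pos + size + 1)
            results.append((pos, tokens[start:end]))
    return results


text = "the quick brown fox jumps over the lazy dog"
tokens = text.lower().split()  # naive whitespace tokenization
for pos, window in windows_around(tokens, "fox", size=2):
    print(pos, window)
```

In a real index the token positions would come from the index rather than from re-tokenizing the text, but the windowing arithmetic is the same.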

PyLucene 2.4.1 Released

Here’s the announcement from the PyLucene team:
This is a refresher release of Apache PyLucene 2.4.1 that addresses a few bugs and annoyances:

http://svn.apache.org/repos/asf/lucene/pylucene/tags/pylucene_2_4_1/CHANGES

http://svn.apache.org/repos/asf/lucene/pylucene/tags/pylucene_2_4_1/jcc/CHANGES

Apache PyLucene 2.4.1 is available from the following download page:
http://www.apache.org/dyn/closer.cgi/lucene/pylucene/pylucene-2.4.1-2-src.tar.gz

When downloading from a mirror site, please remember to verify the downloads using signatures found on the Apache site:
http://www.apache.org/dist/lucene/pylucene/KEYS

For more information on Apache PyLucene, visit the project home page:
http://lucene.apache.org/pylucene

Lucid presents “Open Source Search Powered by Solr” at the upcoming Basis Technology Government Users Conference

Monday, 8 June 2009 to Tuesday, 9 June 2009

Basis Technology 2009 Government Users Conference

Lucid will be presenting “Open Source Search Powered by Solr” at the upcoming Basis Technology Government Users Conference.

Apache Solr, an open source search engine built on Lucene, powers search and findability in many enterprise and government sector applications. Solr features incredibly fast indexing and querying speeds, top-notch relevancy and scoring flexibility, faceting, spell checking, distributed search, and much more. Not only can Solr

Exploring Lucene and Solr’s TrieRange Capabilities

Recently, Uwe Schindler and others have added a new capability to Lucene and Solr to make working with numeric ranges a lot faster. I haven’t tried out this new functionality yet, so I thought I would walk through it here and explore its capabilities.

Since Lucene treats almost everything as Strings, encoding numbers and dates and then utilizing them in ranges has always required a little extra work to make it perform well. Previously, one…
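
To see why numbers stored as strings need extra work, here is a short Python sketch that compares plain string ordering with zero-padded ordering. Zero-padding is one common workaround and is shown only to illustrate the problem; it is not the TrieRange technique itself, and the helper name and width are assumptions made for this example.

```python
def pad(n, width=10):
    """Encode a non-negative integer as a fixed-width, zero-padded string."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    return str(n).zfill(width)


values = [9, 10, 100, 2]

# Plain strings sort character by character, so "10" sorts before "9".
naive = sorted(str(v) for v in values)

# Fixed-width padding makes string order agree with numeric order.
padded = sorted(pad(v, width=4) for v in values)

print(naive)
print(padded)
```

With padding, a string range such as "0009" to "0100" covers exactly the numbers from 9 to 100, which is what a numeric range query needs.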

Lucene/Solr Meetup / May 20th, Reston VA, 6-8:30 pm

Wednesday, 20 May 2009
18:00 to 20:30

http://www.meetup.com/NOVA-Lucene-Solr-Meetup/ NEW: Live Webcast

Join us for an evening of presentations and discussion on
Lucene/Solr, the Apache Open Source Search Engine/Platform, featuring:

  • Erik Hatcher, Lucid Imagination, Apache Lucene/Solr PMC: Solr power your data: How to get up and running in 20 minutes or less
  • Ryan McKinley, Apache Lucene/Solr PMC: Geo Search with Solr and Voyager
  • Dan Chudnov, Library of Congress: The World Digital Library — Solr searches across