Lucene 2.9.2 and 3.0.1

The vote is on for what I think is a Lucene first: two simultaneous bug fix releases. Because the Lucene 2 series is the last to support Java 1.4, we are doing a bug fix release for 2.9 as well as for the recently released 3.0, which requires Java 1.5.

A little preview from the proposed release announcement:

Important improvements in these releases include an increased maximum number of unique terms in each index segment. They…


Lucene 2.9.1 Released

We were all so caught up in the fun at ApacheCon that no one announced the Lucene 2.9.1 release. It’s out, and it’s highly recommended if you are currently on 2.9.0. Check it out: http://lucene.apache.org/java/docs/#6+November+2009+-+Lucene+Java+2.9.1+available

To learn more about what’s new in the Lucene 2.9.1 release, check out these resources:


Solr 1.4 Available on Some Mirrors Already

Solr 1.4 won’t officially be released until tomorrow when the announcement goes out, but the official dist has already found its way onto some of the mirrors. Try your luck if you’re antsy ;)


Lucene 2.9.1 about to be released

Lucene 2.9.1 should hit the streets this week. 2.9.1 will be a bug fix release and includes a number of important fixes to the Lucene 2.9 release. You can see a list of the current 2.9.1 issues in JIRA here: JIRA 2.9.1 Issues. A couple of these bugs are quite nasty, so this is a highly recommended upgrade.

Why Lucene 2.9.1 rather than simply releasing the almost finished Lucene 3.0 with these bug fixes? After all,…


Lucene 2.9 is released

Hello Lucene users,
On behalf of the Lucene dev community (a growing community far larger
than just the committers) I would like to announce the release of
Lucene 2.9.
 
While we generally try and maintain full backwards compatibility
between major versions, Lucene 2.9 has a variety of breaks that are
spelled out in the ‘Changes in backwards compatibility policy’ section
of CHANGES.txt.
 
We recommend that you recompile your application with Lucene 2.9
rather than attempting to “drop” it in. This will alert you to…


Contrived FieldCache Load Test: Lucene 2.4 VS Lucene 2.9

*edit* Sorry, I jumped the gun with my original test code here: you need to close the IndexWriter after the optimize! The gains are only with multi-segment indexes. Corrected entry follows:

Let’s do a little test. We will load up a FieldCache with 5,000,000 unique strings and see how long it takes Lucene 2.4 in comparison to Lucene 2.9.

Let’s use my quad core laptop and the following test code:

public class ContrivedFCTest extends TestCase {
  public void testLoadTime() throws Exception {
    Directory dir = FSDirectory.getDirectory(System.getProperty("java.io.tmpdir") + File.separator + "test");
    IndexWriter writer = new IndexWriter…


Lucene 2.9 Release Vote Has Begun

It took a couple more RCs than I guessed (5 total), but the final vote candidate is up, and unless something critical is found during the three-day vote process, Lucene 2.9, almost a year in the making, will be available by the end of the week.

http://search.lucidimagination.com/search/document/f15d32710b70ca6b/vote_release_lucene_2_9_0


Java Garbage Collection Boot Camp (Draft)

I’m working on a Garbage Collection article, and I figured I’d share an early rough draft:

It’s not often the case, but sometimes when working with a large and busy Solr/Lucene installation, Garbage Collection becomes a bottleneck. This guide is meant to help you relieve that bottleneck should it arise.

Garbage collection in Java is the process of freeing the memory used by objects that are no longer in use. In C or C++ you would be…


Lucene 2.9 Release Imminent

The third release candidate for Lucene 2.9 is about to hit and the final release is likely to be only days behind. Almost one year in the making, Lucene 2.9 is feature packed and progressively faster. With Solr 1.4 planning to release very shortly after 2.9, things are shaping up very nicely in Lucene land. Congrats to all the devs involved in both releases – I really think this is the culmination of some really…


The SpanQuery

SpanQuerys allow for nested, positional restrictions when matching documents in Lucene. SpanQuerys are much like PhraseQuerys or MultiPhraseQuerys in that they all restrict term matches by position, but SpanQuerys can be much more expressive.

The basic SpanQuery units are the SpanTermQuery and the SpanNearQuery.
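
To make the positional idea concrete, here is a toy sketch in plain Java. It does not use Lucene and is not how Lucene implements spans; the class `SpanSketch` and the methods `termSpans` and `nearOrdered` are hypothetical names invented for this illustration, and the slop handling is a simplification. It only shows how term-level spans can be combined into "near" spans, and how a near span can itself be nested inside another one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SpanSketch {

    // Toy analogue of a term span: every position where `term` occurs,
    // reported as a one-token span {start, end} with end exclusive.
    public static List<int[]> termSpans(String[] tokens, String term) {
        List<int[]> spans = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            if (tokens[i].equals(term)) {
                spans.add(new int[] {i, i + 1});
            }
        }
        return spans;
    }

    // Toy analogue of an ordered "near" span: keep each pair where the first
    // span ends at or before the second begins, with at most `slop` token
    // positions between them. The result covers both sub-spans, so it can be
    // fed back in as an input (this is the nesting).
    public static List<int[]> nearOrdered(List<int[]> first, List<int[]> second, int slop) {
        List<int[]> spans = new ArrayList<>();
        for (int[] a : first) {
            for (int[] b : second) {
                int gap = b[0] - a[1];
                if (gap >= 0 && gap <= slop) {
                    spans.add(new int[] {a[0], b[1]});
                }
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        String[] tokens = "the quick brown fox jumps over the lazy dog".split(" ");

        // "quick" followed by "fox" with at most one token in between.
        List<int[]> quickFox = nearOrdered(
                termSpans(tokens, "quick"), termSpans(tokens, "fox"), 1);

        // Nested: the (quick ... fox) span followed by "dog" within 5 tokens.
        List<int[]> nested = nearOrdered(quickFox, termSpans(tokens, "dog"), 5);

        for (int[] span : nested) {
            System.out.println(Arrays.toString(span)); // prints [1, 9]
        }
    }
}
```

A phrase-style match can only say "these terms, in this order, roughly this close together". The nested call above expresses "a (quick near fox) group, followed by dog", which is the kind of composition the span family is meant for.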

A SpanTermQuery is the most basic SpanQuery, and simply lets you specify a field, term, and boost by passing in a Term, just like a TermQuery. SpanTermQuery is used as a basic building…
