Lucene 2.9.1 Released

We were all so caught up in the fun at ApacheCon that no one announced the Lucene 2.9.1 release. It’s out, and it’s highly recommended if you are currently on 2.9.0. Check it out: http://lucene.apache.org/java/docs/#6+November+2009+-+Lucene+Java+2.9.1+available

To learn more about what’s new in the Lucene 2.9.1 release, check out these resources:

Solr 1.4 Available on Some Mirrors Already

Solr 1.4 won’t officially be released until tomorrow, when the announcement goes out, but the official distribution has already found its way onto some of the mirrors. Try your luck if you’re antsy ;)

Lucene 2.9.1 about to be released

Lucene 2.9.1 should hit the streets this week. 2.9.1 will be a bug fix release and includes a number of important fixes to the Lucene 2.9 release. You can see a list of the current 2.9.1 issues in JIRA here: JIRA 2.9.1 Issues. A couple of these bugs are quite nasty, so this is a highly recommended upgrade.

Why Lucene 2.9.1 rather than simply releasing the almost finished Lucene 3.0 with these bug fixes? After all,…

Lucene 2.9 is released

Hello Lucene users,
On behalf of the Lucene dev community (a growing community far larger
than just the committers) I would like to announce the release of
Lucene 2.9.
 
While we generally try and maintain full backwards compatibility
between major versions, Lucene 2.9 has a variety of breaks that are
spelled out in the ‘Changes in backwards compatibility policy’ section
of CHANGES.txt.
 
We recommend that you recompile your application with Lucene 2.9
rather than attempting to “drop” it in. This will alert you to…

Contrived FieldCache Load Test: Lucene 2.4 VS Lucene 2.9

*edit* Sorry – I jumped the gun with my original test code here – you need to close the IndexWriter after the optimize! The gains are only with multi-segment indexes. Corrected entry follows:

Let’s do a little test. We will load up a FieldCache with 5,000,000 unique strings and see how long it takes Lucene 2.4 in comparison to Lucene 2.9.

Let’s use my quad-core laptop and the following test code:

public class ContrivedFCTest extends TestCase {
  public void testLoadTime() throws Exception {
    Directory dir = FSDirectory.getDirectory(System.getProperty("java.io.tmpdir") + File.separator + "test");
    IndexWriter writer = new IndexWriter…

Lucene 2.9 Release Vote Has Begun

It took a couple more RCs than I guessed (5 total), but the final vote candidate is up, and unless something critical is found during the 3-day vote process, Lucene 2.9, almost a year in the making, will be available by the end of the week.

http://search.lucidimagination.com/search/document/f15d32710b70ca6b/vote_release_lucene_2_9_0

Java Garbage Collection Boot Camp (Draft)

I’m working on a Garbage Collection article – I figured I’d share an early rough draft:

It’s not often the case, but sometimes when working with a large and busy Solr/Lucene installation, Garbage Collection becomes a bottleneck. This guide is meant to help you relieve that bottleneck should it arise.

Garbage collection in Java is the process of freeing the memory used by objects that are no longer in use. In C or C++ you would be…

Lucene 2.9 Release Imminent

The third release candidate for Lucene 2.9 is about to hit, and the final release is likely to be only days behind. Almost one year in the making, Lucene 2.9 is feature-packed and progressively faster. With Solr 1.4 planning to release very shortly after 2.9, things are shaping up very nicely in Lucene land. Congrats to all the devs involved in both releases – I really think this is the culmination of some really…

The SpanQuery

SpanQuerys allow for nested, positional restrictions when matching documents in Lucene. SpanQuerys are much like PhraseQuerys or MultiPhraseQuerys in that they all restrict term matches by position, but SpanQuerys can be much more expressive.

The basic SpanQuery units are the SpanTermQuery and the SpanNearQuery.

A SpanTermQuery is the most basic SpanQuery, and simply lets you specify a field, term, and boost by passing in a Term, just like a TermQuery. SpanTermQuery is used as a basic building…
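To make the idea of positional matching concrete, here is a minimal sketch in plain Java. It does not use the Lucene API: the class `SpanSketch` and its methods `termPositions` and `near` are hypothetical names invented for this illustration. It assumes a document is just an array of tokens, treats each term occurrence as a one-position span, and treats slop as the number of positions allowed between the two terms. Real SpanQuerys handle nesting, multiple clauses, and scoring, none of which is modeled here.

```java
import java.util.ArrayList;
import java.util.List;

public class SpanSketch {

    /** Toy analogue of a term span: every position at which the term occurs. */
    static List<Integer> termPositions(String[] tokens, String term) {
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            if (tokens[i].equals(term)) {
                positions.add(i);
            }
        }
        return positions;
    }

    /**
     * Toy analogue of a two-clause near match: true if some occurrence of
     * first and some occurrence of second have at most slop positions
     * between them. With inOrder set, second must come after first.
     */
    static boolean near(String[] tokens, String first, String second,
                        int slop, boolean inOrder) {
        for (int a : termPositions(tokens, first)) {
            for (int b : termPositions(tokens, second)) {
                if (a == b) {
                    continue; // same position, not two distinct spans
                }
                if (inOrder && b < a) {
                    continue; // wrong order for an ordered match
                }
                int gap = Math.abs(b - a) - 1; // positions between the two terms
                if (gap <= slop) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] doc = "the quick brown fox".split(" ");
        System.out.println(near(doc, "quick", "fox", 1, true));
        System.out.println(near(doc, "quick", "fox", 0, true));
    }
}
```

In this toy model, "quick" and "fox" have one token between them, so the pair matches with a slop of 1 but not with a slop of 0, and reversing the terms only matches when order is not required.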

Bringing the Highlighter back to Wildcard Queries in Solr 1.4

Solr 1.3 and 1.4 moved away from using BooleanQuery expansion for MultiTerm queries in favor of a ConstantScoreQuery approach. In Lucene, a MultiTerm query is a query that expands to match multiple terms based on a given input. Common MultiTerm queries are wildcard, fuzzy, prefix, and range queries. Originally, Lucene supported these MultiTerm queries with an implementation that enumerated the matched terms and then added each as a clause to a BooleanQuery. This is a common…
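The difference between the two strategies can be sketched in plain Java without touching the Lucene API. Everything here is hypothetical and made up for the illustration: the class `MultiTermSketch`, the tiny in-memory `INDEX`, and the methods `expandPrefix` and `matchDocs`. The first method mimics enumerating the matching terms, as the BooleanQuery expansion does. The second mimics collecting only the set of matching documents, which is roughly what a constant-score rewrite ends up with. Note that the second form no longer carries the individual terms.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

public class MultiTermSketch {

    // Hypothetical toy inverted index: term -> ids of documents containing it.
    static final TreeMap<String, int[]> INDEX = new TreeMap<>();
    static {
        INDEX.put("lucene", new int[] {1, 2});
        INDEX.put("lucid", new int[] {2, 3});
        INDEX.put("solr", new int[] {4});
    }

    /** Expansion style: enumerate every indexed term that matches the prefix. */
    static List<String> expandPrefix(String prefix) {
        List<String> terms = new ArrayList<>();
        // The term dictionary is sorted, so matches form one contiguous run.
        for (String term : INDEX.tailMap(prefix, true).keySet()) {
            if (!term.startsWith(prefix)) {
                break;
            }
            terms.add(term);
        }
        return terms;
    }

    /** Constant-score style: keep only the set of matching document ids. */
    static SortedSet<Integer> matchDocs(String prefix) {
        SortedSet<Integer> docs = new TreeSet<>();
        for (Map.Entry<String, int[]> entry : INDEX.tailMap(prefix, true).entrySet()) {
            if (!entry.getKey().startsWith(prefix)) {
                break;
            }
            for (int doc : entry.getValue()) {
                docs.add(doc);
            }
        }
        return docs;
    }

    public static void main(String[] args) {
        System.out.println(expandPrefix("luc"));
        System.out.println(matchDocs("luc"));
    }
}
```

In this sketch, the expansion form grows with the number of matching terms, while the document-set form stays a single structure but gives no per-term information to anything downstream that wants to know which terms matched.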
