November 9, 2009
We were all so caught up in the fun at ApacheCon that no one announced the Lucene 2.9.1 release. It’s out, and it’s highly recommended if you are currently on 2.9.0. Check it out: http://lucene.apache.org/java/docs/#6+November+2009+-+Lucene+Java+2.9.1+available
To learn more about what’s new in the Lucene 2.9.1 release, check out these resources:
Read more...
November 9, 2009
Solr 1.4 won’t officially be released until tomorrow, when the announcement goes out, but the official dist has already found its way onto some of the mirrors. Try your luck if you’re antsy.
Read more...
October 25, 2009
Lucene 2.9.1 should hit the streets this week. 2.9.1 will be a bug fix release and includes a number of important fixes to the Lucene 2.9 release. You can see a list of the current 2.9.1 issues in JIRA here: JIRA 2.9.1 Issues. A couple of these bugs are quite nasty, so this is a highly recommended upgrade.
Why Lucene 2.9.1 rather than simply releasing the almost finished Lucene 3.0 with these bug fixes? After all,…
Read more...
September 22, 2009
*edit* Sorry – I jumped the gun with my original test code here – you need to close the IndexWriter after the optimize! The gains are only with multi-segment indexes. Corrected entry follows:
Let’s do a little test. We will load up a FieldCache with 5,000,000 unique strings and see how long it takes in Lucene 2.4 compared to Lucene 2.9.
Let’s use my quad core laptop and the following test code:
public class ContrivedFCTest extends TestCase {
public void testLoadTime() throws Exception {
Directory dir = FSDirectory.getDirectory(System.getProperty("java.io.tmpdir") + File.separator + "test");
IndexWriter writer = new IndexWriter…
Read more...
September 21, 2009
It took a couple more RCs than I guessed (5 total), but the final vote candidate is up, and unless something critical is found during the 3-day vote process, Lucene 2.9, almost a year in the making, will be available by the end of the week.
http://search.lucidimagination.com/search/document/f15d32710b70ca6b/vote_release_lucene_2_9_0
Read more...
September 19, 2009
I’m working on a Garbage Collection article – I figured I’d share an early rough draft:
It’s not often the case, but sometimes when working with a large and busy Solr/Lucene installation, Garbage Collection becomes a bottleneck. This guide is meant to help you relieve that bottleneck should it arise.
Garbage collection in Java is the process of freeing the memory used by objects that are no longer in use. In C or C++ you would be…
Read more...
September 6, 2009
The third release candidate for Lucene 2.9 is about to hit and the final release is likely to be only days behind. Almost one year in the making, Lucene 2.9 is feature packed and progressively faster. With Solr 1.4 planning to release very shortly after 2.9, things are shaping up very nicely in Lucene land. Congrats to all the devs involved in both releases – I really think this is the culmination of some really…
Read more...
July 18, 2009
SpanQuerys allow for nested, positional restrictions when matching documents in Lucene. SpanQuerys are much like PhraseQuerys or MultiPhraseQuerys in that they all restrict term matches by position, but SpanQuerys can be much more expressive.
The basic SpanQuery units are the SpanTermQuery and the SpanNearQuery.
A SpanTermQuery is the most basic SpanQuery, and simply lets you specify a field, term, and boost by passing in a Term, just like a TermQuery. SpanTermQuery is used as a basic building…
Read more...
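To make the idea of positional matching from the SpanQuery excerpt above concrete, here is a minimal sketch in plain Java. It does not use Lucene: the class `SpanSketch` and its methods `positions` and `nearInOrder` are hypothetical names invented for illustration, and the in-order, gap-counting rule is an assumption of this toy model rather than a statement of how SpanNearQuery is implemented.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of positional matching; NOT Lucene's API.
public class SpanSketch {

    // Positions at which `term` occurs in the token stream
    // (a rough stand-in for what a single-term span matches).
    public static List<Integer> positions(String[] tokens, String term) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            if (tokens[i].equals(term)) {
                out.add(i);
            }
        }
        return out;
    }

    // True if `first` is followed by `second` with at most `slop`
    // other tokens between them (assumed in-order semantics).
    public static boolean nearInOrder(String[] tokens, String first, String second, int slop) {
        for (int p : positions(tokens, first)) {
            for (int q : positions(tokens, second)) {
                int gap = q - p - 1; // number of tokens between the two terms
                if (gap >= 0 && gap <= slop) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] doc = "the quick brown fox jumps over the lazy dog".split(" ");
        System.out.println(nearInOrder(doc, "quick", "fox", 1)); // one token in between
        System.out.println(nearInOrder(doc, "quick", "fox", 0)); // no gap allowed
        System.out.println(nearInOrder(doc, "fox", "quick", 5)); // wrong order
    }
}
```

The single-term lookup plays the role of the basic building block, and the near check combines two of them with a position constraint, which is the nesting idea the excerpt describes.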
June 8, 2009
Solr 1.3 and 1.4 moved away from using BooleanQuery expansion for MultiTerm queries and toward a ConstantScoreQuery method. In Lucene, a MultiTerm query is a query that expands to match multiple terms based on a given input. Common MultiTerm queries are wildcard, fuzzy, prefix, and range queries. Originally, Lucene supported these MultiTerm queries with an implementation that enumerated the matched terms and then added each as a clause to a BooleanQuery. This is a common…
Read more...
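The two strategies mentioned in the MultiTerm excerpt above can be contrasted with a small sketch in plain Java. It does not use Lucene: the class `MultiTermSketch`, the toy index, and the methods `expandPrefix` and `matchPrefix` are all hypothetical, and the sketch ignores scoring and any clause limits that a real rewrite has to deal with.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy contrast of two ways to evaluate a prefix query; NOT Lucene's API.
public class MultiTermSketch {

    // Toy inverted index: term -> ids of the documents containing it.
    public static TreeMap<String, int[]> sampleIndex() {
        TreeMap<String, int[]> index = new TreeMap<>();
        index.put("apache", new int[] {0, 2});
        index.put("apple", new int[] {1});
        index.put("application", new int[] {2, 3});
        index.put("banana", new int[] {4});
        return index;
    }

    // Expansion style: enumerate the matching terms and keep one
    // "clause" per term, so the query grows with the term count.
    public static List<String> expandPrefix(TreeMap<String, int[]> index, String prefix) {
        List<String> clauses = new ArrayList<>();
        for (String term : index.tailMap(prefix, true).keySet()) {
            if (!term.startsWith(prefix)) {
                break; // terms are sorted, so we are past the prefix range
            }
            clauses.add(term);
        }
        return clauses;
    }

    // Constant-score style: enumerate the same terms, but only mark the
    // matching documents in a bit set; every hit is treated the same.
    public static BitSet matchPrefix(TreeMap<String, int[]> index, String prefix) {
        BitSet docs = new BitSet();
        for (Map.Entry<String, int[]> e : index.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break;
            }
            for (int doc : e.getValue()) {
                docs.set(doc);
            }
        }
        return docs;
    }

    public static void main(String[] args) {
        TreeMap<String, int[]> index = sampleIndex();
        System.out.println(expandPrefix(index, "ap")); // list of expanded terms
        System.out.println(matchPrefix(index, "app")); // set of matching doc ids
    }
}
```

In this toy model the expansion result grows with the number of matching terms, while the bit set stays bounded by the number of documents, which is one way to picture why a constant-score approach can be attractive for broad wildcard, prefix, or range queries.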