Open Source Escrow to the Rescue

Do you remember this scenario from days of yore?

  • Company A buys a software license from Company B, a startup.
  • Company A crosses its fingers that Company B doesn’t go bankrupt and disappear, along with the source code for Company A’s mission-critical software.
  • Company B goes kaput.
  • Company A is left with some machine-readable binary code that it is powerless to develop or use.

Source code escrow has changed the outcome of this…


ASF Interview with Apache Lucene creator Doug Cutting


Thoughts on Efficiency of Enterprise Search on eWeek.com

eWeek.com recently posted a nice article by Dr. Yves Schabes, founder of Teragram, on how to make enterprise search better through higher-order processing techniques such as metadata generation and applying taxonomies, and through regular relevance testing. Naturally, this got me thinking about the ways this relates to the Apache Lucene ecosystem (Lucene, Solr, Mahout, Tika, etc.) and Lucid Imagination.

First, by choosing…


Ranges over Functions in Solr 1.4

Solr 1.4 contains a new feature that allows range queries or range filters over arbitrary functions. It’s implemented as a standard Solr QParser plugin, so it is available anywhere the standard Solr query syntax is accepted, simply by specifying the frange query type. Here’s an example of a filter specifying the lower and upper bounds for a function:

fq={!frange l=0 u=2.2}log(sum(user_ranking,editor_ranking))
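
To make the filter above a little more concrete, here is a minimal Python sketch that builds the same fq value and mimics, on the client side, which documents it would keep. The helper names (frange_filter, matches), the localhost URL and the q=*:* query are my own assumptions for illustration, not part of Solr. The sketch also assumes that Solr's log function is base 10 and that both bounds are inclusive, which is the frange default.

```python
import math
from urllib.parse import urlencode


def frange_filter(func, lower=None, upper=None):
    """Build an fq value such as {!frange l=0 u=2.2}log(...).

    Hypothetical helper: it only formats the local params l and u.
    """
    parts = ["!frange"]
    if lower is not None:
        parts.append(f"l={lower}")
    if upper is not None:
        parts.append(f"u={upper}")
    return "{" + " ".join(parts) + "}" + func


def matches(user_ranking, editor_ranking, lower=0.0, upper=2.2):
    """Client-side imitation of the filter, for illustration only.

    Assumes log is base 10 and that both bounds are inclusive.
    """
    total = user_ranking + editor_ranking
    if total <= 0:
        return False  # the logarithm is undefined here, so treat as no match
    return lower <= math.log10(total) <= upper


fq = frange_filter("log(sum(user_ranking,editor_ranking))", lower=0, upper=2.2)

# Assumed local Solr instance; adjust host and path for a real deployment.
url = "http://localhost:8983/solr/select?" + urlencode({"q": "*:*", "fq": fq})
print(fq)
print(url)
```

Note that urlencode takes care of escaping the braces and the space inside the local params, which is easy to get wrong when assembling the URL by hand.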

The other interesting use for frange is to trade off…


PyLucene 2.4.1 Released

Here’s the announcement from the PyLucene team:
This is a refresher release of Apache PyLucene 2.4.1 that addresses a few bugs and annoyances:

http://svn.apache.org/repos/asf/lucene/pylucene/tags/pylucene_2_4_1/CHANGES

http://svn.apache.org/repos/asf/lucene/pylucene/tags/pylucene_2_4_1/jcc/CHANGES

Apache PyLucene 2.4.1 is available from the following download page:
http://www.apache.org/dyn/closer.cgi/lucene/pylucene/pylucene-2.4.1-2-src.tar.gz

When downloading from a mirror site, please remember to verify the downloads using signatures found on the Apache site:
http://www.apache.org/dist/lucene/pylucene/KEYS

For more information on Apache PyLucene, visit the project home page:
http://lucene.apache.org/pylucene


Exploring Lucene and Solr’s TrieRange Capabilities

Recently, Uwe Schindler and others have added a new capability to Lucene and Solr to make working with numeric ranges a lot faster. I haven’t tried out this new functionality yet, so I thought I would walk through it here and explore its capabilities.

Since Lucene treats most everything as Strings, encoding numbers and dates and then utilizing them in ranges has always required a little extra work to make it perform well. Previously, one…
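
As a quick illustration of why numbers stored as Strings need that extra work, here is a small Python sketch. It is not Lucene code and not the TrieRange encoding itself; it only shows, using plain string comparison and zero padding of non-negative integers (an encoding chosen here for illustration), how lexicographic order can disagree with numeric order.

```python
def pad(n, width=10):
    """Zero-pad a non-negative integer so string order matches numeric order."""
    return str(n).zfill(width)


def in_string_range(term, low, high):
    """Inclusive range check using plain string (lexicographic) comparison."""
    return low <= term <= high


values = [5, 40, 300]

# Without padding, strings sort character by character, not by magnitude.
plain = sorted(str(v) for v in values)

# With padding, the string order agrees with the numeric order.
padded = sorted(pad(v) for v in values)

# The unpadded term "5" wrongly falls inside the string range "10".."50".
wrong_hit = in_string_range("5", "10", "50")

# Once every term and bound is padded, 5 is correctly excluded from 10..50.
right_miss = in_string_range(pad(5), pad(10), pad(50))

print(plain, padded, wrong_hit, right_miss)
```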


The Next Lucene Release

It looks like the next release of Lucene is going to be 2.4.1, a bug fix release. The Lucene release ‘animal’ has raised its head over the previous months on two occasions, once eyeing a 2.4.1 release, then refocusing on a 2.9 release. Time has seen 2.4.1 land a few more bugs than we had thought, so it looks like 2.4.1 is in the final wrap-up stages and 2.9 will come next.

2.9 will likely…


Exploring Query Parsers

There are a surprising number of query parser options in the Lucene/Solr world – not something I realized very quickly in my early Lucene days. I thought I might highlight a few of the options out there.
