ASF Interview with Apache Lucene creator Doug Cutting

Read more...

Trends: Know your relevance

“Control, exploration, flexibility, tunability,” were the answers expounded by representatives of Microsoft, Endeca, and Vivisimo. Relevance is in the eye of the beholder, but relevance ranking is driven by the search engine. Know what criteria are driving the ranking of the results you’re looking at, or at least, be skeptical of them.

via Trends: Know your relevance.

I couldn’t agree more with Theresa Regli’s excellent discussion of relevance, especially the point to be “skeptical”…

Read more...

Training: Up and to the right

As the Great Recession tests all of our economic patience, many people I know, myself included, have gotten into the habit of looking at graphs of economic indicators. Stock prices, petroleum, unemployment, store closings, health care costs: it's usually not good news these days, particularly if you look back at some deep historical horizon, say, since last November. Lots of valleys and plains, with the peaks retreating in the distance.

Then I saw this little

Read more...

The SpanQuery

SpanQuerys allow for nested, positional restrictions when matching documents in Lucene. SpanQuerys are much like PhraseQuerys or MultiPhraseQuerys in that they all restrict term matches by position, but SpanQuerys can be much more expressive.

The basic SpanQuery units are the SpanTermQuery and the SpanNearQuery.

A SpanTermQuery is the most basic SpanQuery, and simply lets you specify a field, term, and boost by passing in a Term, just like a TermQuery. SpanTermQuery is used as a…
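To make the idea of nested, positional matching concrete, here is a minimal toy model in plain Python. It is not Lucene's implementation or API: the function names `span_term` and `span_near`, the `(start, end)` span tuples, and the way slop is counted (the number of positions between two spans) are assumptions made for illustration only.

```python
# Toy model of span matching over a single tokenized field.
# Hypothetical helpers for illustration; this is not Lucene code.

def span_term(tokens, term):
    """Return one (start, end) span per occurrence of term; end is exclusive."""
    return [(i, i + 1) for i, tok in enumerate(tokens) if tok == term]

def span_near(first, second, slop, in_order=True):
    """Combine two span lists, keeping pairs separated by at most `slop` positions.

    Each match is a new, wider span, so the result can itself be passed
    to span_near again. That is what makes the restrictions nestable.
    """
    matches = []
    for a in first:
        for b in second:
            if a[1] <= b[0]:
                gap = b[0] - a[1]          # a comes before b
            elif not in_order and b[1] <= a[0]:
                gap = a[0] - b[1]          # b comes before a, allowed when unordered
            else:
                continue                   # overlapping or wrong order
            if gap <= slop:
                matches.append((min(a[0], b[0]), max(a[1], b[1])))
    return sorted(matches)

tokens = "the quick brown fox jumps over the lazy dog".split()

quick = span_term(tokens, "quick")
fox = span_term(tokens, "fox")
dog = span_term(tokens, "dog")

# "quick" followed by "fox" with at most one position in between.
quick_fox = span_near(quick, fox, slop=1)

# Nesting: the (quick ... fox) span followed by "dog" within four positions.
quick_fox_dog = span_near(quick_fox, dog, slop=4)
```

In this sketch, `span_term` plays the role of the basic term-level unit and `span_near` the role of the proximity unit; because `span_near` returns spans of the same shape it consumes, its output can be nested inside another `span_near`, which a flat phrase match cannot express.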

Read more...

Thoughts on Efficiency of Enterprise Search on eWeek.com

eWeek.com recently posted a nice article by Dr. Yves Schabes, founder of Teragram, on how to make enterprise search better through higher-order processing techniques such as metadata generation and applying taxonomies, and through relevance testing on a regular basis. Naturally, this got me thinking about all the different ways this relates to the Apache Lucene ecosystem (Lucene, Solr, Mahout, Tika, etc.) and Lucid Imagination.

First, by choosing…

Read more...

NYC Apache Lucene/Solr Meetup, Sponsored by Lucid Imagination and MTV Networks

Wednesday, July 22, 2009, 6:30 pm to 9:00 pm Eastern. Register here.
Hosted at MTV Networks Flagship Building
1515 Broadway, Times Square, New York, NY 10036
RSVP deadline: July 20, 2009 12:00 PM
Presentations and discussion of innovations and applications with Lucene & Solr, the Apache Open Source Search Engine/Platform for the NYC Area. Now

Read more...

Ranges over Functions in Solr 1.4

Solr 1.4 contains a new feature that allows range queries or range filters over arbitrary functions. It’s implemented as a standard Solr QParser plugin, and thus easily available for use any place that accepts the standard Solr Query Syntax by specifying the frange query type. Here’s an example of a filter specifying the lower and upper bounds for a function:

fq={!frange l=0 u=2.2}log(sum(user_ranking,editor_ranking))
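As a rough client-side sketch of the filter above, the Python below builds the same `fq` value, URL-encodes it for a request, and mimics the range check on an in-memory document. The helper names `frange_filter` and `matches` are hypothetical, and two points are assumptions rather than facts stated in the post: that Solr's `log` function is base 10, and that both bounds are inclusive.

```python
import math
from urllib.parse import urlencode

def frange_filter(func, lower=None, upper=None):
    """Build an fq value for the frange query type using its l and u local params."""
    parts = ["!frange"]
    if lower is not None:
        parts.append(f"l={lower}")
    if upper is not None:
        parts.append(f"u={upper}")
    return "{" + " ".join(parts) + "}" + func

def matches(doc, lower=0.0, upper=2.2):
    """Mimic the example filter locally.

    Assumes log is base 10 and that both bounds are inclusive.
    """
    total = doc["user_ranking"] + doc["editor_ranking"]
    if total <= 0:
        return False  # log is undefined here; treating it as no match is a simplification
    return lower <= math.log10(total) <= upper

# Reproduces the filter string from the example above.
fq = frange_filter("log(sum(user_ranking,editor_ranking))", 0, 2.2)

# URL-encode it alongside a match-all main query for use in a request.
query_string = urlencode({"q": "*:*", "fq": fq})

# A document whose rankings sum to 100 passes (log10(100) = 2.0),
# while one whose rankings sum to 200 exceeds the upper bound.
keep = matches({"user_ranking": 50, "editor_ranking": 50})
drop = matches({"user_ranking": 100, "editor_ranking": 100})
```

The local params contain braces, spaces, and an exclamation mark, so encoding the value before placing it in a URL avoids problems with HTTP clients that do not escape it automatically.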

The other interesting use for frange is to trade off…

Read more...

Virtual words, real data

As virtualization and cloud computing buzz louder, Lucene/Solr open source search is adding a vibe of its own — most recently, with our announcement of our strategic partnership with ISYS technologies. A couple of weeks ago, Business Week wrote up how cloud computing will change business; and in between discussions of VMWare and Amazon’s EC2, tucked in a reference to Xoopit, “[a startup that] has built a specialized search engine…

Read more...