July 23, 2009

“Control, exploration, flexibility, tunability,” were the answers expounded by representatives of Microsoft, Endeca, and Vivisimo. Relevance is in the eye of the beholder, but relevance ranking is driven by the search engine. Know what criteria are driving the ranking of the results you’re looking at, or at least, be skeptical of them.
I couldn’t agree more with Theresa Regli’s excellent discussion of relevance, especially the point to be “skeptical” of why results…
July 22, 2009
As the Great Recession tests all of our economic patience, many people I know, myself included, have gotten into the habit of looking at graphs of economic indicators. Stock prices, petroleum, unemployment, store closings, health care costs: it’s usually not good news these days. Particularly if you look back at some deep historical horizon, say, since last November. Lots of valleys and plains, with the peaks retreating in the distance.
Then I saw this little gem which Matt…
July 18, 2009
SpanQuerys allow for nested, positional restrictions when matching documents in Lucene. SpanQuerys are much like PhraseQuerys or MultiPhraseQuerys in that they all restrict term matches by position, but SpanQuerys can be much more expressive.
The basic SpanQuery units are the SpanTermQuery and the SpanNearQuery.
A SpanTermQuery is the most basic SpanQuery, and simply lets you specify a field, term, and boost by passing in a Term, just like a TermQuery. SpanTermQuery is used as a basic building…
July 16, 2009
eWeek.com recently posted a nice article by Dr. Yves Schabes, founder of Teragram, on how to make enterprise search better through some higher order processing techniques like metadata generation, applying taxonomies, etc. and doing relevance testing on a regular basis. Naturally, this got me thinking about all the different ways this relates to the Apache Lucene ecosystem (Lucene, Solr, Mahout, Tika, etc.) and Lucid Imagination.
First, by choosing an open backbone like Lucene and Solr, you are free…
July 7, 2009
Wednesday, 22 July 2009, 18:30 to 21:00
Agenda:
1. “Faster. Better. Solr! What to look for in Solr 1.4” – Solr creator Yonik Seeley, Lucid Imagination
2. “How fast is it? Assessing…
July 6, 2009
Solr 1.4 contains a new feature that allows range queries or range filters over arbitrary functions. It’s implemented as a standard Solr QParser plugin, and thus easily available for use any place that accepts the standard Solr Query Syntax by specifying the frange query type. Here’s an example of a filter specifying the lower and upper bounds for a function:
fq={!frange l=0 u=2.2}log(sum(user_ranking,editor_ranking))
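For readers who send Solr requests from code, here is a minimal sketch of how a filter like the one above could be assembled and URL-encoded. The helper names `frange_filter` and `select_query_string` are hypothetical, invented for this illustration, and the `q=*:*` main query is an assumption; only the `{!frange l=… u=…}` form itself comes from the example above.

```python
from urllib.parse import urlencode


def frange_filter(func, lower=None, upper=None):
    """Build an fq value of the form {!frange l=... u=...}func.

    Hypothetical helper: `l` and `u` are the lower and upper bounds
    shown in the example above; either bound may be left out.
    """
    local_params = ["!frange"]
    if lower is not None:
        local_params.append(f"l={lower}")
    if upper is not None:
        local_params.append(f"u={upper}")
    return "{" + " ".join(local_params) + "}" + func


def select_query_string(q, fq):
    """URL-encode the main query and the filter query as request parameters."""
    return urlencode({"q": q, "fq": fq})


# Same filter as the example above: keep documents whose function value
# log(sum(user_ranking,editor_ranking)) falls between 0 and 2.2.
fq = frange_filter("log(sum(user_ranking,editor_ranking))", lower=0, upper=2.2)

# Assumption: a match-all main query, so the frange filter does the selecting.
query_string = select_query_string("*:*", fq)
print(fq)
print(query_string)
```

The encoding step is the part that is easy to get wrong by hand, since the braces, spaces, and parentheses in the local-params syntax all need escaping before the filter goes into a URL.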
The other interesting use for frange is to trade off memory for speed when doing…
July 1, 2009
As virtualization and cloud computing buzz louder, Lucene/Solr open source search is adding a vibe of its own — most recently, with our announcement of our strategic partnership with ISYS technologies. A couple of weeks ago, Business Week wrote up how cloud computing will change business; and in between discussions of VMWare and Amazon’s EC2, tucked in a reference to Xoopit, “[a startup that] has built a specialized search engine capable of finding bits of information…