Getting Started: Spell Checking with Apache Lucene and Solr

Introduction

Recently, I did some minor work on improving the usability of the Lucene spell checker (see LUCENE-2479, LUCENE-2608 and the associated Solr work) and it got me thinking that a post on spell checking in Solr would be useful.

For those who aren’t familiar, the notion of spell checking in search (often called Did You Mean?) is slightly different from the notion of simply correcting spelling errors.  It’s not that we don’t…

Brazil Embraces Open Source

Lately, Brazil has been getting a lot of attention (host to the next World Cup in 2014 and site of the first Olympics to be held in South America in 2016), but it has also gotten attention for its embrace of open source technologies. It was my pleasure to speak at an event organized for business executives by our partner in Brazil, Primeware. The topic? Open source enterprise search software, of course.

I talk about my…

Open Source Escrow to the Rescue

Do you remember this scenario from days of yore?

  • Company A buys a software license from Company B, a startup.
  • Company A crosses its fingers that Company B doesn’t go bankrupt and disappear, along with the source code for Company A’s mission-critical software.
  • Company B goes kaput.
  • Company A is left with some machine-readable binary code that it is powerless to develop or use.

Source code escrow has changed the outcome of this…

Next Triangle HUG Meeting: Hadoop with Lucene and Solr

Tuesday, 14 September 2010
13:00

Just a quick announcement that I will be speaking on using Hadoop with Lucene and Solr at the next Triangle Hadoop Users Group on Sept. 14.  So, if you are in the Raleigh/Durham/Chapel Hill area on that night, please stop in.  For more information and to RSVP, see Triangle Hadoop Users Group.

Cheers,

Grant

SF Bay Area July Lucene Meetup Highlights

If you missed the SF Bay Area Lucene meetup last night, I thought I would give a recap of some of the highlights.  First off, thanks to salesforce.com for the use of their space on the 42nd floor of 1 Market St. in downtown S.F.  The views of the bay and the city were especially stunning at night with what appeared to be a full moon rising over the Bay Bridge.  Salesforce…

[UPDATE] Spatial Search in Apache Lucene and Solr

One of the most frequent things I get asked is “what is the state of spatial in Lucene and Solr?”  So here is my answer as of today:

  1. I just committed SOLR-1568 the other day, which adds automatic filter generation to the various point-based Field Types in Solr.  It also has some small refactoring in the underlying Lucene code.  Furthermore, it adds a new LatLonType which can be used to represent latitude/longitude

The Open Source Legal Maze

As some of you may know, I blog regularly on Network World’s Open Source Subnet. Watch weekly for more of my musings on trends, news and any number of topics that catch my interest. In my most recent post, I ask readers for their take on the legal maze associated with open source. In my opinion, Apache is the most liberal open source package today, the one most true to form. Everybody can use it,…

RTP Semantic Web Slides are available

Here are my slides from the talk I gave last night at the RTP Semantic Web Group:

Berlin Buzzwords Recap

Back from Berlin Buzzwords and finally over the jet lag, so I thought I would put up some feedback.  First off, it was a well-organized conference with a nice focus on searching, storage and scaling.  Kudos to Isabel, Simon and Jan for all their hard work.  It also had great wi-fi coverage, which is always a struggle at every conference I’ve ever been to.

As for the talks, I gave the Keynote on…

Lucid Imagination Performance Portal

More often than not, our conversations with customers and the community here at Lucid Imagination revolve around relevancy and performance. It seems to me that these are the two hottest topics for searchers: speed and quality of search.

With this post, I have the pleasure of announcing the availability of New Relic RPM integrated into the Lucid Imagination performance portal.  This is great news for people who care about Solr or Lucene performance.

What is