Over the summer I served as a Google Summer of Code mentor for David Nemeskey, a PhD student at Eötvös Loránd University. David proposed to improve Lucene’s scoring architecture and to implement some state-of-the-art ranking models with the new framework.
These improvements are now committed to Lucene’s trunk: you can use these models in tandem with all of Lucene’s features (boosts, slops, explanations, etc.) and queries (term, phrase, spans, etc.). A JIRA issue has been created …
Read more
You’re using Solr, or some other Lucene-based search solution, … or you should be, and will be! You are (or will be) building your solutions on top of a top-notch search library, Apache Lucene.
Solr makes using Lucene easier – you can index a variety of data sources easily, pretty much out of the box, and you can easily integrate features such as faceting, highlighting, and spellchecking – all without writing Java code. And if that’s …
Read more
Monday, 15 August 2011, 18:00 to 21:00
If you’re in central VA, or even in the northern VA / DC area, come join us for the inaugural “Charlottesville Solr and Lucene Meetup”. Charlottesville is home to the co-authors of Manning’s “Lucene in Action” and Packt’s “Solr 1.4 Enterprise Search Server” books. This area is a hotbed of search activity thanks to NGIC and DIA calling Charlottesville home, and the many gov’t subcontractors …
Read more
Tuesday, 12 July 2011 to Friday, 15 July 2011
I had the honor and pleasure of being invited to speak at Überconf last week in the Denver, CO area.
The annual conference is organized by Jay Zimmerman of No Fluff, Just Stuff fame. Überconf has the same top-notch quality, at a grander scale – 10 concurrent tracks (whoa!), full-day pre-conference trainings (mobile, anyone?), food (full breakfast! that’s a REAL hearty bonus!), and …
Read more
One of the singular qualities of search technology is its breadth: if it’s been written down (albeit digitally), you can search it, and if you can search it, you can build a search app for it. That’s part of what makes Solr/Lucene so alluring for application development — you can build it to search just about anything, for anyone, in any way. Inspiring breadth, however, can be pretty daunting to master.
How, then, can you …
Read more
It’s official: Apache Lucene 3.1.0 and Apache Solr 3.1.0 have been released. Keep an eye here for more on the new features and functionality.
Here are the release announcements, as just sent to the mailing lists:
March 2011, Apache Lucene 3.1 available
The Lucene PMC is pleased to announce the release of Apache Lucene 3.1.
This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate …
Read more
Changing Bits: Lucene’s FuzzyQuery is 100 times faster in 4.0.
So cool… I’m in awe daily of what happens in Lucene and Solr open source. Mike’s post is just a small example of what goes on. Perhaps Mike or Muir or someone will write up how Lucene has improved its unit testing by several orders of magnitude through some incredibly cool randomization techniques and the use of Jenkins/Hudson.…
Read more
Lucene’s default ranking function uses factors such as tf, idf, and norm to help calculate relevancy scores.
Solr has now exposed these factors as function queries.
- docfreq(field,term) returns the number of documents that contain the term in the field.
- termfreq(field,term) returns the number of times the term appears in the field for that document.
- idf(field,term) returns the inverse document frequency for the given term, using the Similarity for the field.
- tf(field,term) returns the
…
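To make the first three functions concrete, here is a minimal Python sketch that computes the same quantities over a toy in-memory corpus. It is not Solr or Lucene code: the corpus, the single implied field, and the helper signatures are hypothetical, and the idf formula (1 + ln(numDocs / (docFreq + 1))) is an assumption based on Lucene’s classic default Similarity; a field configured with a different Similarity would produce different values.

```python
import math

# Toy corpus (hypothetical): each entry is the token list of one document's field.
docs = [
    ["lucene", "search", "library"],
    ["solr", "search", "server", "search"],
    ["solr", "faceting"],
]

def docfreq(term):
    """Number of documents whose field contains the term."""
    return sum(1 for tokens in docs if term in tokens)

def termfreq(doc_id, term):
    """Number of times the term appears in the field of one document."""
    return docs[doc_id].count(term)

def idf(term):
    """Inverse document frequency.

    Assumed formula (classic default Similarity):
    1 + ln(numDocs / (docFreq + 1)).
    """
    return 1.0 + math.log(len(docs) / (docfreq(term) + 1))

print(docfreq("search"))                # 2
print(termfreq(1, "search"))            # 2
print(idf("faceting") > idf("search"))  # True: rarer terms get a higher idf
```

Note that docfreq and idf depend only on the term and the index as a whole, while termfreq is evaluated per document, which is why the sketch takes a document id for it.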
Read more
While we make some of our money from professional services and support for Apache Lucene and Solr, I thought I would pass along a few freebies when it comes to improving your Lucene or Solr application. These are things that we usually end up telling most clients at some stage of the game. Many of them fall under the “broken windows” theory of software development, so don’t expect anything too earth-shattering.…
Read more
For those of you in the Raleigh, Durham, Chapel Hill (aka RTP) area, there are a couple of upcoming events on Lucene/Solr that may be of interest:
Tomorrow, Feb. 11, I will be giving a talk at UNC-Chapel Hill on Apache Lucene and Solr. The talk is open to the public. You can read more about the talk at:
Introduction to Open Source Search with Apache Lucene and Solr – CRADLE Seminar | sils.unc.edu.…
Read more