Apache Lucene 2.9.2 and 3.0.1 Released

Here’s the announcement:

Hello Lucene users,

On behalf of the Lucene development community I would like to announce the release of Lucene Java versions 3.0.1 and 2.9.2:

Both releases fix bugs in the previous versions:

- 2.9.2 is a bugfix release for the Lucene Java 2.x series, based on Java 1.4.
- 3.0.1 has the same bugfix level but is for the Lucene Java 3.x series, based on Java 5.

New users of Lucene are advised to use version 3.0.1…


Intro to Mahout at Triangle Java User Group on Feb. 15

Monday, 15 February 2010
18:00 to 21:00

I will be giving an introduction to Apache Mahout at the Triangle Java User Group on Feb. 15.  See http://trijug.org/ for more details.  Hope to see you there!


Apache Lucene Connector Framework now in Incubation at the ASF

Short Version

The Apache Lucene Connector Framework project has officially entered incubation.  LCF, for short, will be a framework for connecting to content repositories such as SharePoint and Documentum. It will make it easy to hook into Lucene, Solr, Nutch, Mahout, and Tika while, of course, remaining agnostic of the final destination of the data.  See the Connectors website and the original proposal for more info.  Help wanted!

Long Version

Background

A while back, MetaCarta, a spatial search company, approached us…


The Apache Lucene Ecosystem: My view of 2009

It’s that time of year, so I thought I would take a look back at the year that was for the Lucene Ecosystem and maybe look ahead just a little bit too.

First and foremost, it should be obvious to even the most casual observer that the Apache Lucene communities are thriving.  Not only is it a great time to be involved in open source, it’s a great time to be involved in Lucene.  Both as a…


Apache Solr 1.5 on the move with more “functionality”

The paint is barely dry on Apache Solr 1.4 and the community is already on the move for Solr 1.5 (which may actually be Solr 2.0, but for now let’s call it 1.5).

I’m particularly excited about a few things:

  1. Massive scalability capabilities via distributed search, indexing and shard management – Up until now, Solr has scaled pretty well on the search side (I’ve seen billion+ document instances and we’ve benchmarked it at that level too), but the work…


Fun with Solr Functions

For a long time now, Solr has had a good chunk of functions available for use to boost relevance based on the content of a field, but I’ve always been on the user side of them and never on the writing side.  At least, that is, until recently.  This week I have been putting the finishing touches on an article on using Lucene and Solr for spatial search.  As part of the article, I had a…


Apache Mahout 0.2 Released

I just sent out the Apache Mahout 0.2 release announcement.  Here’s a copy:

Apache Mahout 0.2 has been released and is now available for public
download at http://www.apache.org/dyn/closer.cgi/lucene/mahout

Apache Mahout is a subproject of Apache Lucene with the goal
of delivering scalable machine learning algorithm implementations
under the Apache license. http://www.apache.org/licenses/LICENSE-2.0
Scale in terms of computation to the size of data you manage today.
Scale in terms of community to support anyone interested in using
machine learning. Scale in terms of business by providing…


Apache Solr 1.4 is officially released

After many months of hard work, Solr 1.4 is completed and released. To learn more about the 1.4 release, check out:

The Certified Distribution of Solr 1.4, the fully tested version with lots of useful tools and 30 days of free getting-started assistance, will follow shortly. Click here to be informed when…


Come to the Lucene Meetup at ApacheCon in Oakland!

Tuesday, 3 November 2009
20:00 to 22:00

Come visit us at the Lucene Meetup at ApacheCon on Tuesday 11/3 from 8-10pm. All are welcome to come – there is no cost for this event. Come meet many of the key contributors to Lucene and Solr. Sponsored by Lucid Imagination.

Location: Marriott Oakland City Center, Rooms 1&2

For more information about the meetup, visit
http://wiki.apache.org/lucene-java/LuceneAtApacheConUs2009


Posting Rich Documents to Apache Solr using SolrJ and Solr Cell (Apache Tika)

Solr Cell, a new feature in the soon-to-be-released Solr 1.4, allows users to send rich documents such as MS Word and Adobe PDF files directly to Solr and have them indexed for search.  All of the examples on the Solr Cell wiki page, however, only demonstrate how to send in the documents using the curl command line utility, while many Solr users rely on SolrJ, Solr’s Java-based client.  Thus, I thought I…
