<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Lucid Imagination &#187; apache</title>
	<atom:link href="http://www.lucidimagination.com/blog/tag/apache/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.lucidimagination.com/blog</link>
	<description>Exclusively dedicated to Apache Lucene/Solr open source search technology</description>
	<lastBuildDate>Sat, 04 Feb 2012 01:12:03 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3</generator>
		<item>
		<title>Yonik Seeley and Mark Miller to Speak at NYJavaSIG 11/18 6:30PM ET</title>
		<link>http://www.lucidimagination.com/blog/2010/11/10/yonik-seeley-and-mark-miller-to-speak-at-nyjavasig-1118-630pm-et/</link>
		<comments>http://www.lucidimagination.com/blog/2010/11/10/yonik-seeley-and-mark-miller-to-speak-at-nyjavasig-1118-630pm-et/#comments</comments>
		<pubDate>Wed, 10 Nov 2010 18:30:34 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Lucid Imagination]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[apache]]></category>
		<category><![CDATA[cloud]]></category>
		<category><![CDATA[enterprise search]]></category>
		<category><![CDATA[Lucene]]></category>
		<category><![CDATA[lucene revolution]]></category>
		<category><![CDATA[lucid imagination]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[Solr]]></category>
		<category><![CDATA[solr cloud]]></category>

		<guid isPermaLink="false">http://www.lucidimagination.com/blog/?p=2674</guid>
		<description><![CDATA[[ Thursday, 18 November 2010; 15:30 to 18:00. ] <p>We’re pleased to be able to speak at the NYJavaSIG monthly meeting on November 18. Yonik Seeley, the creator of Solr and co-founder of Lucid Imagination, will provide a summary of new developments in Solr, and will talk about how developers can leverage this new functionality. In addition, Lucene/Solr committer Mark Miller will talk about scaling Solr across many servers, the Solr Cloud initiative, Apache ZooKeeper, &#8230;</p>]]></description>
			<content:encoded><![CDATA[[ Thursday, 18 November 2010; 15:30 to 18:00. ] <p>We’re pleased to be able to speak at the NYJavaSIG monthly meeting on November 18. Yonik Seeley, the creator of Solr and co-founder of Lucid Imagination, will provide a summary of new developments in Solr, and will talk about how developers can leverage this new functionality. In addition, Lucene/Solr committer Mark Miller will talk about scaling Solr across many servers, the Solr Cloud initiative, Apache ZooKeeper, and new Solr Cloud features on the horizon.</p>
<p>If you are in the NYC area on November 18, please join us!</p>
<p>To register, please go to: <a href="http://www.javasig.com/meeting/show/35">http://www.javasig.com/meeting/show/35</a></p>
<p>When: November 18, 2010 6:30PM<br />
Where: Credit Suisse &#8211; 11 Madison Avenue, New York, NY 10010</p>
<p><strong>About the NYJavaSIG: </strong>The New York Java Special Interest Group (NYJavaSIG) was founded in 1995 by Frank Greco. Today the NYJavaSIG has well over 6,500 members, and its monthly meetings average over 150 attendees. It hosts meetings for developers from New York, New Jersey and Connecticut, and brings together members of the local Java community through its website and monthly get-togethers to share tips, techniques, knowledge, and experience.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.lucidimagination.com/blog/2010/11/10/yonik-seeley-and-mark-miller-to-speak-at-nyjavasig-1118-630pm-et/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Data.gov on Solr</title>
		<link>http://www.lucidimagination.com/blog/2010/11/05/data-gov-on-solr/</link>
		<comments>http://www.lucidimagination.com/blog/2010/11/05/data-gov-on-solr/#comments</comments>
		<pubDate>Fri, 05 Nov 2010 21:43:44 +0000</pubDate>
		<dc:creator>Erik Hatcher</dc:creator>
				<category><![CDATA[ApacheCon]]></category>
		<category><![CDATA[Events]]></category>
		<category><![CDATA[LucidWorks]]></category>
		<category><![CDATA[Solr]]></category>
		<category><![CDATA[apache]]></category>
		<category><![CDATA[Erik Hatcher]]></category>
		<category><![CDATA[Open Source]]></category>

		<guid isPermaLink="false">http://www.lucidimagination.com/blog/?p=2604</guid>
		<description><![CDATA[<p>At <a href="http://apachecon.com">ApacheCon</a> this week I presented <a href="http://na.apachecon.com/c/acna2010/sessions/571">&#8220;Rapid Prototyping with Solr&#8221;</a>.  This is the third time I&#8217;ve given a presentation with the same title.  In the spirit of the rapid prototyping theme, each time I&#8217;ve created a new prototype just a day or so prior to presenting it.  At <a href="http://lucene-eurocon.org/sessions-track2-day2.html#4">Lucene EuroCon</a> the prototype used attendee data, a treemap visualization, and a cute little Solr-powered &#8220;app&#8221; for picking attendees at random for the conference giveaways.  For &#8230;</p>]]></description>
			<content:encoded><![CDATA[<p>At <a href="http://apachecon.com">ApacheCon</a> this week I presented <a href="http://na.apachecon.com/c/acna2010/sessions/571">&#8220;Rapid Prototyping with Solr&#8221;</a>.  This is the third time I&#8217;ve given a presentation with the same title.  In the spirit of the rapid prototyping theme, each time I&#8217;ve created a new prototype just a day or so prior to presenting it.  At <a href="http://lucene-eurocon.org/sessions-track2-day2.html#4">Lucene EuroCon</a> the prototype used attendee data, a treemap visualization, and a cute little Solr-powered &#8220;app&#8221; for picking attendees at random for the conference giveaways.  For a recent <a href="http://www.lucidimagination.com/blog/2010/06/10/rapid-prototyping-search-applications-with-solr/">Lucid webinar</a> the prototype was more general purpose, bringing in and making searchable rich documents and faceting on file types with a pie chart visualization.</p>
<p>This time around, the data set I chose was <a href="http://www.data.gov/raw/92">Data.gov&#8217;s catalog of datasets</a>, which fit with the ApacheCon open source aura, and Lucid Imagination&#8217;s support of <a href="http://opensourceforamerica.org/awards/2010-recipients">Open Source for America</a>, which helps to advocate for open source in the US Federal Government.  The resulting prototype includes faceted browsing, query term suggest, hit highlighting, result clustering, spell checking, document detail, and a bonus Venn diagram visualization.</p>
<p><span id="more-2604"></span></p>
<p>The prototype was built with these steps:</p>
<ol>
<li>Install LucidWorks for Solr</li>
<li>Grab the Data.gov catalog CSV file</li>
<li>Iterate a bit with Solr&#8217;s CSV update handler (the funnest way to get data into Solr) and a little Solr schema tinkering</li>
<li>Adjust the Solr configuration and UI templates to get a nice look and feel, adding in a document detail page and a Venn diagram visualization comparing query overlaps</li>
</ol>
<p>Voilà (click the images for a larger view):</p>
<table class="plain" style="width: 100%;" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td width="60%"><a href="http://www.lucidimagination.com/blog/wp-content/uploads/2010/11/datagov_search.png"><img class="alignnone size-thumbnail wp-image-2617" title="Data.gov on Solr" src="http://www.lucidimagination.com/blog/wp-content/uploads/2010/11/datagov_search-150x150.png" alt="" width="150" height="150" /></a></td>
<td><a href="http://www.lucidimagination.com/blog/wp-content/uploads/2010/11/datagov_compare.png"><img class="size-thumbnail wp-image-2627" title="query comparison Venn diagram" src="http://www.lucidimagination.com/blog/wp-content/uploads/2010/11/datagov_compare-150x150.png" alt="" width="150" height="150" /></a></td>
</tr>
</tbody>
</table>
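<p>The CSV-loading step in the list above can be sketched roughly as follows. This is a minimal sketch, not the code from the prototype: the host, port, and the helper name <code>csv_update_url</code> are assumptions made for illustration, while <code>/update/csv</code> and the <code>separator</code>, <code>header</code>, and <code>commit</code> parameters belong to Solr&#8217;s CSV update handler in releases of this era. The snippet only builds the request URL and sends nothing.</p>

```python
from urllib.parse import urlencode


def csv_update_url(base_url, separator=",", header=True, commit=True):
    """Build a URL for Solr's CSV update handler (no request is sent)."""
    params = [
        ("separator", separator),                   # field delimiter in the CSV file
        ("header", "true" if header else "false"),  # first line holds field names
        ("commit", "true" if commit else "false"),  # make documents searchable right away
    ]
    return base_url.rstrip("/") + "/update/csv?" + urlencode(params)


# Hypothetical local Solr instance; adjust host, port, and path to your install.
url = csv_update_url("http://localhost:8983/solr")
print(url)
```

<p>The catalog CSV file would then be sent to that URL as the body of an HTTP POST. Re-running the load after each schema tweak is what keeps the iteration loop short.</p>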
<p>This isn&#8217;t the first time we&#8217;ve toyed with Data.gov data&#8230; earlier this year, <a href="../../../../../../blog/2010/05/07/data-mining-data-dot-gov/">Hoss demonstrated Solr&#8217;s stats component</a> on another of Data.gov&#8217;s data sets.</p>
<p>My ApacheCon slides are published at Slideshare and embedded here:</p>
<div id="__ss_5675936" style="width: 425px;"><strong><a title="Rapid prototyping with solr" href="http://www.slideshare.net/erikhatcher/rapid-prototyping-with-solr-5675936">Rapid prototyping with solr</a></strong><object id="__sse5675936" classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="425" height="355" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowScriptAccess" value="always" /><param name="src" value="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=rapidprototypingwithsolr-101105050018-phpapp01&amp;stripped_title=rapid-prototyping-with-solr-5675936&amp;userName=erikhatcher" /><param name="name" value="__sse5675936" /><param name="allowfullscreen" value="true" /><embed id="__sse5675936" type="application/x-shockwave-flash" width="425" height="355" src="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=rapidprototypingwithsolr-101105050018-phpapp01&amp;stripped_title=rapid-prototyping-with-solr-5675936&amp;userName=erikhatcher" name="__sse5675936" allowscriptaccess="always" allowfullscreen="true"></embed></object></div>
<p>All the code and instructions for running the entire prototype yourself can be found here: <a href="https://github.com/erikhatcher/solr-rapid-prototyping/tree/master/ApacheCon2010">https://github.com/erikhatcher/solr-rapid-prototyping/tree/master/ApacheCon2010</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.lucidimagination.com/blog/2010/11/05/data-gov-on-solr/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Summary of first ever RTP (Raleigh/Chapel Hill/Durham) Apache Lucene/Solr Meetup</title>
		<link>http://www.lucidimagination.com/blog/2010/09/29/summary-of-first-ever-rtp-raleighchapel-hilldurham-apache-lucenesolr-meetup/</link>
		<comments>http://www.lucidimagination.com/blog/2010/09/29/summary-of-first-ever-rtp-raleighchapel-hilldurham-apache-lucenesolr-meetup/#comments</comments>
		<pubDate>Wed, 29 Sep 2010 12:55:42 +0000</pubDate>
		<dc:creator>Grant Ingersoll</dc:creator>
				<category><![CDATA[apache]]></category>
		<category><![CDATA[Events]]></category>
		<category><![CDATA[Lucene]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[Solr]]></category>
		<category><![CDATA[auto suggest]]></category>
		<category><![CDATA[faceting]]></category>
		<category><![CDATA[Grant Ingersoll]]></category>
		<category><![CDATA[solr 4.0]]></category>

		<guid isPermaLink="false">http://www.lucidimagination.com/blog/?p=2499</guid>
		<description><![CDATA[<p>A week and a day later, I&#8217;ve finally got a chance to put up my thoughts/notes on the first ever RTP <a href="http://lucene.apache.org">Apache Lucene/Solr</a> Meetup hosted by <a href="http://www.lulu.com/">Lulu Press</a> and co-sponsored by Lucid Imagination.</p>
<p>First off, hats off to Lulu for the excellent hosting, coordination and marketing of the event.  You could definitely see the evidence of Lulu&#8217;s &#8220;Be Remarkable&#8221; philosophy in the event. I&#8217;d say we had roughly 30-40 people for the first-time event, &#8230;</p>]]></description>
			<content:encoded><![CDATA[<p>A week and a day later, I&#8217;ve finally got a chance to put up my thoughts/notes on the first ever RTP <a href="http://lucene.apache.org">Apache Lucene/Solr</a> Meetup hosted by <a href="http://www.lulu.com/">Lulu Press</a> and co-sponsored by Lucid Imagination.</p>
<p>First off, hats off to Lulu for the excellent hosting, coordination and marketing of the event.  You could definitely see the evidence of Lulu&#8217;s &#8220;Be Remarkable&#8221; philosophy in the event. I&#8217;d say we had roughly 30-40 people for the first-time event, with a good mix of developers, technical managers and a few recruiters.  There was even a &#8220;competitor&#8221; from an unnamed proprietary vendor present.  On the application front, a wide mix of uses was represented: ecommerce, publishing, video search, procurement, biopharma, etc.</p>
<p>After some socializing, we kicked off the night with Lulu CEO <a href="http://en.wikipedia.org/wiki/Bob_Young_%28businessman%29">Bob Young</a>, who gave a short intro to Lulu as well as a warm welcome to all.  Next up, I gave a talk (<a href="http://files.meetup.com/1698968/newLuceneSolr-sept2010.pptx">slides</a>) on what&#8217;s coming in Lucene/Solr 3.x and beyond and answered some questions about features and functionality.  After me came Tarun Jain of <a href="http://www.abb.com/">The ABB Group</a>, one of Lucid&#8217;s first customers.  ABB is the world&#8217;s largest producer of industrial robots and a global leader in power and industrial automation, with revenues around $33B USD.  Tarun gave a presentation titled &#8220;Extreme Faceting Using Solr&#8221; (<a href="http://files.meetup.com/1698968/Extreme%20Faceting%20using%20SOLR.ppt">slides</a>) on ABB&#8217;s move from a legacy proprietary vendor to Solr for searching their entire customer-facing (and internal) product catalog (420K SKUs with 20+ million attributes and over 6M hits per month).   After setting the stage about the content to be searched and faceted, Tarun detailed how they went from wanting to &#8220;do everything in the DB&#8221; to doing nearly everything in Solr because it was that easy.  Moreover, slide 8 details the comparison they did between Solr and a very large proprietary search vendor (one of the so-called top 3).  Here are the bullet points:</p>
<ol>
<li>
<div>Stress test results in Proof of concept</div>
<ol>
<li>
<div>SOLR 35 req/sec vs 2 req/sec</div>
</li>
<li>
<div>Average response times 200 ms vs 1-7 secs</div>
</li>
<li>
<div>CPU usage 2-3% vs 100%</div>
</li>
</ol>
</li>
<li>Sadly matchup was not even close (at least for the scenarios we tested for)</li>
<li>
<div><strong>Conclusion .. Performance of SOLR is inversely proportional to the cost</strong></div>
</li>
<li>
<div>Winner – SOLR by a KO</div>
</li>
</ol>
<p>After Tarun&#8217;s talk, Paul Oakes from Lulu gave an excellent technical presentation (<a href="http://files.meetup.com/1698968/Implementing%20Autocomplete%20with%20Solr%20and%20jQuery.ppt">slides</a>) on implementing auto-suggest in Solr using <a href="http://jquery.com/">jQuery</a>.  Just for grins, he also showed how trivial it was to add Google&#8217;s much-hyped &#8220;Instant&#8221; search capability to Solr simply by making an extra jQuery call.  Naturally, the real work behind &#8220;Instant&#8221; is in capacity planning at scale, not in programming a few lines of JavaScript.</p>
<p>As for the RTP meetup in general, I would suspect we will try to meet once a quarter, but maybe more often if the group so desires.</p>
<p>All in all, an excellent night, in my opinion.  Best of all it was a &#8220;home&#8221; event for me, so I didn&#8217;t have to fly anywhere or bum a ride back to a hotel!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.lucidimagination.com/blog/2010/09/29/summary-of-first-ever-rtp-raleighchapel-hilldurham-apache-lucenesolr-meetup/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>ASF Interview with Apache Lucene creator Doug Cutting</title>
		<link>http://www.lucidimagination.com/blog/2009/07/23/asf-interview-with-apache-lucene-creator-doug-cutting/</link>
		<comments>http://www.lucidimagination.com/blog/2009/07/23/asf-interview-with-apache-lucene-creator-doug-cutting/#comments</comments>
		<pubDate>Thu, 23 Jul 2009 23:09:54 +0000</pubDate>
		<dc:creator>Grant Ingersoll</dc:creator>
				<category><![CDATA[apache]]></category>
		<category><![CDATA[Lucene]]></category>

		<guid isPermaLink="false">http://www.lucidimagination.com/blog/?p=879</guid>
		<description><![CDATA[<p>&#8230;</p>]]></description>
			<content:encoded><![CDATA[<p><object width="425" height="344" data="http://www.youtube.com/v/XyDQAY9dwsQ&amp;hl=en&amp;fs=1&amp;" type="application/x-shockwave-flash"><param name="allowFullScreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="src" value="http://www.youtube.com/v/XyDQAY9dwsQ&amp;hl=en&amp;fs=1&amp;" /><param name="allowfullscreen" value="true" /></object></p>
]]></content:encoded>
			<wfw:commentRss>http://www.lucidimagination.com/blog/2009/07/23/asf-interview-with-apache-lucene-creator-doug-cutting/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Thoughts on Efficiency of Enterprise Search on eWeek.com</title>
		<link>http://www.lucidimagination.com/blog/2009/07/16/thoughts-on-efficiency-of-enterprise-search-on-eweekcom/</link>
		<comments>http://www.lucidimagination.com/blog/2009/07/16/thoughts-on-efficiency-of-enterprise-search-on-eweekcom/#comments</comments>
		<pubDate>Thu, 16 Jul 2009 15:11:39 +0000</pubDate>
		<dc:creator>Grant Ingersoll</dc:creator>
				<category><![CDATA[Enterprise Search]]></category>
		<category><![CDATA[Events]]></category>
		<category><![CDATA[Lucene]]></category>
		<category><![CDATA[Mahout]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[Solr]]></category>
		<category><![CDATA[apache]]></category>
		<category><![CDATA[OpenNLP]]></category>
		<category><![CDATA[Tika]]></category>

		<guid isPermaLink="false">http://www.lucidimagination.com/blog/?p=796</guid>
		<description><![CDATA[<p>eWeek.com recently posted a <a href="http://www.eweek.com/c/a/Search-Engines/How-to-Improve-the-Efficiency-of-Enterprise-Search/">nice article</a> by Dr. Yves Schabes, founder of <a href="http://www.teragram.com/">Teragram</a>, on how to make enterprise search better through some higher order processing techniques like metadata generation, applying taxonomies, etc. and doing <a href="http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Debugging-Relevance-Issues-Search">relevance testing</a> on a regular basis.  Naturally, this got me thinking about all the different ways this relates to the Apache Lucene ecosystem (<a href="http://lucene.apache.org/java">Lucene</a>, <a href="http://lucene.apache.org/solr">Solr</a>, <a href="http://lucene.apache.org/mahout">Mahout</a>, <a href="http://lucene.apache.org/tika">Tika</a>, etc.) and Lucid Imagination.</p>
<p>First, by choosing an &#8230;</p>]]></description>
			<content:encoded><![CDATA[<p>eWeek.com recently posted a <a href="http://www.eweek.com/c/a/Search-Engines/How-to-Improve-the-Efficiency-of-Enterprise-Search/">nice article</a> by Dr. Yves Schabes, founder of <a href="http://www.teragram.com/">Teragram</a>, on how to make enterprise search better through some higher order processing techniques like metadata generation, applying taxonomies, etc. and doing <a href="http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Debugging-Relevance-Issues-Search">relevance testing</a> on a regular basis.  Naturally, this got me thinking about all the different ways this relates to the Apache Lucene ecosystem (<a href="http://lucene.apache.org/java">Lucene</a>, <a href="http://lucene.apache.org/solr">Solr</a>, <a href="http://lucene.apache.org/mahout">Mahout</a>, <a href="http://lucene.apache.org/tika">Tika</a>, etc.) and Lucid Imagination.</p>
<p>First, by choosing an open backbone like Lucene and Solr, you are free to plug in the best tool for the job; proprietary solutions often limit you to their own tools and their implementation.  Let&#8217;s face it, we can&#8217;t be good at everything, so it makes sense to be able to plug in the best-of-breed tool for something that isn&#8217;t a core competency.  For example, one could choose <a href="http://opennlp.sourceforge.net/">OpenNLP</a> or Teragram or any other commercial vendor for these capabilities.  Solr, especially, makes it simple to plug in these capabilities through its well-defined plugin architecture.  (By the way, for almost every capability out there in this realm, there is an open source alternative that warrants investigation.)</p>
<p>Second, intelligent search&#8211;in other words, search that goes beyond simple keyword capabilities&#8211;is the leading edge of the field and is being adopted in more and more products, just as Dr. Schabes recommends.  Whether it is intelligent query parsing, better faceting and discovery capabilities or integration with natural language processing (NLP) tools for NER (Named Entity Recognition), sentiment analysis and relationship discovery, the companies making a difference in search are those that intelligently bring together a variety of approaches to solve the problem at hand.  I believe Lucene, Solr and open source are uniquely positioned to fuel intelligent search because they drive down the <a href="http://www.lucidimagination.com/blog/2009/04/20/lucene-open-source-and-the-cost-of-experimentation/">cost of experimentation</a>.  That matters because it takes effort to get this stuff right, much of it spent understanding your domain and translating it into a good user experience.  Furthermore, open source lets you cost-effectively fill in your infrastructure and conserve your precious resources for your core competencies.  Why would you pay millions of dollars for a search engine that implements a <a href="http://en.wikipedia.org/wiki/Vector_space_model">vector space retrieval model</a> (which most of the commercial vendors do) when you can get the same thing from Lucene for free?  If you suspect Lucene isn&#8217;t as good, think again; there&#8217;s a reason it is used at the likes of Apple, AOL, Comcast, CNET, Viacom and thousands of others.  
If you like bells and whistles and knowing there is a company behind your chosen solution, I&#8217;ll do you three better:  with Lucene and Solr you not only get 1) a <a href="http://www.lucidimagination.com">company that offers support, training, professional services, and bells and whistles</a>, you also get 2) the very large Apache community of users, who constantly use/test/fix/improve the software, and 3) all of the source code, completely unencumbered, so you are free to change it as you see fit.</p>
<p>Finally, you get to choose whether you even need a particular capability.  On more than one occasion, I have been involved in the replacement of a proprietary search package so bloated with unused add-ons that the Solr installation, containing only the required functions, needed an index that was a mere fraction of the size of the proprietary solution&#8217;s, resulting in:</p>
<ul>
<li>Less hardware to achieve the same throughput</li>
<li>Lower operations costs, since more hardware means more hardware failures</li>
<li>Faster indexing, faster queries, etc.</li>
</ul>
<p>In short, Lucene and Solr offer a cost-effective and fully capable mechanism for improving the efficiency of search along the lines of the approach Dr. Schabes recommends, giving you the freedom to choose based on your idea of what works, not someone else&#8217;s.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.lucidimagination.com/blog/2009/07/16/thoughts-on-efficiency-of-enterprise-search-on-eweekcom/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Apache Nutch 1.0 released</title>
		<link>http://www.lucidimagination.com/blog/2009/03/28/apache-nutch-10-released/</link>
		<comments>http://www.lucidimagination.com/blog/2009/03/28/apache-nutch-10-released/#comments</comments>
		<pubDate>Sat, 28 Mar 2009 20:08:48 +0000</pubDate>
		<dc:creator>Sami Siren</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[apache]]></category>
		<category><![CDATA[nutch]]></category>

		<guid isPermaLink="false">http://www.lucidimagination.com/blog/?p=578</guid>
		<description><![CDATA[<p>Apache <a href="http://lucene.apache.org/nutch/">Nutch</a>, a subproject of Apache Lucene, is open source web-search software. It builds on Lucene Java, adding web-specifics such as a crawler, a link-graph database, and parsers for HTML and other document formats.</p>
<p>Apache Nutch 1.0 contains almost <a href="https://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true&#38;pid=10680&#38;fixfor=12312443">200</a> resolved issues and improvements, such as <a href="http://www.lucidimagination.com/blog/2009/03/09/nutch-solr/">Solr Integration</a>, a new indexing framework, and a <a href="http://wiki.apache.org/nutch/NewScoringIndexingExample">new scoring framework</a>, to mention just a few.</p>
<p>Nutch 1.0 is available from <a href="http://www.apache.org/dyn/closer.cgi/lucene/nutch/nutch-1.0.tar.gz">here</a>.&#8230;</p>]]></description>
			<content:encoded><![CDATA[<p>Apache <a href="http://lucene.apache.org/nutch/">Nutch</a>, a subproject of Apache Lucene, is open source web-search software. It builds on Lucene Java, adding web-specifics such as a crawler, a link-graph database, and parsers for HTML and other document formats.</p>
<p>Apache Nutch 1.0 contains almost <a href="https://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true&amp;pid=10680&amp;fixfor=12312443">200</a> resolved issues and improvements, such as <a href="http://www.lucidimagination.com/blog/2009/03/09/nutch-solr/">Solr Integration</a>, a new indexing framework, and a <a href="http://wiki.apache.org/nutch/NewScoringIndexingExample">new scoring framework</a>, to mention just a few.</p>
<p>Nutch 1.0 is available from <a href="http://www.apache.org/dyn/closer.cgi/lucene/nutch/nutch-1.0.tar.gz">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.lucidimagination.com/blog/2009/03/28/apache-nutch-10-released/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

