<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Lucid Imagination &#187; Uncategorized</title>
	<atom:link href="http://www.lucidimagination.com/blog/category/uncategorized/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.lucidimagination.com/blog</link>
	<description>Exclusively dedicated to Apache Lucene/Solr open source search technology</description>
	<lastBuildDate>Sat, 04 Feb 2012 01:12:03 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3</generator>
		<item>
		<title>Lucene Revolution 2012 &#8211; Call for Participation now open!</title>
		<link>http://www.lucidimagination.com/blog/2012/01/30/lucene-revolution-2012-call-for-participation-now-open/</link>
		<comments>http://www.lucidimagination.com/blog/2012/01/30/lucene-revolution-2012-call-for-participation-now-open/#comments</comments>
		<pubDate>Mon, 30 Jan 2012 20:35:10 +0000</pubDate>
		<dc:creator>Ameena</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.lucidimagination.com/blog/?p=4662</guid>
		<description><![CDATA[<p><a href="http://lucenerevolution.org/blog/wp-content/uploads/2012/01/LR2012_banner676x192.png"><img class="alignright size-medium wp-image-283" title="LR2012_banner676x192" src="http://lucenerevolution.org/blog/wp-content/uploads/2012/01/LR2012_banner676x192-300x85.png" alt="" width="300" height="85" /></a>Mark your calendars today! The largest worldwide conference dedicated to Lucene and Solr will take place in Boston May 7-10.</p>
<p>The 2012 conference will build on the success of last year&#8217;s Lucene Revolution in San Francisco. Sponsored by Lucid Imagination with additional support from community and other commercial co-sponsors, we&#8217;ll be adding new sessions, new speakers, and new training sessions to the agenda. Lucid Imagination is the commercial entity exclusively dedicated to Apache Lucene/Solr open &#8230;</p>]]></description>
			<content:encoded><![CDATA[<p><a href="http://lucenerevolution.org/blog/wp-content/uploads/2012/01/LR2012_banner676x192.png"><img class="alignright size-medium wp-image-283" title="LR2012_banner676x192" src="http://lucenerevolution.org/blog/wp-content/uploads/2012/01/LR2012_banner676x192-300x85.png" alt="" width="300" height="85" /></a>Mark your calendars today! The largest worldwide conference dedicated to Lucene and Solr will take place in Boston May 7-10.</p>
<p>The 2012 conference will build on the success of last year&#8217;s Lucene Revolution in San Francisco. Sponsored by Lucid Imagination with additional support from community and other commercial co-sponsors, we&#8217;ll be adding new sessions, new speakers, and new training sessions to the agenda. Lucid Imagination is the commercial entity exclusively dedicated to Apache Lucene/Solr open source search technology.</p>
<p>Registration will begin shortly &#8211; so make sure to save the date. In the meantime, the Call For Participation (CFP) is now open for Lucene Revolution 2012. If you have a great Solr or Lucene talk, this is a fantastic opportunity to share it with the community.</p>
<p>To submit a proposal for a 45-minute presentation, please complete the form at: <a href="http://lucenerevolution.org/Call_for_Participation">http://lucenerevolution.org/Call_for_Participation</a></p>
<p>Topics of interest include: <a href="http://lucenerevolution.org/blog/wp-content/uploads/2012/01/collage_black.png"><img class="alignright size-medium wp-image-284" title="collage_black" src="http://lucenerevolution.org/blog/wp-content/uploads/2012/01/collage_black-300x225.png" alt="" width="300" height="225" /></a><br />
- Lucene and Solr in the Enterprise (case studies, implementation, return on investment, etc.)<br />
- Use of LucidWorks Enterprise<br />
- &#8220;How We Did It&#8221; development case studies<br />
- Lucene/Solr technology deep dives: features, how to use, etc.<br />
- Spatial/Geo/local search<br />
- Lucene and Solr in the Cloud<br />
- Scalability and performance tuning<br />
- Large Scale Search<br />
- Real Time Search (or NRT search)<br />
- Data Integration/Data Management<br />
- Lucene &amp; Solr for Mobile Applications<br />
- Associated technologies: Mahout, Nutch, NLP, etc.</p>
<p>All accepted speakers will get complimentary conference passes. Financial assistance is available for speakers who qualify.</p>
<p>Submissions must be received by March 9th, 2012, 12 Midnight PST.</p>
<p><strong>KEY DATES</strong><br />
March 9, 2012: Call for Participation Closes<br />
March 23, 2012: Speaker Acceptance/Rejection Notification<br />
May 7-8, 2012: Training<br />
May 9-10, 2012: Conference</p>
<p>If you have more than one topic that you would like to propose, please complete an additional online form for each. To be considered, proposals must be received by 12 Midnight PST, March 9th, 2012.</p>
<p>Interested in registration or other conference news? Want to be added to the conference mailing list? Is your organization interested in sponsorship opportunities? Please send an email to: <a href="mailto:info@lucenerevolution.org">info@lucenerevolution.org</a></p>
<p>We look forward to seeing you in Boston!</p>
<p>Cross post from Lucene Revolution blog.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.lucidimagination.com/blog/2012/01/30/lucene-revolution-2012-call-for-participation-now-open/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>SolrCloud is Coming (and looking to mix in even more &#8216;NoSQL&#8217;)</title>
		<link>http://www.lucidimagination.com/blog/2012/01/23/solrcloud-is-coming-and-looking-to-mix-in-even-more-nosql/</link>
		<comments>http://www.lucidimagination.com/blog/2012/01/23/solrcloud-is-coming-and-looking-to-mix-in-even-more-nosql/#comments</comments>
		<pubDate>Mon, 23 Jan 2012 14:40:19 +0000</pubDate>
		<dc:creator>Mark Miller</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.lucidimagination.com/blog/?p=4626</guid>
		<description><![CDATA[<p>The second phase of SolrCloud has been in full swing for a couple of months now and it looks like we are going to be able to commit this work to trunk very soon! In Phase 1 we built on top of Solr&#8217;s distributed search capabilities and added cluster state, central config, and built-in read side fault tolerance. Phase 2 is even more ambitious and focuses on the write side. We are talking full-blown fault tolerance for reads &#8230;</p>]]></description>
			<content:encoded><![CDATA[<p>The second phase of SolrCloud has been in full swing for a couple of months now and it looks like we are going to be able to commit this work to trunk very soon! In Phase 1 we built on top of Solr&#8217;s distributed search capabilities and added cluster state, central config, and built-in read side fault tolerance. Phase 2 is even more ambitious and focuses on the write side. We are talking full-blown fault tolerance for reads and writes, near real-time support, real-time GET, true single node durability, optimistic locking, cluster elasticity, improvements to the Phase 1 features, and more.</p>
<p>Once we get Phase 2 into trunk we will work on hardening and finishing a couple of missing features &#8211; then SolrCloud should be ready to be part of the upcoming Lucene/Solr 4.0 release.</p>
<p>If you want to read more about SolrCloud and where we are with Phase 2, check out the new wiki page that we are working on at <a href="http://wiki.apache.org/solr/SolrCloud">http://wiki.apache.org/solr/SolrCloud</a> - feedback appreciated!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.lucidimagination.com/blog/2012/01/23/solrcloud-is-coming-and-looking-to-mix-in-even-more-nosql/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Our Solr Reference Guide updated for v3.5</title>
		<link>http://www.lucidimagination.com/blog/2012/01/20/our-solr-reference-guide-updated-for-v3-5/</link>
		<comments>http://www.lucidimagination.com/blog/2012/01/20/our-solr-reference-guide-updated-for-v3-5/#comments</comments>
		<pubDate>Fri, 20 Jan 2012 18:49:03 +0000</pubDate>
		<dc:creator>Cassandra Targett</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.lucidimagination.com/blog/?p=4619</guid>
		<description><![CDATA[<p>The Solr Reference Guide has been updated for the 3.5 release of Solr and Lucene. Only minor changes were needed this time around. In particular, we added information on:</p>
<ul>
<li>Support for the Hunspell stemmer</li>
<li>The new <code>langid</code> UpdateProcessor</li>
<li>Numeric types now support <code>sortMissingFirst/Last</code></li>
<li>New parameter <code>hl.q</code> for use with highlighting</li>
<li>Field types supported by the StatsComponent now include date and string fields</li>
</ul>
<p>The Solr Reference Guide is available for free <a href="http://lucidworks.lucidimagination.com/display/solr">online</a> or as a <a href="http://www.lucidimagination.com/devzone/references/solr-guide">downloadable </a>&#8230;</p>]]></description>
			<content:encoded><![CDATA[<p>The Solr Reference Guide has been updated for the 3.5 release of Solr and Lucene. Only minor changes were needed this time around. In particular, we added information on:</p>
<ul>
<li>Support for the Hunspell stemmer</li>
<li>The new <code>langid</code> UpdateProcessor</li>
<li>Numeric types now support <code>sortMissingFirst/Last</code></li>
<li>New parameter <code>hl.q</code> for use with highlighting</li>
<li>Field types supported by the StatsComponent now include date and string fields</li>
</ul>
<p>The Solr Reference Guide is available for free <a href="http://lucidworks.lucidimagination.com/display/solr">online</a> or as a <a href="http://www.lucidimagination.com/devzone/references/solr-guide">downloadable PDF</a>.  The 3.4 version of the Guide is available in <a href="http://www.lucidimagination.com/devzone/references/solr-guide">PDF only</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.lucidimagination.com/blog/2012/01/20/our-solr-reference-guide-updated-for-v3-5/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Enhancing Discovery with Solr and Mahout &#8211; session slides now available!</title>
		<link>http://www.lucidimagination.com/blog/2012/01/09/enhancing-discovery-with-solr-and-mahout-apache-mahout-user-meeting/</link>
		<comments>http://www.lucidimagination.com/blog/2012/01/09/enhancing-discovery-with-solr-and-mahout-apache-mahout-user-meeting/#comments</comments>
		<pubDate>Mon, 09 Jan 2012 21:41:51 +0000</pubDate>
		<dc:creator>Ameena</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.lucidimagination.com/blog/?p=4607</guid>
		<description><![CDATA[<p><strong>Date:</strong> Thursday, January 19, 2012<br />
<strong>Time: </strong>7:00 PM &#8211; 9:00 PM<br />
<strong>Location:</strong> 12200 Olympic Blvd, Los Angeles, CA</p>
<p>The latest Los Angeles/OC Apache Lucene/Solr User Group meeting was held at Shopzilla in LA. We had Grant Ingersoll from Lucid Imagination speaking at the event. In this talk, Grant spoke about some of the tools available (recommendations, faceting options, amongst others) in Solr and Mahout to aid in the discovery process and how these two &#8230;</p>]]></description>
			<content:encoded><![CDATA[<p><strong>Date:</strong> Thursday, January 19, 2012<br />
<strong>Time: </strong>7:00 PM &#8211; 9:00 PM<br />
<strong>Location:</strong> 12200 Olympic Blvd, Los Angeles, CA</p>
<p>The latest Los Angeles/OC Apache Lucene/Solr User Group meeting was held at Shopzilla in LA. We had Grant Ingersoll from Lucid Imagination speaking at the event. In this talk, Grant spoke about some of the tools available (recommendations, faceting options, amongst others) in Solr and Mahout to aid in the discovery process and how these two open source projects can work together.</p>
<p>Session slides are now available <a href="http://www.lucidimagination.com/devzone/events/meet-ups/enhancing-discovery-solr-and-mahout-Jan-19-2012-LA">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.lucidimagination.com/blog/2012/01/09/enhancing-discovery-with-solr-and-mahout-apache-mahout-user-meeting/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Solr and LucidWorks feature matrix available</title>
		<link>http://www.lucidimagination.com/blog/2012/01/03/solr-and-lucidworks-feature-matrix-available/</link>
		<comments>http://www.lucidimagination.com/blog/2012/01/03/solr-and-lucidworks-feature-matrix-available/#comments</comments>
		<pubDate>Tue, 03 Jan 2012 21:51:08 +0000</pubDate>
		<dc:creator>Cassandra Targett</dc:creator>
				<category><![CDATA[LucidWorks]]></category>
		<category><![CDATA[Solr]]></category>
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.lucidimagination.com/blog/?p=4589</guid>
		<description><![CDATA[<p>We get asked a lot by customers what&#8217;s in a new Solr/Lucene release that applies to them, and with our own LucidWorks Platform available, customers naturally want to know what they&#8217;ll get that they don&#8217;t already have. If you&#8217;re happily running along on Solr 1.4, why or when should you update to a newer version? Should you migrate to LucidWorks?</p>
<p>So we decided to try to put together a matrix of major features and show &#8230;</p>]]></description>
			<content:encoded><![CDATA[<p>We get asked a lot by customers what&#8217;s in a new Solr/Lucene release that applies to them, and with our own LucidWorks Platform available, customers naturally want to know what they&#8217;ll get that they don&#8217;t already have. If you&#8217;re happily running along on Solr 1.4, why or when should you update to a newer version? Should you migrate to LucidWorks?</p>
<p>So we decided to try to put together a matrix of major features and show in which versions they are available. Solr 1.4 is pretty old by now, so it naturally appears not to hold up well against Solr 3.5, Solr Trunk, or LucidWorks. Think of it as the base from which the later features in the list grow.</p>
<p>This was an interesting exercise to work through. It&#8217;s easy to read through the changes.txt for each release and try to include everything in a list such as this (and our Support guys are probably disappointed that I didn&#8217;t do that), but I tried to keep it to the major innovations or bug fixes so it stays somewhat readable. But there&#8217;s always the question of whether it&#8217;s too much or too little detail.</p>
<p>I hope it&#8217;s useful and we&#8217;d like to know what you think. Is it worthwhile? Should we go to deeper detail? Could the features use more explanation? Look it over at <a href="http://www.lucidimagination.com/devzone/references/feature-matrix-solr-and-lucidworks">Feature Matrix for Solr and LucidWorks</a> and share your suggestions in the comments.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.lucidimagination.com/blog/2012/01/03/solr-and-lucidworks-feature-matrix-available/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>LucidWorks Enterprise latest version 2.0.1 released!</title>
		<link>http://www.lucidimagination.com/blog/2011/12/29/lucidworks-enterprise-latest-version-2-0-1-released/</link>
		<comments>http://www.lucidimagination.com/blog/2011/12/29/lucidworks-enterprise-latest-version-2-0-1-released/#comments</comments>
		<pubDate>Thu, 29 Dec 2011 16:35:50 +0000</pubDate>
		<dc:creator>Ameena</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.lucidimagination.com/blog/?p=4577</guid>
		<description><![CDATA[<p>LucidWorks Enterprise 2.0.1 is an interim bug-fix release. We have resolved a couple of critical bugs and LDAP integration issues. The list of issues resolved with this update is available <a href="http://lucidworks.lucidimagination.com/display/lweug/Changes+from+LucidWorks+v2.0+to+2.0.1">here</a>.</p>
<p><strong>Download</strong></p>
<p>You can download the latest version 2.0.1 <a href="http://www.lucidimagination.com/products/lucidworks-search-platform/enterprise">here</a>.</p>
<p><strong>Install </strong></p>
<p>If you are running LucidWorks Enterprise 1.7 or LucidWorks 1.8, you can use the <a href="http://lucidworks.lucidimagination.com/display/lweug/Migrating+from+a+Prior+Version">upgrade scripts</a> and move to version 2.0.1.</p>
<p>For those of you running LucidWorks Enterprise 2.0, you can now &#8230;</p>]]></description>
			<content:encoded><![CDATA[<p>LucidWorks Enterprise 2.0.1 is an interim bug-fix release. We have resolved a couple of critical bugs and LDAP integration issues. The list of issues resolved with this update is available <a href="http://lucidworks.lucidimagination.com/display/lweug/Changes+from+LucidWorks+v2.0+to+2.0.1">here</a>.</p>
<p><strong>Download</strong></p>
<p>You can download the latest version 2.0.1 <a href="http://www.lucidimagination.com/products/lucidworks-search-platform/enterprise">here</a>.</p>
<p><strong>Install </strong></p>
<p>If you are running LucidWorks Enterprise 1.7 or LucidWorks 1.8, you can use the <a href="http://lucidworks.lucidimagination.com/display/lweug/Migrating+from+a+Prior+Version">upgrade scripts</a> and move to version 2.0.1.</p>
<p>For those of you running LucidWorks Enterprise 2.0, you can now upgrade to LucidWorks Enterprise 2.0.1 by following the <a href="http://lucidworks.lucidimagination.com/display/lweug/Upgrade+Instructions+for+v2.0+to+v2.0.1">steps outlined here</a>. You can also find this file in your 2.0.1 (.jar and .zip) package.</p>
<p><strong>More Resources</strong></p>
<p>Please visit our <a href="http://lucidworks.lucidimagination.com/display/lweug/LucidWorks+Platform+Documentation">documentation page</a> to get started and for an in-depth review of the product&#8217;s functionality.</p>
<p>You can participate in our <a href="http://www.lucidimagination.com/forum/">forums</a> and share your experiences, questions, and issues. We actively monitor our forums and respond to help you use the LucidWorks search platform.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.lucidimagination.com/blog/2011/12/29/lucidworks-enterprise-latest-version-2-0-1-released/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Dallas JavaMUG December 14th 2011</title>
		<link>http://www.lucidimagination.com/blog/2011/12/14/dallas-javamug-december-14th-2011/</link>
		<comments>http://www.lucidimagination.com/blog/2011/12/14/dallas-javamug-december-14th-2011/#comments</comments>
		<pubDate>Wed, 14 Dec 2011 15:53:39 +0000</pubDate>
		<dc:creator>Ameena</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.lucidimagination.com/blog/?p=4561</guid>
		<description><![CDATA[<p>The next JavaMUG meeting is on December 14th 2011. Erik Hatcher from <a href="http://www.lucidimagination.com">Lucid Imagination</a> will be presenting at the event. He will talk about Apache Solr, its features and benefits. This will be an introductory Solr talk.</p>
<p>Apache <a href="http://lucene.apache.org/solr/">Solr</a> serves search requests at enterprises and the largest companies around the world. Built on top of the top-notch Apache <a href="http://lucene.apache.org/">Lucene</a> library, Solr makes integrating indexing and searching into your applications straightforward.</p>
<p>Solr provides faceted navigation, spell &#8230;</p>]]></description>
			<content:encoded><![CDATA[<p>The next JavaMUG meeting is on December 14th 2011. Erik Hatcher from <a href="http://www.lucidimagination.com">Lucid Imagination</a> will be presenting at the event. He will talk about Apache Solr, its features and benefits. This will be an introductory Solr talk.</p>
<p>Apache <a href="http://lucene.apache.org/solr/">Solr</a> serves search requests at enterprises and the largest companies around the world. Built on top of the top-notch Apache <a href="http://lucene.apache.org/">Lucene</a> library, Solr makes integrating indexing and searching into your applications straightforward.</p>
<p>Solr provides faceted navigation, spell checking, highlighting, clustering, grouping, and other search features. Solr also scales query volume with replication and collection size with distributed capabilities. Solr can index rich documents such as PDF, Word, HTML, and other file types.</p>
<p>For more details on time and location, visit <a href="http://javamug.org/">http://javamug.org/</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.lucidimagination.com/blog/2011/12/14/dallas-javamug-december-14th-2011/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>What&#8217;s with lowercasing wildcard (multiterm) queries in Solr?</title>
		<link>http://www.lucidimagination.com/blog/2011/11/29/whats-with-lowercasing-wildcard-multiterm-queries-in-solr/</link>
		<comments>http://www.lucidimagination.com/blog/2011/11/29/whats-with-lowercasing-wildcard-multiterm-queries-in-solr/#comments</comments>
		<pubDate>Tue, 29 Nov 2011 21:37:25 +0000</pubDate>
		<dc:creator>Erick Erickson</dc:creator>
				<category><![CDATA[schema]]></category>
		<category><![CDATA[Solr]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[wildcards multiterm queryparser]]></category>

		<guid isPermaLink="false">http://www.lucidimagination.com/blog/?p=4476</guid>
		<description><![CDATA[<h1><span class="Apple-style-span" style="font-size: 20px;">Wildcard query terms aren&#8217;t analyzed, why is that?</span></h1>
<p>Prior to the current 3x branch (which will be released as 3.6) and the trunk (4.0) Solr code, users have frequently been perplexed by wildcard searching being un-analyzed, often manifesting in case sensitivity. Say you have an analysis chain in your schema.xml file defined as follows and a field named <code>field_lc</code> of this type:</p>
<pre>&#60;fieldType name="lowercase" class="solr.TextField" &#62;
  &#60;analyzer&#62;
    &#60;tokenizer class="solr.WhitespaceTokenizerFactory"/&#62;
    &#60;filter class="solr.LowerCaseFilterFactory" /&#62;
  &#60;/analyzer&#62;
&#60;/fieldType&#62;
</pre>
<p>Now, you index &#8230;</p>]]></description>
			<content:encoded><![CDATA[<h1><span class="Apple-style-span" style="font-size: 20px;">Wildcard query terms aren&#8217;t analyzed, why is that?</span></h1>
<p>Prior to the current 3x branch (which will be released as 3.6) and the trunk (4.0) Solr code, users have frequently been perplexed by wildcard searching being un-analyzed, often manifesting in case sensitivity. Say you have an analysis chain in your schema.xml file defined as follows and a field named <code>field_lc</code> of this type:</p>
<pre>&lt;fieldType name="lowercase" class="solr.TextField" &gt;
  &lt;analyzer&gt;
    &lt;tokenizer class="solr.WhitespaceTokenizerFactory"/&gt;
    &lt;filter class="solr.LowerCaseFilterFactory" /&gt;
  &lt;/analyzer&gt;
&lt;/fieldType&gt;
</pre>
<p>Now, you index the text &#8220;My Dog Has Fleas&#8221;. So far, so good. Searching on this field as<br />
<code>field_lc:fleas</code> returns the document, as does <code>field_lc:flea*</code>.</p>
<p>But now you search on <code>field_lc:Flea*</code> and you don&#8217;t get any results. What?!?!?! Nearly everyone scratches their heads about this, and it&#8217;s a question that often comes up on the Solr users&#8217; list. Users wonder why the analysis chain above isn&#8217;t applied to the wildcard queries. It turns out that it&#8217;s trickier than you might think at first. What happens when a single input term gets split up into multiple parts? For instance, those of you familiar with WordDelimiterFilterFactory (WDDF) know that it can split on case change. What does it mean to parse &#8216;fleA*&#8217;? Applying WDDF might well give the two tokens &#8216;fle&#8217; and &#8216;A&#8217; and possibly &#8216;fleA&#8217;. If a wildcard is present, what tokens should be emitted?</p>
<ol>
<li>&#8216;fleA*&#8217;</li>
<li>&#8216;fle*&#8217;, &#8216;A*&#8217;, &#8216;fleA*&#8217;</li>
<li>&#8216;fle*&#8217;, &#8216;A*&#8217;</li>
<li>&lt;insert your solution here&gt;</li>
</ol>
<p>You can, I daresay, create any rule that suits your fancy. And it&#8217;ll be wrong in some situations. Of particular horror is anything that produces &#8216;A*&#8217; as above: conceptually, you&#8217;d then have an enormous OR clause consisting of all the terms in your index that start with &#8216;A&#8217;. Unless you had a rule like &#8220;only do this if the preceding fragment was 2 characters or more&#8221;. But then someone would say &#8220;I need three characters&#8221;, so can WDDF provide a &#8220;wildCardMin=#&#8221; parameter? I already have trouble keeping all the WDDF parameters and their interactions in my head; going down this path would be a nightmare. And I haven&#8217;t even considered some of the <strong>really</strong> interesting issues, like how proximity would be incorporated in all this.</p>
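<p>For concreteness, here is a minimal sketch of the kind of chain that raises this question. The field type name <code>text_wdf</code> is made up for illustration, and the attribute values are just one plausible setup, not a recommendation:</p>
<pre>&lt;fieldType name="text_wdf" class="solr.TextField"&gt;
  &lt;analyzer&gt;
    &lt;tokenizer class="solr.WhitespaceTokenizerFactory"/&gt;
    &lt;!-- splitOnCaseChange breaks a term at a lower-to-upper transition;
         preserveOriginal also keeps the unsplit term --&gt;
    &lt;filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" splitOnCaseChange="1" preserveOriginal="1"/&gt;
    &lt;filter class="solr.LowerCaseFilterFactory"/&gt;
  &lt;/analyzer&gt;
&lt;/fieldType&gt;
</pre>
<p>With settings along these lines, an indexed term such as &#8216;fleA&#8217; would be expected to come out as several tokens, which is exactly why there is no obviously correct way to carry a trailing wildcard through the filter.</p>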
<h3>Wildcards aren&#8217;t the only issue</h3>
<p>The same issue occurs with accent folding, normalizations, and, really, any other component of an analysis chain that somehow changes the query terms. This behavior has mostly been ignored in past releases; it&#8217;s been up to the application programmer to manually &#8220;do the right thing&#8221; before sending the query to Solr. This often involves operations such as lower-casing and accent folding on the application side when a wildcard is encountered.</p>
<h1>The new way of handling these cases</h1>
<p>As of <a title="SOLR-2438" href="https://issues.apache.org/jira/browse/SOLR-2438">SOLR-2438</a>, this is no longer the case for a number of the most common components. A query analysis chain that contains any of the following components will automatically &#8220;do the right thing&#8221; and apply them for multi-term queries. If your analysis chain contains any of these elements, and you want them applied to &#8220;multi-term&#8221; queries, you don&#8217;t have to do anything at all; it will &#8220;just work&#8221;. At query time, the indicated transformations are applied to the query terms and everyone is happy. Or should be. Do note that it&#8217;s an all-or-nothing operation. <strong>All</strong> of the elements below that are found in the query analysis chain are applied to multi-term query terms.</p>
<ul>
<li>ASCIIFoldingFilterFactory</li>
<li>LowerCaseFilterFactory</li>
<li>LowerCaseTokenizerFactory</li>
<li>MappingCharFilterFactory</li>
<li>PersianCharFilterFactory</li>
</ul>
<p>Again, this effectively means you don&#8217;t need to care about these transformations any more. One note of explanation, though. I&#8217;ve talked about the &#8220;query analysis chain&#8221;. But what if you don&#8217;t have one? Remember that your <code>&lt;analyzer&gt;</code> tag can have one of several &#8216;type&#8217; values: &#8220;index&#8221;, &#8220;query&#8221;, or none. Well, if a &#8216; type=&#8221;query&#8221; &#8216; is found, that analysis chain is inspected and any of the above components are recorded to be used on multi-term queries. If no &#8216; type=&#8221;query&#8221; &#8216; is found, then the &#8216; type=&#8221;index&#8221; &#8216; is used. And if no &#8216; type=&#8221;index&#8221; &#8216; is found, then the one with no &#8216;type&#8217; parameter is used.</p>
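<p>To make the lookup order concrete, here is a sketch of a field type with separate index and query chains. The name <code>text_folded</code> and the particular mix of filters are only assumptions for the example:</p>
<pre>&lt;fieldType name="text_folded" class="solr.TextField"&gt;
  &lt;analyzer type="index"&gt;
    &lt;tokenizer class="solr.WhitespaceTokenizerFactory"/&gt;
    &lt;filter class="solr.LowerCaseFilterFactory"/&gt;
    &lt;filter class="solr.ASCIIFoldingFilterFactory"/&gt;
    &lt;filter class="solr.PorterStemFilterFactory"/&gt;
  &lt;/analyzer&gt;
  &lt;!-- A type="query" chain exists, so this is the one inspected for multi-term queries --&gt;
  &lt;analyzer type="query"&gt;
    &lt;tokenizer class="solr.WhitespaceTokenizerFactory"/&gt;
    &lt;filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/&gt;
    &lt;filter class="solr.LowerCaseFilterFactory"/&gt;
    &lt;filter class="solr.ASCIIFoldingFilterFactory"/&gt;
    &lt;filter class="solr.PorterStemFilterFactory"/&gt;
  &lt;/analyzer&gt;
&lt;/fieldType&gt;
</pre>
<p>Going by the list above, lower-casing and ASCII folding would be applied to a query like <code>Flea*</code>, while the synonym filter and the stemmer would be skipped because neither appears in that list.</p>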
<h2>What does &#8220;multi-term&#8221; mean anyway?</h2>
<p>I&#8217;ve also sprinkled the phrase &#8220;multi-term&#8221; around, and sometimes &#8220;wildcard&#8221;. It turns out that the simple wildcard case is a specialization of a broader category of queries, including:</p>
<ul>
<li>wildcard</li>
<li>range</li>
<li>prefix</li>
</ul>
<p>All of these are now handled as above.</p>
<h3>Expert level schema possibilities</h3>
<p>All of the above is automatic, but there are three immediate questions:</p>
<ul>
<li>what about some of the <em>other</em> components?</li>
<li>what if I need the old behavior?</li>
<li>what if I want something completely different?</li>
</ul>
<p>It turns out that all three of these questions have the same answer. But before I outline it, I want to emphasize that <strong>you very probably don&#8217;t need to care about what follows!</strong> You might need to know about this in special cases, so I&#8217;ll mention it here.</p>
<p>In the above explanations, I wrote that &#8220;analysis chain is inspected and any of the above components are recorded to be used on multi-term queries&#8221;. Well, what actually happens is that there&#8217;s a new analysis chain in town that can be specified in the schema.xml file called, you guessed it, &#8220;multiterm&#8221;. You specify it like this as part of a <code>&lt;fieldType&gt;</code>:</p>
<pre>
&lt;analyzer type="multiterm" &gt;
  &lt;tokenizer class="solr.WhitespaceTokenizerFactory"/&gt;
  &lt;filter class="solr.ASCIIFoldingFilterFactory" /&gt;
  &lt;filter class="solr.YourFavoriteFilterFactoryHere" /&gt;
&lt;/analyzer&gt;
</pre>
<p>You can put <em>any</em> component here that&#8217;s legal in a &#8216;type=&#8221;index&#8221; &#8216; or &#8216;type=&#8221;query&#8221; &#8216; analysis chain. If you wanted, for instance, to enforce the old-style behavior, you could specify</p>
<pre>  &lt;tokenizer class="solr.KeywordTokenizerFactory" /&gt;</pre>
<p>as the entire &#8220;multiterm&#8221; analysis chain. It seems a bit odd to use KeywordTokenizerFactory here, but this applies to the individual terms, not the entire input. So it&#8217;s in effect saying &#8220;don&#8217;t analyze the terms at all&#8221;. Sound familiar? This is just what happened historically.</p>
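<p>Put together, a field type that keeps the historical behavior might look like the sketch below. The name <code>text_raw_wildcards</code> is hypothetical, and the first analyzer is a placeholder for whatever chain you already have:</p>
<pre>&lt;fieldType name="text_raw_wildcards" class="solr.TextField"&gt;
  &lt;!-- Ordinary chain, used for indexing and for regular (non-multi-term) queries --&gt;
  &lt;analyzer&gt;
    &lt;tokenizer class="solr.WhitespaceTokenizerFactory"/&gt;
    &lt;filter class="solr.LowerCaseFilterFactory"/&gt;
  &lt;/analyzer&gt;
  &lt;!-- Explicit multiterm chain: wildcard, range, and prefix terms pass through untouched --&gt;
  &lt;analyzer type="multiterm"&gt;
    &lt;tokenizer class="solr.KeywordTokenizerFactory"/&gt;
  &lt;/analyzer&gt;
&lt;/fieldType&gt;
</pre>
<p>With a definition like this, <code>Flea*</code> should once again fail to match the indexed token <code>fleas</code>, since nothing lower-cases the wildcard term.</p>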
<h3>How does this relate to the automatic behavior?</h3>
<p>Well, what really happens under the covers is that if you don&#8217;t define your own &#8220;multiterm&#8221; analysis chain, Solr constructs one for you from the analyzers you <em>have</em> defined as outlined above; query, index or default, in that order.</p>
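<p>As an illustration only: if your query chain were a whitespace tokenizer followed by lower-casing and a stemmer, my understanding is that the generated chain would behave roughly like the hand-written one below. The KeywordTokenizerFactory here is an assumption carried over from the old-style example rather than a statement of what the code builds, so check the wiki or the source before relying on it:</p>
<pre>&lt;analyzer type="multiterm"&gt;
  &lt;!-- Assumed: each wildcard, range, or prefix term is treated as a single token --&gt;
  &lt;tokenizer class="solr.KeywordTokenizerFactory"/&gt;
  &lt;!-- Kept because LowerCaseFilterFactory is in the list of supported components;
       the stemmer is left out because it is not --&gt;
  &lt;filter class="solr.LowerCaseFilterFactory"/&gt;
&lt;/analyzer&gt;
</pre>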
<h2>Waaaaay under the covers, down in the code</h2>
<p>All this is accomplished by making components &#8220;multiterm aware&#8221;. This means implementing the &#8220;MultiTermAwareComponent&#8221; interface. Currently, the components listed above are the only ones that implement this interface, but others may be good candidates, and some of these are listed in JIRA <a title="SOLR-2921" href="https://issues.apache.org/jira/browse/SOLR-2921">SOLR-2921</a>. By and large, implementing these in the code <em>may</em> be trivial. What&#8217;s <em>not</em> trivial is understanding what &#8220;the right thing&#8221; is. Some examples:</p>
<ul>
<li>stemmers</li>
<li>various language-specific normalization filters</li>
<li>various language-specific lowercase filters</li>
<li>various ICU filters</li>
</ul>
<p>The reason these haven&#8217;t been made &#8220;multi term aware&#8221; is the usual open-source reason; &#8220;What we have is a good step forward, we shouldn&#8217;t delay everything in order to get the last use cases taken care of&#8221;. In other words the implementors (me in this case, with lots of help from others) are tired <img src='http://www.lucidimagination.com/blog/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> .</p>
<p>Anyone who really understands what the right thing to do in the cases of components that do not yet implement &#8220;MultiTermAwareComponent&#8221; and could provide use cases for them would be giving us a great help, especially by providing examples illustrating the correct inputs and outputs for wildcard cases. And some examples of what should <em>not</em> come out as well. Or even better, a draft JUnit test that would show the expected behavior. Or even better yet, a full patch!</p>
<p>Any modification that potentially produces more than one token needs to be handled with care; see the code for LowerCaseTokenizerFactory for a case in point. Solr will now throw an exception if the transformation produces more than one token, so tread cautiously!</p>
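<p>To make the one-token rule concrete, here is a minimal, self-contained sketch in plain Java. It is not Solr&#8217;s implementation: <code>MultiTermSketch</code>, <code>Step</code>, <code>analyzePrefix</code> and the two sample steps are hypothetical names made up for illustration, and the sketch only handles a trailing <code>*</code>. It assumes that every step must hand back exactly one token, otherwise the whole query is rejected.</p>

```java
import java.util.Locale;

// Illustrative only: none of these types are Solr or Lucene classes.
final class MultiTermSketch {

    /** Hypothetical stand-in for one analysis step: text in, tokens out. */
    interface Step {
        String[] apply(String text);
    }

    /** Lowercasing always yields exactly one token, so it is safe for wildcard terms. */
    static final Step LOWERCASE =
        text -> new String[] { text.toLowerCase(Locale.ROOT) };

    /** Whitespace splitting can yield several tokens, which a wildcard term cannot tolerate. */
    static final Step SPLIT_ON_WHITESPACE =
        text -> text.trim().split("\\s+");

    /** Runs the literal part of a prefix query such as "Foo*" through the given steps. */
    static String analyzePrefix(String prefixQuery, Step... steps) {
        if (!prefixQuery.endsWith("*")) {
            throw new IllegalArgumentException("expected a trailing '*': " + prefixQuery);
        }
        String literal = prefixQuery.substring(0, prefixQuery.length() - 1);
        for (Step step : steps) {
            String[] tokens = step.apply(literal);
            // A wildcard term has to stay a single term; anything else is an error.
            if (tokens.length != 1) {
                throw new IllegalStateException(
                    "analysis of \"" + literal + "\" produced " + tokens.length
                    + " tokens; exactly one is required");
            }
            literal = tokens[0];
        }
        return literal + "*";
    }

    public static void main(String[] args) {
        System.out.println(analyzePrefix("Foo*", LOWERCASE));
        try {
            analyzePrefix("New York*", LOWERCASE, SPLIT_ON_WHITESPACE);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

<p>Running <code>main</code> should show the lowercased prefix for the first query and a rejection message for the second, since splitting &#8220;New York&#8221; on whitespace yields two tokens.</p>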
<p>This change should remove a long-standing point of confusion for Solr users. We&#8217;d be very interested in any feedback from the community, and especially in any problems that crop up. SOLR-2438 has patches for both the 3x and 4x code lines, but it&#8217;s probably easier just to get a current 3x or 4x branch (or nightly build) if you want to test this &#8220;in the wild&#8221;; the code has been committed and built. Some work remains to incorporate this change into more analysis components. Anyone want to volunteer?</p>
<h2>Resources:</h2>
<p>Solr Wiki documentation: <a title="Multi Term Query Analysis" href="http://wiki.apache.org/solr/MultitermQueryAnalysis">Multi Term Query Analysis</a></p>
<p>Main JIRA (already in 3.6 and 4.0 code lines): <a title="SOLR-2438" href="https://issues.apache.org/jira/browse/SOLR-2438">SOLR-2438</a></p>
<p>JIRA for other components that are not yet &#8220;multiterm aware&#8221; but are possible future candidates: <a title="SOLR-2921" href="https://issues.apache.org/jira/browse/SOLR-2921">SOLR-2921</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.lucidimagination.com/blog/2011/11/29/whats-with-lowercasing-wildcard-multiterm-queries-in-solr/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Lucene/Solr 3.5 Released</title>
		<link>http://www.lucidimagination.com/blog/2011/11/28/lucenesolr-3-5-released/</link>
		<comments>http://www.lucidimagination.com/blog/2011/11/28/lucenesolr-3-5-released/#comments</comments>
		<pubDate>Mon, 28 Nov 2011 14:24:35 +0000</pubDate>
		<dc:creator>Mark Miller</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.lucidimagination.com/blog/?p=4486</guid>
		<description><![CDATA[<p>Official release announcement for Lucene/Solr 3.5:</p>
<h3><em>November 27 2011,</em> <strong>Apache Lucene™ 3.5.0 available</strong></h3>
<p>The Lucene PMC is pleased to announce the release of Apache Lucene 3.5.0.</p>
<p>Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.</p>
<p>This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release&#8230;</p>]]></description>
			<content:encoded><![CDATA[<p>Official release announcement for Lucene/Solr 3.5:</p>
<h3><em>November 27 2011,</em> <strong>Apache Lucene™ 3.5.0 available</strong></h3>
<p>The Lucene PMC is pleased to announce the release of Apache Lucene 3.5.0.</p>
<p>Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.</p>
<p>This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:</p>
<p><a href="http://www.apache.org/dyn/closer.cgi/lucene/java">http://www.apache.org/dyn/closer.cgi/lucene/java</a> (see note below).</p>
<p>See the CHANGES.txt file included with the release for a full list of details.</p>
<p><strong>Lucene 3.5.0 Release Highlights:</strong></p>
<ul>
<li>Added a very substantial (3-5X) RAM reduction required to hold the terms index on opening an IndexReader. (LUCENE-2205)</li>
<li>Added IndexSearcher.searchAfter which returns results after a specified ScoreDoc (e.g. last document on the previous page) to support deep paging use cases. (LUCENE-2215)</li>
<li>Added SearcherManager to manage sharing and reopening IndexSearchers across multiple search threads. Underlying IndexReader instances are safely closed if not referenced anymore. (LUCENE-3445, LUCENE-3558)</li>
<li>Added SearcherLifetimeManager which safely provides a consistent view of the index across multiple requests (e.g. paging/drilldown). (LUCENE-3558, LUCENE-3486)</li>
<li>Renamed IndexWriter.optimize to forceMerge to discourage use of this method since it is horribly costly and rarely justified anymore. (LUCENE-3439)</li>
<li>Added NGramPhraseQuery that speeds up phrase queries 30-50% when n-gram analysis is used. (LUCENE-3426)</li>
<li>Added a new reopen API (IndexReader.openIfChanged) that returns null instead of the old reader if there are no changes in the index. (LUCENE-3464)</li>
<li>Improvements to vector highlighting: support for more queries such as wildcards and boundary analysis for generated snippets (LUCENE-1824, LUCENE-1889)</li>
<li>IndexSearcher and IndexReader now perform additional checks to throw AlreadyClosedExceptions if searches are performed on a closed IndexReader. Performing searches on already closed reader can cause JVM crashes when invalid memory mapped files are referenced.</li>
<li>Several bugfixes, including a bug where closing an NRT reader after the writer was closed was incorrectly invoking the DeletionPolicy. See CHANGES.txt entries for full details.</li>
</ul>
<h3><em>27 November 2011,</em> <strong>Apache Solr™ 3.5.0 available</strong></h3>
<p>The Lucene PMC is pleased to announce the release of Apache Solr 3.5.0.</p>
<p>Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, rich document (e.g., Word, PDF) handling, and geospatial search. Solr is highly scalable, providing distributed search and index replication, and it powers the search and navigation features of many of the world&#8217;s largest internet sites.</p>
<p>This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:</p>
<p><a href="http://www.apache.org/dyn/closer.cgi/lucene/solr">http://www.apache.org/dyn/closer.cgi/lucene/solr</a> (see note below).</p>
<p>See the CHANGES.txt file included with the release for a full list of details.</p>
<p><strong>Solr 3.5.0 Release Highlights:</strong></p>
<ul>
<li>Bug fixes and improvements from Apache Lucene 3.5.0, including a very substantial (3-5X) RAM reduction required to hold the terms index on opening an IndexReader. (LUCENE-2205)</li>
<li>Added support for distributed result grouping. (SOLR-2066, SOLR-2776)</li>
<li>Added support for Hunspell stemmer TokenFilter supporting stemming for 99 languages. (SOLR-2769)</li>
<li>A new contrib module &#8220;langid&#8221; adds language identification capabilities as an Update Processor, using Tika&#8217;s LanguageIdentifier or Cybozu language-detection library (SOLR-1979)</li>
<li>Numeric types including Trie and date types now support sortMissingFirst/Last. (SOLR-2881)</li>
<li>Added hl.q parameter. It is optional and if it is specified, it overrides q parameter in Highlighter. (SOLR-1926)</li>
<li>Several minor bugfixes like date parsing for years from 0001-1000, ignored configurations when using QueryAnalyzer with SpellCheckComponent and many more. See CHANGES.txt entries for full details.</li>
</ul>
<p>Note: The Apache Software Foundation uses an extensive mirroring network for distributing releases. It is possible that the mirror you are using may not have replicated the release yet. If that is the case, please try another mirror. This also goes for Maven access.</p>
<p>Happy searching,</p>
<p>Apache Lucene/Solr Developers</p>
]]></content:encoded>
			<wfw:commentRss>http://www.lucidimagination.com/blog/2011/11/28/lucenesolr-3-5-released/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Solr Reference Guide 3.4 available!</title>
		<link>http://www.lucidimagination.com/blog/2011/11/21/solr-reference-guide-3-4-available/</link>
		<comments>http://www.lucidimagination.com/blog/2011/11/21/solr-reference-guide-3-4-available/#comments</comments>
		<pubDate>Mon, 21 Nov 2011 20:30:49 +0000</pubDate>
		<dc:creator>Ameena</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[ref guide]]></category>
		<category><![CDATA[solr guide]]></category>
		<category><![CDATA[Solr ref]]></category>
		<category><![CDATA[solr ref guide]]></category>
		<category><![CDATA[Solr reference guide]]></category>

		<guid isPermaLink="false">http://www.lucidimagination.com/blog/?p=4477</guid>
		<description><![CDATA[<p>Solr Reference Guide version 3.4 is now available. The Reference Guide is designed to provide descriptions of all the important features and functions of the LucidWorks for Solr Certified Distribution. You can either <a href="http://lucidworks.lucidimagination.com/display/solr/Apache+Solr+Reference+Guide">view it online</a> or <a href="http://www.lucidimagination.com/devzone/references/solr-guide">download it</a>. It will be of use at any point in the application lifecycle, whether you need detailed information about Solr or you are just getting started.&#8230;</p>]]></description>
			<content:encoded><![CDATA[<p>Solr Reference Guide version 3.4 is now available. The Reference Guide is designed to provide descriptions of all the important features and functions of the LucidWorks for Solr Certified Distribution. You can either <a href="http://lucidworks.lucidimagination.com/display/solr/Apache+Solr+Reference+Guide">view it online</a> or <a href="http://www.lucidimagination.com/devzone/references/solr-guide">download it</a>. It will be of use at any point in the application lifecycle, whether you need detailed information about Solr or you are just getting started.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.lucidimagination.com/blog/2011/11/21/solr-reference-guide-3-4-available/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

