<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Lucid Imagination &#187; spatial search</title>
	<atom:link href="http://www.lucidimagination.com/blog/category/spatial-search/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.lucidimagination.com/blog</link>
	<description>Exclusively dedicated to Apache Lucene/Solr open source search technology</description>
	<lastBuildDate>Sat, 04 Feb 2012 01:12:03 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3</generator>
		<item>
		<title>Solr Result Grouping / Field Collapsing</title>
		<link>http://www.lucidimagination.com/blog/2010/09/16/2446/</link>
		<comments>http://www.lucidimagination.com/blog/2010/09/16/2446/#comments</comments>
		<pubDate>Fri, 17 Sep 2010 01:52:26 +0000</pubDate>
		<dc:creator>yonik</dc:creator>
				<category><![CDATA[Enterprise Search]]></category>
		<category><![CDATA[functions]]></category>
		<category><![CDATA[Solr]]></category>
		<category><![CDATA[spatial search]]></category>
		<category><![CDATA[field collapsing]]></category>
		<category><![CDATA[function query]]></category>
		<category><![CDATA[geo search]]></category>
		<category><![CDATA[result grouping]]></category>
		<category><![CDATA[solr 4.0]]></category>

		<guid isPermaLink="false">http://www.lucidimagination.com/blog/?p=2446</guid>
		<description><![CDATA[<p><strong>Result Grouping</strong>, also called <strong>Field Collapsing</strong>, has been committed to Solr!<br />
This functionality limits the number of documents for each &#8220;group&#8221;, usually defined by the unique values in a field (just like field faceting).</p>
<p>You can think of it like faceted search, except instead of just getting a count, you get the top documents for that constraint or category.  There are tons of potential use cases:</p>
<ul>
<li>For web search, only show 1 or </li>&#8230;</ul>]]></description>
			<content:encoded><![CDATA[<p><strong>Result Grouping</strong>, also called <strong>Field Collapsing</strong>, has been committed to Solr!<br />
This functionality limits the number of documents for each &#8220;group&#8221;, usually defined by the unique values in a field (just like field faceting).</p>
<p>You can think of it like faceted search, except instead of just getting a count, you get the top documents for that constraint or category.  There are tons of potential use cases:</p>
<ul>
<li>For web search, only show 1 or 2 results for a given website by collapsing on a site field.</li>
<li>For email search, only show 1 or 2 results for a given email thread</li>
<li>For e-commerce, show the top 3 products for each store category (e.g. &#8220;electronics&#8221;, &#8220;housewares&#8221;)</li>
<li>Hiding duplicate documents at query time.</li>
</ul>
<p>In addition to being able to group by the values of a field, you can also group by the values of a function query.  Given that geo search works as a function query, this also opens up possibilities for showing top query matches within 1 mile, between 1 and 2 miles, etc.</p>
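<p>As a rough illustration of grouping by a field, here is a minimal sketch that builds a request URL using the <code>group</code>, <code>group.field</code> and <code>group.limit</code> parameters described on the Solr Wiki.  The host, the <code>/solr/select</code> path and the <code>site</code> field name are assumptions made for the example, and the helper function is hypothetical rather than part of any Solr client.</p>

```python
from urllib.parse import urlencode


def grouping_query(base_url, q, group_field, per_group=2):
    """Build a select URL asking Solr for the top documents in each group.

    base_url and group_field are placeholders for this sketch; adjust them
    to match your own Solr instance and schema.
    """
    params = [
        ("q", q),
        ("group", "true"),                # turn result grouping on
        ("group.field", group_field),     # one group per unique field value
        ("group.limit", str(per_group)),  # documents returned per group
    ]
    return base_url + "?" + urlencode(params)


# Hypothetical local instance and field name.
url = grouping_query("http://localhost:8983/solr/select", "solr", "site")
print(url)
```

<p>Raising <code>per_group</code> returns more documents per group, which matches the &#8220;only show 1 or 2 results per site&#8221; use case above.</p>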
<p>Just like faceting, we&#8217;ll be adding new functionality and making continual improvements.<br />
Result Grouping is documented on the <a href="http://wiki.apache.org/solr/FieldCollapsing">Solr Wiki</a>, and you will need a recent<br />
<a href="http://wiki.apache.org/solr/FrontPage#solr_development">nightly build</a> of Solr 4.0-dev to try it out (just make sure it&#8217;s dated after this post).</p>
]]></content:encoded>
			<wfw:commentRss>http://www.lucidimagination.com/blog/2010/09/16/2446/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>[UPDATE] Spatial Search in Apache Lucene and Solr</title>
		<link>http://www.lucidimagination.com/blog/2010/07/20/update-spatial-search-in-apache-lucene-and-solr/</link>
		<comments>http://www.lucidimagination.com/blog/2010/07/20/update-spatial-search-in-apache-lucene-and-solr/#comments</comments>
		<pubDate>Tue, 20 Jul 2010 11:24:34 +0000</pubDate>
		<dc:creator>Grant Ingersoll</dc:creator>
				<category><![CDATA[Lucene]]></category>
		<category><![CDATA[Solr]]></category>
		<category><![CDATA[spatial search]]></category>

		<guid isPermaLink="false">http://www.lucidimagination.com/blog/?p=2218</guid>
		<description><![CDATA[<p>One of the most frequent things I get asked is &#8220;what is the state of spatial in Lucene and Solr?&#8221;  So here is my answer as of today:</p>
<ol>
<li>I just committed <a href="https://issues.apache.org/jira/browse/SOLR-1568">SOLR-1568</a> the other day, which adds automatic filter generation to the various point based Field Types in Solr.  It also has some small refactoring in the underlying Lucene code.  Furthermore, it adds a new LatLonType which can be used to represent latitude/longitude pairs seamlessly.  </li>&#8230;</ol>]]></description>
			<content:encoded><![CDATA[<p>One of the most frequent things I get asked is &#8220;what is the state of spatial in Lucene and Solr?&#8221;  So here is my answer as of today:</p>
<ol>
<li>I just committed <a href="https://issues.apache.org/jira/browse/SOLR-1568">SOLR-1568</a> the other day, which adds automatic filter generation to the various point based Field Types in Solr.  It also has some small refactoring in the underlying Lucene code.  Furthermore, it adds a new LatLonType which can be used to represent latitude/longitude pairs seamlessly.  See <a href="http://wiki.apache.org/solr/SpatialSearch">http://wiki.apache.org/solr/SpatialSearch</a> for the full details on Solr spatial.  Note, this is only available on trunk.  Volunteers to backport to 3.x would be most welcome.</li>
<li>As part of SOLR-1568, it became increasingly clear to me that the Cartesian Tier stuff in Lucene spatial simply does not work as intended for many use cases.  In my review and attempt at fixing the code, it became more than apparent that it only really works for the Western Hemisphere above the equator, e.g. the United States.  It may also work in the Eastern Hemisphere above the equator.  It only really works above the equator because of a miscalculation in the SinusoidalProjector.  See <a href="https://issues.apache.org/jira/browse/LUCENE-2519">LUCENE-2519</a>.  It also does not handle edge cases well at all, such as at the poles or the Prime Meridian and the antimeridian, so if you have that case, then don&#8217;t bother.  I didn&#8217;t fix the SinusoidalProjector because it turned into a very tangled web of broken unit tests.  In <a href="http://www.lucidimagination.com/search/document/c32e81783642df47/spatial_rethinking_cartesian_tiers_implementation">discussions with other developers</a>, we decided the whole tier system (and much of Lucene&#8217;s spatial) should be deprecated/replaced.</li>
</ol>
<p>I believe trunk is now in pretty decent shape for spatial search for applications that need:</p>
<ol>
<li>Sorting by distance</li>
<li>Boosting by distance</li>
<li>Range-query (using Numeric Fields) based bounding box calculations, which should be sufficient for most people</li>
<li>Geohash based calculations</li>
</ol>
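<p>To make the bounding box idea concrete, here is a minimal sketch of how a latitude/longitude box around a point can be computed and then checked with two simple range comparisons.  This is an illustration only, not the code Solr uses: the function names and the Earth radius constant are assumptions, and the sketch deliberately ignores the poles and the antimeridian.</p>

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation


def bounding_box(lat, lon, distance_miles):
    """Return (min_lat, max_lat, min_lon, max_lon) in degrees around a point.

    Assumes the box does not cross a pole or the antimeridian.
    """
    lat_delta = math.degrees(distance_miles / EARTH_RADIUS_MILES)
    # Longitude lines converge toward the poles, so widen the box by 1/cos(lat).
    lon_delta = lat_delta / math.cos(math.radians(lat))
    return (lat - lat_delta, lat + lat_delta, lon - lon_delta, lon + lon_delta)


def in_box(box, lat, lon):
    """Two numeric range checks, one per coordinate."""
    min_lat, max_lat, min_lon, max_lon = box
    return min_lat <= lat <= max_lat and min_lon <= lon <= max_lon


box = bounding_box(40.0, -75.0, 10)
print(in_box(box, 40.05, -75.0))
print(in_box(box, 41.0, -75.0))
```

<p>Because the filter reduces to one range check per coordinate, it maps naturally onto range queries over numeric fields.  A box is only a coarse filter, so points in its corners can be farther away than the requested distance.</p>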
<p>Trunk does not yet have:</p>
<ol>
<li>&#8220;Pseudo&#8221; fields in the result set, so it is not possible to return the distance alongside the stored fields</li>
<li>A tier/tile/grid based approach to filtering.  These approaches are especially helpful in highly dense areas as they can significantly reduce the number of terms that need to be enumerated</li>
<li>Faceting by functions, which can be useful for putting distances into buckets such as walking, biking, driving</li>
</ol>
<p>For a list of all the related Solr/Lucene spatial issues, see <a href="https://issues.apache.org/jira/browse/SOLR-773">SOLR-773</a>.  Again, see <a href="http://wiki.apache.org/solr/SpatialSearch">http://wiki.apache.org/solr/SpatialSearch</a> for a full accounting of what is in Solr and how to use it.</p>
<p>In summary, I think trunk is in pretty decent shape for spatial, as far as Solr is concerned.  Pure Lucene users will see some upheaval in the coming months, but it is for the better.  Testers are needed and patches are welcome.  And, while the tier stuff feels like a step backward, it is clear to me that we have several committers along with many contributors who are very interested in seeing spatial support live and prosper.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.lucidimagination.com/blog/2010/07/20/update-spatial-search-in-apache-lucene-and-solr/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>State of Spatial Support in Apache Solr</title>
		<link>http://www.lucidimagination.com/blog/2010/03/10/state-of-spatial-support-in-apache-solr/</link>
		<comments>http://www.lucidimagination.com/blog/2010/03/10/state-of-spatial-support-in-apache-solr/#comments</comments>
		<pubDate>Wed, 10 Mar 2010 20:53:09 +0000</pubDate>
		<dc:creator>Grant Ingersoll</dc:creator>
				<category><![CDATA[Lucene]]></category>
		<category><![CDATA[Solr]]></category>
		<category><![CDATA[spatial search]]></category>

		<guid isPermaLink="false">http://www.lucidimagination.com/blog/?p=1846</guid>
		<description><![CDATA[<p>I&#8217;ve had quite a few people asking me about the state of geospatial support in Apache Solr lately, so I thought I would give a brief update here.</p>
<p>Much of the functionality behind <a href="https://issues.apache.org/jira/browse/SOLR-773">SOLR-773</a> is now implemented in the trunk version of Solr and is available for <a href="http://lucene.apache.org/solr/version_control.html">check out</a>.  This includes support for several different distance measures (Euclidean, Haversine, etc.) as well as support for sorting by functions (aka sorting by distance).  Do note &#8230;</p>]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve had quite a few people asking me about the state of geospatial support in Apache Solr lately, so I thought I would give a brief update here.</p>
<p>Much of the functionality behind <a href="https://issues.apache.org/jira/browse/SOLR-773">SOLR-773</a> is now implemented in the trunk version of Solr and is available for <a href="http://lucene.apache.org/solr/version_control.html">check out</a>.  This includes support for several different distance measures (Euclidean, Haversine, etc.) as well as support for sorting by functions (aka sorting by distance).  Do note that there are some minor issues left to fix on that one.  See <a href="https://issues.apache.org/jira/browse/SOLR-1297">SOLR-1297</a> for the gotchas there.  There is now also support for several different point based field types.  See <a href="https://issues.apache.org/jira/browse/SOLR-1131">SOLR-1131</a> for more info.</p>
<p>Right now, I&#8217;m working on <a href="https://issues.apache.org/jira/browse/SOLR-1568">SOLR-1568</a>, which will add the last &#8220;major&#8221; piece of needed functionality: spatial filtering based on FieldType.  I&#8217;m getting close to putting up a patch for review, but it will then take a week or two more from there to iterate and commit.</p>
<p>Beyond that, there are some minor things that would be nice to have, but I don&#8217;t think they are showstoppers for the basic spatial use cases (sort, boost, filter by distance).  For those wanting deeper capabilities along the lines of shape intersections, that&#8217;s a bit farther off, unless of course, you have a patch!</p>
<p>As always, feedback welcome!</p>
<p><em>Related link: On-Demand webinar<br />
</em><strong><a href="http://theserversidecom.bitpipe.com/detail/RES/1257457967_42.html&amp;asrc=CL_PRM_Lucid_11_18_09_c&amp;li=252934">From Here to There, You Can Find it  Anywhere: Building  Local/Geo-Search with Apache Lucene and Solr</a></strong></p>
]]></content:encoded>
			<wfw:commentRss>http://www.lucidimagination.com/blog/2010/03/10/state-of-spatial-support-in-apache-solr/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Apache Solr 1.5 on the move with more &#8220;functionality&#8221;</title>
		<link>http://www.lucidimagination.com/blog/2009/12/12/apache-solr-1-5-on-the-move-with-more-functionality/</link>
		<comments>http://www.lucidimagination.com/blog/2009/12/12/apache-solr-1-5-on-the-move-with-more-functionality/#comments</comments>
		<pubDate>Sat, 12 Dec 2009 23:45:26 +0000</pubDate>
		<dc:creator>Grant Ingersoll</dc:creator>
				<category><![CDATA[apache]]></category>
		<category><![CDATA[functions]]></category>
		<category><![CDATA[Lucene]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[Payloads]]></category>
		<category><![CDATA[Solr]]></category>
		<category><![CDATA[spatial search]]></category>
		<category><![CDATA[ZooKeeper]]></category>

		<guid isPermaLink="false">http://www.lucidimagination.com/blog/?p=1404</guid>
		<description><![CDATA[<p>The paint is barely dry on <a href="http://lucene.apache.org/solr">Apache Solr</a> 1.4 and the community is already on the move for Solr 1.5 (which may actually be Solr 2.0, but for now let&#8217;s call it 1.5).</p>
<p>I&#8217;m particularly excited about a few things:</p>
<ol>
<li>Massive scalability capabilities via distributed search, indexing and shard management &#8211; Up until now, Solr has scaled pretty well on the search side (I&#8217;ve seen billion+ document instances and we&#8217;ve benchmarked it at that level too), </li>&#8230;</ol>]]></description>
			<content:encoded><![CDATA[<p>The paint is barely dry on <a href="http://lucene.apache.org/solr">Apache Solr</a> 1.4 and the community is already on the move for Solr 1.5 (which may actually be Solr 2.0, but for now let&#8217;s call it 1.5).</p>
<p>I&#8217;m particularly excited about a few things:</p>
<ol>
<li>Massive scalability capabilities via distributed search, indexing and shard management &#8211; Up until now, Solr has scaled pretty well on the search side (I&#8217;ve seen billion+ document instances and we&#8217;ve benchmarked it at that level too), but the work underway in Solr 1.5 will take it to a whole new level, thanks to the integration of Apache <a href="http://hadoop.apache.org/zookeeper/">ZooKeeper</a> and other distributed technologies.  For those interested, check out the &#8220;cloud&#8221; branch in <a href="https://svn.apache.org/repos/asf/lucene/solr/branches/cloud/">SVN</a>.</li>
<li>Functions, functions, functions!  We&#8217;ve already added a bunch of <a href="http://wiki.apache.org/solr/FunctionQuery">functions</a> (see my <a href="http://www.lucidimagination.com/blog/2009/11/20/fun-with-solr-functions/">earlier post</a>) and I see more on the horizon.  Additionally, I see great value in adding, for lack of a better phrase, aggregating functions to the mix (via <a href="https://issues.apache.org/jira/browse/SOLR-1622">SOLR-1622</a>).  This will allow application designers to do much more sophisticated math across a search result set than what is currently available via the StatsComponent.  In some ways, this can empower business intelligence applications on top of Solr (I realize it is just a small piece of the BI pie) as well as more sophisticated mathematical applications.</li>
<li>Spatial Search!  It&#8217;s funny: a lot of people want spatial search, and Solr could have simply harnessed a really nice existing package (LocalSolr), just as many already do.  But by stepping back and looking at spatial in the context of the bigger picture of things that would be nice to have in Solr (see <a href="https://issues.apache.org/jira/browse/SOLR-773">SOLR-773</a>), the community will not only be able to implement spatial search (by leveraging key pieces of LocalSolr where appropriate), but will also get a whole bevy of other features, including:
<ol>
<li>Sort By Function &#8211; Instead of a one off that sorts solely by distance, why not enable Solr users to sort by any arbitrary function?  I just committed this tonight via <a href="https://issues.apache.org/jira/browse/SOLR-1297">SOLR-1297</a>.</li>
<li>&#8220;Poly&#8221; Field Types &#8211; Thanks to <a href="https://issues.apache.org/jira/browse/SOLR-1131">SOLR-1131</a>, Solr&#8217;s FieldType mechanism can be used to represent multiple underlying fields.  This is especially useful for representing things like points in an <em>n</em>-dimensional space, Cartesian Tiers (zoom levels) and other cool things.  Moreover, it shows the types of abstractions Solr can overlay on the already powerful Apache Lucene to provide even more functionality.</li>
<li>Facet By Function &#8211; Sure, it&#8217;s great to put your distances into buckets, but why not put the result of any function into buckets?  See <a href="https://issues.apache.org/jira/browse/SOLR-1581">SOLR-1581</a>.</li>
<li>Spatial Query Parsers &#8211; aka geocoding &#8211; Parse things like street addresses, etc. and get back appropriate Query instances. See <a href="https://issues.apache.org/jira/browse/SOLR-1568">SOLR-1568</a> and <a href="https://issues.apache.org/jira/browse/SOLR-1578">SOLR-1578.</a></li>
<li>Several different distance functions, including haversine (great circle), Manhattan, and Euclidean (Solr actually now supports all p-norms as distance functions).  See <a href="https://issues.apache.org/jira/browse/SOLR-1302">SOLR-1302</a>.  I even added in the ability to do String distance calculations using Levenshtein (edit), Jaro-Winkler, and n-gram (basically all of the Lucene spellchecker distance measures), as well as any user defined String Distance calculation.</li>
<li>&#8220;pseudo&#8221; fields &#8211; Instead of just hacking the ability to put a distance calculation into the result, why not allow the response to stream out &#8220;fields&#8221; based on things like functions or other user defined values?  See <a href="https://issues.apache.org/jira/browse/SOLR-1298">SOLR-1298</a>.</li>
</ol>
</li>
<li>Field Collapsing &#8211; I haven&#8217;t had time to work on it, but I suspect Field Collapsing will finally make it into 1.5.  Field Collapsing allows Solr to &#8220;roll-up&#8221; similar results much like you see on many Internet search sites that indent results from the same domain.</li>
<li>Payload and Span Query support &#8211; Solr&#8217;s been able to index payloads for some time now, but it still requires a user to hook in their own query parser support.  It would also be really great to see functions that can work on payloads, too. See <a href="https://issues.apache.org/jira/browse/SOLR-1337">SOLR-1337</a> and <a href="https://issues.apache.org/jira/browse/SOLR-1485">SOLR-1485</a>.</li>
</ol>
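<p>For readers less familiar with the distance measures mentioned in the spatial list above, here is a minimal standalone sketch of three of them: the p-norm (which covers Manhattan at p=1 and Euclidean at p=2), haversine, and Levenshtein edit distance.  These are textbook formulations written for illustration; the function names and the default Earth radius are assumptions, and this is not Solr&#8217;s implementation.</p>

```python
import math


def p_norm_distance(a, b, p):
    """Distance between two equal-length points; p=1 is Manhattan, p=2 is Euclidean."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def haversine(lat1, lon1, lat2, lon2, radius=6371.0):
    """Great-circle distance between two lat/lon points given in degrees.

    The result is in the same unit as radius (kilometers by default).
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(h))


def levenshtein(s, t):
    """Edit distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (cs != ct)))  # substitution
        prev = cur
    return prev[-1]


print(p_norm_distance((0, 0), (3, 4), 2))
print(levenshtein("kitten", "sitting"))
```

<p>Treating Manhattan and Euclidean as two settings of the same p-norm is what makes it natural to expose a single parameterized distance function rather than one function per measure.</p>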
<p>Of course, as I always say, &#8220;in open source, you never know where the next good idea is going to come from&#8221;, so I have total faith that the Solr community will come up with a plethora of other great new features, as well as the usual bug fixes, etc.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.lucidimagination.com/blog/2009/12/12/apache-solr-1-5-on-the-move-with-more-functionality/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>

