
July 1, 2009

Virtual words, real data

Posted by David M. Fishman

As virtualization and cloud computing buzz louder, Lucene/Solr open source search is adding a vibe of its own, most recently with the announcement of our strategic partnership with ISYS technologies. A couple of weeks ago, Business Week wrote up how cloud computing will change business; in between discussions of VMware and Amazon’s EC2, it tucked in a reference to Xoopit, “[a startup that] has built a specialized search engine capable of finding bits of information scattered among e-mail systems, sales management programs, blogs, and online news sites.” The cool, not-so-secret part? Xoopit is built with Lucene, delivering hosted search services; I met Bijan Marashi, Xoopit’s CEO, at the San Francisco Bay Area Lucene/Solr Meetup a few weeks back. Cloud-based apps that don’t have search yet can get it with Xoopit; their mail-search service means you don’t have to wonder what folder you put that email in.

A key attribute of virtualization is what you don’t have to deal with: who cares what disk drives or device drivers EC2 uses? Despite the glories of Unix and Linux, countless sysadmins have marched to their virtual deaths trying to solve low-level interface problems that only subtract value. Sure, that zippy new drive could be speedy, but trying to match drivers and firmware levels in an array of disks? (A guy I worked with who ran large-scale database performance benchmarks would work without a file system for huge databases, because he had memorized the names of hundreds of disks and the data each one held, which, owing to the nature of benchmarking, did not change over time. Mercy!) I want storage service, not an exercise in storage anatomy. With virtualization, ta-da! VMware? Same difference (though doing VMware doesn’t mean you’re doing cloud, as Robert Scheier points out in InfoWorld). A big part of what makes it useful is all the things you can stop caring about.

When it comes to finding that file, or that email, or that record, the ISYS File Readers do just that: they read the content. Sure, there are plenty of us (and we know who we are) who have labored mightily over fonts, footers, and formats. The bad news is, great fonts are not what makes great content. Liberating content from its gilded cage of format is a useful proposition, so much so that many commercial enterprise search vendors charge a bundle for their “connectors”. Device drivers are useful, too; but what really unlocks the value is the service they enable, and the applications built on that service. Combining Lucene/Solr with ISYS File Readers creates a powerful service that can overcome the underlying variations and present content and information as a uniform resource.

Put another way, each format and data storage type has its own set of structures and interfaces, none of them standardized. Search with Lucene/Solr virtualizes the data, and creates a standard set of interfaces for operating on it. Once you process your content (in any of dozens of different formats and containers) with the ISYS File Readers, it no longer matters where it is or what it is in, any more than it matters what device driver is on the disk drive in that file system that runs that database somewhere in the EC2 cloud.
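
To make the "uniform resource" idea concrete, here is a minimal sketch in Python. It is an illustration only, not the ISYS File Readers API or the Lucene/Solr API: the `Document` record, the two toy readers, the `READERS` table, `TinyIndex`, and `build_index` are all hypothetical names invented for this example. The point is that once each format-specific reader hands back the same record shape, the index and the search call never need to know what format the content came from.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Set, Tuple


@dataclass
class Document:
    """Format-neutral record: the only shape the index ever sees."""
    doc_id: str
    text: str


# Hypothetical stand-ins for format-specific readers. Real readers would
# parse binary formats; these only handle trivial text cases.
def read_plain(doc_id: str, raw: str) -> Document:
    return Document(doc_id, raw)


def read_email(doc_id: str, raw: str) -> Document:
    # Drop the header block; keep everything after the first blank line.
    _, _, body = raw.partition("\n\n")
    return Document(doc_id, body)


# Hypothetical dispatch table: format tag -> reader function.
READERS: Dict[str, Callable[[str, str], Document]] = {
    "txt": read_plain,
    "eml": read_email,
}


class TinyIndex:
    """Toy inverted index mapping each term to the ids of documents containing it."""

    def __init__(self) -> None:
        self.postings: Dict[str, Set[str]] = {}

    def add(self, doc: Document) -> None:
        for term in doc.text.lower().split():
            self.postings.setdefault(term, set()).add(doc.doc_id)

    def search(self, term: str) -> List[str]:
        return sorted(self.postings.get(term.lower(), set()))


def build_index(sources: Iterable[Tuple[str, str, str]]) -> TinyIndex:
    """Each source is (doc_id, format tag, raw content)."""
    index = TinyIndex()
    for doc_id, fmt, raw in sources:
        index.add(READERS[fmt](doc_id, raw))
    return index
```

A caller would pass something like `build_index([("note-1", "txt", "..."), ("mail-7", "eml", "...")])` and then call `search("budget")` on the result; supporting another format means adding one reader to the table, with no change to the index or to the search call.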



2 Responses to “Virtual words, real data”

  1. Do the ISYS File Readers work on non-Windows platforms?

    July 1, 2009 22:23 — Aaron

  2. ISYS expects to have Linux support within a few weeks.

    July 9, 2009 10:54 — David M. Fishman


Apache Solr, Solr, Apache Lucene, Lucene and their logos are trademarks of the Apache Software Foundation.

© 2011 Lucid Imagination. All rights reserved.
