February 16, 2010

Lucene and Logs: Update

Posted by David M. Fishman

A couple more notes on this subject since the Webinar a couple of weeks ago:

Steve Arnold of Beyond Search asks in a blog post:

…the notion of integrating log files is a good one but I wondered how long it takes to suck big log files, determine deltas, and then update the indexes.

We’ve offered some of the information from the Webinar in a case study we’ve posted about our work with Boomi:

The logging-and-searching service is characterized by frequent commits to make the data available for search; every 5 seconds or 10,000 transaction messages. … There are between two to ten million log transaction generated daily and each may trigger two or more Solr entries. Boomi maintains a rolling 30-day record of log entries.
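
To make that commit policy concrete, here is a minimal sketch of a batcher that flushes when either threshold is reached. It assumes the "5 seconds or 10,000 messages" rule means "whichever comes first"; the class name `CommitBatcher`, the `flush` callback, and the injectable `clock` are hypothetical names for illustration, not anything from Boomi's implementation or the Solr API.

```python
import time
from typing import Callable, List, Optional


class CommitBatcher:
    """Buffer documents and flush when max_docs are buffered or max_age seconds pass.

    Illustrative only: `flush` stands in for whatever sends a batch to the
    index and commits it.
    """

    def __init__(self, flush: Callable[[List[dict]], None],
                 max_docs: int = 10_000, max_age: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._flush = flush
        self._max_docs = max_docs
        self._max_age = max_age
        self._clock = clock
        self._buffer: List[dict] = []
        self._first_at: Optional[float] = None  # arrival time of oldest buffered doc

    def add(self, doc: dict) -> None:
        if not self._buffer:
            self._first_at = self._clock()
        self._buffer.append(doc)
        self._maybe_flush()

    def tick(self) -> None:
        """Call periodically so a quiet stream is still flushed on time."""
        self._maybe_flush()

    def _maybe_flush(self) -> None:
        if not self._buffer:
            return
        too_many = len(self._buffer) >= self._max_docs
        too_old = self._clock() - self._first_at >= self._max_age
        if too_many or too_old:
            batch, self._buffer = self._buffer, []
            self._flush(batch)
```

In use, `add()` would be called for each incoming log message and `tick()` from a timer, so a trickle of messages still becomes searchable within the age limit.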

Not to be outdone, there’s some interesting new input on using Solr for this kind of application from Symplicity, an integrator that builds government and university applications. Its Solr credentials include fbo.gov, a site that searches business opportunities within the Federal government through the General Services Administration:

For a while we used a commercial solution to centralize and search our logs, but they wanted to charge us tens of thousands of dollars for just one gigabyte/day more of indexed data. So I said forget it, I’ll write my own solution!

We already use Solr for some of our other backend searching systems, so I came up with an idea to index all of our logs to Solr. I wrote a daemon in perl that listens on the syslog port, and pointed every single system’s syslog to forward to this single server. From there, this daemon will write to a Solr indexing server after parsing them into fields, such as date/time, host, program, pid, text, etc. I then wrote a cool javascript/ajax web front end for Solr searching, and bam. Real time searching of all of our syslogs from a web interface, for no cost!
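
The parsing step described above can be sketched roughly as follows. The original daemon was written in Perl and its code is not shown, so this is a Python illustration under assumptions: the input is a traditional BSD-style syslog line, and the function names `parse_syslog_line` and `to_solr_payload` and the field name `timestamp` are hypothetical. Listening on the syslog port and posting the payload to a Solr update handler are left out.

```python
import json
import re
from typing import List, Optional

# Assumed line shape: "<PRI>Mon DD HH:MM:SS host program[pid]: text",
# where the <PRI> prefix and the [pid] part are both optional.
SYSLOG_RE = re.compile(
    r"^(?:<\d+>)?"
    r"(?P<timestamp>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<program>[^\s\[:]+)"
    r"(?:\[(?P<pid>\d+)\])?"
    r": ?(?P<text>.*)$"
)


def parse_syslog_line(line: str) -> Optional[dict]:
    """Split one syslog line into fields; return None if it does not match."""
    match = SYSLOG_RE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # pid is optional in syslog lines; store it as an int when present.
    fields["pid"] = int(fields["pid"]) if fields["pid"] else None
    return fields


def to_solr_payload(docs: List[dict]) -> str:
    """Serialize parsed lines as a JSON array of documents for indexing."""
    return json.dumps(docs)
```

For example, `parse_syslog_line("<34>Feb 16 10:15:02 web01 sshd[4242]: Accepted publickey for deploy")` yields a document with `host` set to `web01`, `program` to `sshd`, and `pid` to `4242`. A real schema would likely also convert the timestamp into a date type the index can sort and range-query on.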

