Lucid Imagination Technical White Paper: Indexing Text & HTML Files with Solr

Indexing Text and HTML Files with Solr, the Lucene Search Server
A Lucid Imagination Technical Tutorial

Apache Solr is a popular, fast open source enterprise search platform that uses Lucene as its core search engine. Solr's major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and complex queries. Solr is highly scalable, providing distributed search and index replication, and it powers the search and navigation features of many of the world's largest internet sites.

In the past, the examples available for learning Solr covered only strictly formatted XML and database records. This tutorial provides clear, step-by-step instructions for a more common use case: how to index local text files, local HTML files, and remote HTML files. It is intended for those who have already worked through the Solr Tutorial or equivalent. Familiarity with HTML and a terminal command line is all that is required; no formal experience with Java or other programming languages is needed.

System requirements for this tutorial are the same as for the Startup Tutorial: UNIX, Cygwin (Unix on Windows), or Mac OS X; Java 1.5; disk space; permission to run applications; and access to content.

Apache Solr, Solr, Apache Lucene, Lucene and their logos are trademarks of the Apache Software Foundation.

© 2012 Lucid Imagination. All rights reserved.