Topics for Study | Lucid Apache Solr/Lucene Developer Certification

To prepare for the Lucid Solr/Lucene Developer Certification exam, you should be comfortable with the following topics:

  • Lucene Background
  • Indexing
  • Searching
  • Debugging Solr
  • General Solr knowledge
  • Architecting a Deployment
  • General search and web application environment topics

The concepts covered in each topic area are detailed below.

Lucene Background

  • Understand Lucene scoring methods such as TF-IDF and how to debug Lucene’s output.

  • Understand Lucene payloads.

  • Understand merge algorithms used by Lucene.

  • Be familiar with the Lucene index file format.

  • Understand how Lucene uses segments and merging, and how this impacts the performance of an application.

  • Be familiar with the different types of queries, such as term queries, phrase queries, and wildcard queries.

  • Be familiar with the different options available when defining a field, such as storage, indexing, analysis, vectors, positions, and frequencies.

  • Know what Lucene options can be tuned through Solr.
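
As a study aid for the TF-IDF topic above, here is a minimal Python sketch of the two factors as they are commonly described for Lucene's classic default similarity: a square-root term frequency and a logarithmic inverse document frequency. The function names are hypothetical, and the score deliberately treats boosts, length norms, coord, and queryNorm as 1.0, so it will not match the numbers in real debug output.

```python
import math


def tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw count in the document."""
    return math.sqrt(freq)


def idf(doc_freq: int, num_docs: int) -> float:
    """Inverse document frequency: terms found in fewer documents weigh more."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def simplified_term_score(freq: float, doc_freq: int, num_docs: int) -> float:
    """Single-term score with every other factor assumed to be 1.0.

    Assumption: boosts, norms, coord, and queryNorm are ignored here.
    """
    return tf(freq) * idf(doc_freq, num_docs) ** 2


if __name__ == "__main__":
    # A term that occurs 4 times in a document and appears in 9 of 10 documents.
    print(simplified_term_score(freq=4, doc_freq=9, num_docs=10))
```

Comparing a rare term with a common one in this sketch shows why rare terms dominate a score, which is the intuition needed when reading an explain output.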

     

Indexing

  • Understand the advantages and disadvantages of distributed indexing strategies, and how they affect performance and capabilities of an application.

  • Know how to configure indexing so that faceting is available.

  • Be familiar with indexing best practices, such as batching adds, multi-threaded adds, and using a streaming update server.

  • Understand how indexing best practices can affect searching, highlighting, faceting, and sorting.

  • Be familiar with the capabilities of Solr Cell and the parameters involved.

  • Understand the DataImportHandler configuration file format.

  • Be familiar with available DataImportHandler entity processors and transformers.

  • Know how to debug the DataImportHandler.

  • Understand the parameters and trade-offs of various commit strategies.

  • Be familiar with the update processor chain.

  • Understand use cases for full builds and incremental builds.

  • Be able to recognize when a full rebuild is necessary.

  • Understand index writer configuration.

  • Be familiar with the various update request handlers, such as XML, CSV, and so on, and their default URLs.
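
To illustrate the "batching adds" practice listed above, the following Python sketch groups documents into batches and serializes each batch as one XML add message (an add element containing doc elements with named field elements). The helper names, the sample documents, and the batch size are hypothetical, and the sketch only builds the messages; it does not send them to a server.

```python
import xml.etree.ElementTree as ET


def chunk(docs, size):
    """Yield successive batches of at most `size` documents."""
    for start in range(0, len(docs), size):
        yield docs[start:start + size]


def build_add_message(docs):
    """Serialize one batch of documents as a single XML add message."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(value)  # ElementTree escapes &, <, and > on output
    return ET.tostring(add, encoding="unicode")


# Hypothetical sample data: five small documents, sent two per message.
docs = [{"id": i, "title": f"Document {i} & more"} for i in range(5)]
messages = [build_add_message(batch) for batch in chunk(docs, 2)]

if __name__ == "__main__":
    for message in messages:
        print(message)
```

Sending one message per batch rather than one per document reduces request overhead, and letting the XML library escape special characters avoids the markup errors mentioned under the general background topics.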

 

Searching

  • Understand security filters.

  • Understand how an analyzer works.

  • Be familiar with the syntax for the Lucene query parser.

  • Be familiar with the syntax for the Dismax query parser.

  • Be familiar with common query parameters.

  • Understand how Solr's cache is used by features such as filters and faceting.

  • Understand the local parameter syntax.

  • Know what kinds of query parsers are available to Solr.

  • Be familiar with the best usage for each type of query parser.

  • Be familiar with use cases for faceting.

  • Understand the parameters necessary to use search faceting.

  • Understand faceting algorithms as implemented in Solr.

  • Be familiar with how various features such as searching, faceting, sorting, caching, and warming affect memory requirements in Solr.

  • Know when to use filters instead of modifying the main query.

  • Be familiar with the parameters necessary to use highlighting.

  • Understand Solr spatial search, including its field types, functions, and request parameters.

  • Understand the use cases for field collapsing.

  • Be familiar with the parameters involved in field collapsing.

  • Understand how Solr implements field collapsing.

  • Be familiar with Solr's boost function, and how it can be used to affect the position of a returned result based on factors such as recency, popularity, price, and so on.

  • Understand boosting based on proximity.

  • Be familiar with the syntax used to find the time elapsed between date values.

  • Be familiar with the search request handler, and how it can be used with search components.

  • Be able to list the built-in search components included with Solr, such as elevated queries, faceting, highlighting, “more like this”, and spellcheck.

  • Be able to list the built-in response writers, such as XML, JSON, Java, and PHP.

  • Understand function queries and how they can be used in a search.
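
Several of the topics above (common query parameters, filters, and faceting) meet in a single search request. The Python sketch below assembles such a request URL from the q, fq, rows, wt, facet, and facet.field parameters. The function name, the host, the path, and the field names are assumptions for illustration only, and the code builds the URL without contacting a server.

```python
from urllib.parse import urlencode


def build_select_url(base_url, user_query, filters=(), facet_fields=(), rows=10):
    """Build a search URL that keeps restrictions in fq, separate from q."""
    params = [("q", user_query), ("rows", str(rows)), ("wt", "json")]
    # Each filter goes in its own fq parameter instead of being added to q.
    params += [("fq", f) for f in filters]
    if facet_fields:
        params.append(("facet", "true"))
        params += [("facet.field", f) for f in facet_fields]
    return base_url + "?" + urlencode(params)


# Hypothetical host, path, and field names.
url = build_select_url(
    "http://localhost:8983/solr/select",
    "title:lucene",
    filters=["inStock:true"],
    facet_fields=["category"],
)

if __name__ == "__main__":
    print(url)
```

Keeping the restriction in fq rather than in q leaves the relevance score of the main query unchanged and lets the filter be cached and reused independently of the query text.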



Debugging Solr

  • Understand the debug parameters provided by Solr, and how they affect output.

  • Be familiar with the debugging tools provided with Solr, such as analysis.jsp, the schema browser, the Luke request handler, and the stats page.

 

General Solr Knowledge

  • Understand what security options are available within Solr, and how to configure them.

  • Be familiar with the various configuration files needed to run Solr, such as solr.xml, solrconfig.xml, and schema.xml.

  • Know how to deploy Solr to an existing web container.

  • Know how to propagate environment variables through to a Solr configuration.

  • Know how to tell Solr to store data in a non-default location.

  • Understand the field types built into a standard Solr installation, and their options.

  • Understand the difference between query analyzers and index analyzers, and how they relate to each other in the context of an application.

  • Be familiar with techniques Solr uses to perform analysis.

  • Be familiar with best practices related to search, such as domain modeling and knowing when to use features such as omitNorms.

  • Know when to use index boosting rather than query boosting, and vice versa.

  • Understand how to add custom code to the Solr classpath.

  • Understand Solr cache settings and how they impact performance.

  • Understand why dynamic fields exist, and when you should use them.

  • Be familiar with the Solr external API.

  • Be familiar with the various Solr client libraries, and how they can be used to build an application.

  • Know where to find the Solr logs, and how they can be used in troubleshooting an application.

  • Be familiar with various relevancy manipulation strategies.

  • Understand the "cost" of a commit, and how it affects an application.

  • Be familiar with Solr plug-in hooks and capabilities.

  • Be familiar with the techniques necessary for writing a Solr plug-in.

  • Understand the parameters related to replicating a Solr server.

  • Understand the failure modes of various components, such as replication and the DataImportHandler.

  • Know what features are available in various Solr releases.

  • Know where various components can be found within the Solr directory structure.

  • Understand the core Solr admin APIs.

  • Be familiar with the Solr source code.

  • Be familiar with the configuration options available in solrconfig.xml, and where to find them.

  • Know how to reference external file resources from within Solr's configuration.

  • Understand analysis techniques for both Western languages and Asian scripts.

 

Architecting a Deployment

  • Know how to set up distributed search so that a single request can be processed by multiple Solr instances.

  • Understand the organization of a cluster of Solr servers.

  • Know when it is appropriate to use distributed search.

  • Be familiar with use cases that call for replication, and the best practices involved.

  • Final exam questions and answers are created and rated according to what developers need to know.



General Search and Web Application Environment Topics

  • Know how to execute a Java application from a command line environment.

  • Know how to install and test a Java development environment.

  • Know how to create, compile, and run a Java application.

  • Understand and be able to manage JNDI initialization parameters.

  • Understand Java configurations, such as garbage collection options and memory allocation, and how they can impact performance.

  • Know how to use techniques such as debugging, memory tools, or examining stack traces or GC logs to troubleshoot Java applications.

  • Understand how Java libraries are packaged and deployed.

  • Be familiar with configuring a web container in order to perform tasks such as changing port numbers.

  • Understand performance monitoring techniques such as measuring query speed or query throughput.

  • Be familiar with network protocols such as HTTP.

  • Be familiar with common application server internals, such as classloader hierarchies and logging configurations.

  • Be familiar with XML, and be able to troubleshoot issues such as markup errors or unescaped special characters.

  • Be comfortable with basic administrative tasks for a *nix system, such as installing packages and managing file permissions.

  • Understand how Solr interacts with various file systems such as ext*, HFS Plus, and NTFS, and how those interactions can impact performance.

  • Understand the differences between various operating systems, filesystems, and Java virtual machines, and how they can impact performance.

  • Understand general information retrieval techniques, such as inverted indexes, calculating relevance, and common techniques for implementing search.
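
The inverted index mentioned in the last topic above can be sketched in a few lines of Python: each term maps to the set of documents that contain it, and a multi-term query intersects those sets. This is a toy illustration with hypothetical names and whitespace tokenization, not how Lucene stores or searches its index.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def search_all(index, query):
    """Return the ids of documents that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


# Hypothetical sample documents.
docs = {
    1: "Apache Solr search",
    2: "Apache Lucene library",
    3: "Solr uses Lucene",
}
index = build_inverted_index(docs)

if __name__ == "__main__":
    print(search_all(index, "solr lucene"))
```

Looking up a term costs one dictionary access regardless of how many documents exist, which is the property that makes the inverted index the standard structure for full-text search.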

 

Apache Solr, Solr, Apache Lucene, Lucene and their logos are trademarks of the Apache Software Foundation.

© 2012 Lucid Imagination. All rights reserved.