Accessing words around a positional match in Lucene
From time to time, users on the Lucene mailing list ask a variant of the following question:
Given a term match in a document, what’s the best way to get a window of words around that match?
Getting a window of words around a match is useful for many tasks, including:
- Highlighting (although I’d recommend using Lucene’s Highlighter package for that)
- Co-occurrence analysis
- Sentiment analysis
- Question answering
Unfortunately, …