explicit-semantic-analysis
Lucene: exception - Query parser encountered <EOF> after “some word”
I ran into a problem when reading a dataset that contains special characters and trying to get the concept vector. This is easily solved by escaping the text in the Vectorizer class:
    public ConceptVector vectorize(String text) throws ParseException, IOException {
        // Escape query-syntax characters so the parser treats the input as plain text
        Query query = queryParser.parse(QueryParser.escape(text));
        TopDocs td = searcher.search(query, conceptCount);
        return new ConceptVector(td, indexReader);
    }
Great implementation by the way! Thanks
Source: https://stackoverflow.com/questions/10259907/lucene-exception-query-parser-encountered-eof-after-some-word/10259944
Thanks for using ESA, and even more for your feedback!
`text` is expected to be plain text, without control characters (such as quotes, which combine multiple words into a single token), so I think your solution is correct.
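To make the effect of escaping concrete, here is a minimal standalone sketch of the idea. It is not Lucene's code: the class `EscapeSketch` and its character list are assumptions for illustration, modelled on the special characters of the classic query syntax, and the exact set may differ between Lucene versions. In the real fix, `QueryParser.escape` does this job.

```java
// Hypothetical illustration only; use QueryParser.escape in actual code.
class EscapeSketch {
    // Characters assumed to carry meaning in the query syntax.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&/";

    // Prefix every special character with a backslash so that a parser
    // would read it as literal text rather than as an operator.
    public static String escape(String text) {
        StringBuilder sb = new StringBuilder(text.length() + 8);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // An unbalanced quote like this one is the kind of input that
        // can leave the parser waiting for a closing quote until <EOF>.
        System.out.println(escape("some word\" (draft)"));
        // prints: some word\" \(draft\)
    }
}
```

With the quote escaped, the parser no longer sees the start of a phrase, so it does not run to the end of the input looking for the closing quote.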
Do you want to issue a pull request with the change and a unit test or two? Then your contribution will be carved into stone.