
Consider adding apostrophe tokenfilter

Open: missinglink opened this issue 4 years ago • 1 comment

As reported in https://github.com/pelias/pelias/issues/847, we can improve fuzzy-matching by applying an apostrophe tokenfilter.

https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-apostrophe-tokenfilter.html

missinglink avatar Mar 02 '20 14:03 missinglink

We're currently removing apostrophe characters in the punctuation filter. The effect of this is to convert mcdonald's => mcdonalds.

I had a play with introducing the apostrophe tokenfilter linked above (and also removing the apostrophe from the punctuation filter). The effect of this is mcdonald's => mcdonald.
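
To make the difference between the two behaviours concrete, here is a minimal Python sketch that emulates them outside Elasticsearch. The function names are hypothetical and this is not the Pelias schema code. It assumes the punctuation filter simply deletes the apostrophe character, and that the apostrophe tokenfilter drops the apostrophe together with everything after it, as the linked Elasticsearch page describes. Treating the typographic apostrophe (U+2019) the same way is also an assumption.

```python
APOSTROPHES = ("'", "\u2019")  # ASCII and typographic apostrophe (assumption)


def strip_apostrophes(token: str) -> str:
    """Emulate the punctuation approach: delete apostrophes, keep the rest."""
    for mark in APOSTROPHES:
        token = token.replace(mark, "")
    return token


def apostrophe_filter(token: str) -> str:
    """Emulate the apostrophe tokenfilter: cut the token at the first apostrophe."""
    for i, ch in enumerate(token):
        if ch in APOSTROPHES:
            return token[:i]
    return token


print(strip_apostrophes("mcdonald's"))  # mcdonalds
print(apostrophe_filter("mcdonald's"))  # mcdonald
```

Neither function alone maps mcdonald's onto both of the other spellings, which is the gap described next.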

What we really need is a method where all three of these forms are considered equal:

mcdonald's
mcdonalds
mcdonald
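
The requirement can be written down as a small check. The sketch below is only an illustration of what "considered equal" would mean, not a proposed fix: `match_key` is a hypothetical helper that cuts the token at the first apostrophe and then drops a single trailing "s". The length guard is an arbitrary assumption.

```python
APOSTROPHES = ("'", "\u2019")  # ASCII and typographic apostrophe (assumption)


def match_key(token: str) -> str:
    """Hypothetical key: cut at the first apostrophe, then drop one trailing 's'."""
    for mark in APOSTROPHES:
        token = token.split(mark, 1)[0]
    # Arbitrary guard so that very short tokens are left untouched.
    if len(token) > 3 and token.endswith("s"):
        token = token[:-1]
    return token


forms = ["mcdonald's", "mcdonalds", "mcdonald"]
keys = {match_key(form) for form in forms}
print(keys)  # {'mcdonald'}
```

A rule this blunt would also shorten names that merely end in "s" (for example paris => pari), so a real analyzer would probably need something more selective.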

missinglink avatar Mar 02 '20 15:03 missinglink