Consider adding apostrophe tokenfilter
As reported in https://github.com/pelias/pelias/issues/847, we can improve fuzzy-matching by applying an apostrophe tokenfilter.
https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-apostrophe-tokenfilter.html
We're currently removing apostrophe characters in the punctuation filter.
The effect of this is to convert mcdonald's => mcdonalds.
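To make the current behaviour concrete, here is a rough Python model of a character filter that deletes apostrophes. It is only a sketch: `strip_apostrophes` is a made-up name for this example, the real filter runs inside Elasticsearch, and it presumably handles other punctuation as well.

```python
def strip_apostrophes(text: str) -> str:
    """Hypothetical stand-in for the apostrophe part of the punctuation filter."""
    # Assumption: only the ASCII apostrophe is handled in this sketch.
    return text.replace("'", "")


for name in ["mcdonald's", "mcdonalds", "mcdonald"]:
    print(name, "=>", strip_apostrophes(name))
```

With this approach the possessive form and the plain plural-looking form end up as the same token, but the bare form "mcdonald" stays different.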
I had a play with introducing the apostrophe tokenfilter linked above (and also removing the apostrophe from the punctuation filter).
The effect of this is mcdonald's => mcdonald.
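As a rough illustration of that behaviour, the Elasticsearch docs describe the apostrophe tokenfilter as stripping all characters after an apostrophe, including the apostrophe itself. The Python below mimics that on a single token. The name `apostrophe_filter` is hypothetical and this is not the Lucene implementation, just a sketch that assumes an ASCII apostrophe.

```python
def apostrophe_filter(token: str) -> str:
    """Hypothetical model: drop the first apostrophe and everything after it."""
    # str.partition returns (before, separator, after); element 0 is the text
    # before the first apostrophe, or the whole token if there is none.
    return token.partition("'")[0]


for name in ["mcdonald's", "mcdonalds", "mcdonald"]:
    print(name, "=>", apostrophe_filter(name))
```

Here the possessive form collapses onto "mcdonald", but "mcdonalds" (no apostrophe) passes through untouched, so it no longer matches.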
What we really need is a method where all three of these compositions are considered equal:
mcdonald's
mcdonalds
mcdonald
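A small check along these lines could help when evaluating candidate analyzers. It is a sketch in Python rather than a real Elasticsearch test: both normalisers are simplified, hypothetical models of the two approaches described in this issue (deleting the apostrophe, and truncating at the apostrophe), and the helper names are made up for the example.

```python
NAMES = ["mcdonald's", "mcdonalds", "mcdonald"]


def strip_apostrophes(token: str) -> str:
    # Simplified model of deleting apostrophes via the punctuation filter.
    return token.replace("'", "")


def apostrophe_filter(token: str) -> str:
    # Simplified model of the apostrophe tokenfilter: cut at the apostrophe.
    return token.partition("'")[0]


def distinct_tokens(normalise) -> set:
    """Return the set of tokens produced for the three spellings."""
    return {normalise(name) for name in NAMES}


for label, normalise in [
    ("strip apostrophes", strip_apostrophes),
    ("apostrophe tokenfilter", apostrophe_filter),
]:
    tokens = distinct_tokens(normalise)
    verdict = "all equal" if len(tokens) == 1 else "not all equal"
    print(f"{label}: {sorted(tokens)} ({verdict})")
```

Under these assumptions neither approach reduces the three spellings to a single token, which is the gap described above.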