openverse-api
Potentially disable matches across word boundaries or enforce matching whole words only
Problem
Right now it seems like our ES query configuration would match "dog" across word boundaries, for example if an indexed field was __d og__.
Description
Look into how to disable this, whether it is easy to do, and whether it reduces false positives (not sure how to measure this).
Alternatives
Match whole words and partial words, but not across word boundaries, and rank whole-word matches higher than partial matches (this may be the default ES behaviour).
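
To make the desired behaviour concrete, here is a rough pure-Python sketch of this alternative. It is only a toy model and does not use or imitate our actual ES configuration; the helper names `tokens`, `score`, and `crosses_boundary` are hypothetical, and the scores 2/1/0 are arbitrary ranks chosen for illustration.

```python
import re


def tokens(text):
    # Split into lowercase words on non-word characters
    # (a crude stand-in for a word-boundary tokenizer).
    return re.findall(r"\w+", text.lower())


def score(term, text):
    # 2 = whole-word match, 1 = partial match inside a single word,
    # 0 = no match that stays within one word.
    term = term.lower()
    best = 0
    for tok in tokens(text):
        if tok == term:
            return 2
        if term in tok:
            best = 1
    return best


def crosses_boundary(term, text):
    # True when the term only "matches" if word boundaries are ignored,
    # i.e. the kind of hit this issue proposes to exclude.
    squashed = "".join(tokens(text))
    return term.lower() in squashed and score(term, text) == 0


docs = ["d og watcher", "dogs playing", "dog watcher"]
# Whole-word matches first, then partial matches, then non-matches.
ranked = sorted(docs, key=lambda d: score("dog", d), reverse=True)
```

With this sketch, `ranked` puts "dog watcher" ahead of "dogs playing", and "d og watcher" is scored as a non-match, while `crosses_boundary("dog", "d og watcher")` flags it as a boundary-crossing hit.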
Implementation
- [ ] 🙋 I would be interested in implementing this feature.
Do we ever keep the search terms that people enter or a sample of them? It would be interesting to understand how often people enter typos like "d og watcher" vs "chad ogles the cat sitter".
We do not; I don't think we have any analytics presently. That's a great point though!