Potentially disable matches across word boundaries or enforce matching whole words only

Open sarayourfriend opened this issue 2 years ago • 2 comments

Problem

Right now it seems like our Elasticsearch (ES) query configuration would match "dog" across word boundaries, for example if an indexed field contained "d og".
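To make the three cases concrete, here is a minimal plain-Python sketch (not ES code, and not how ES analyzes text) that distinguishes a whole-word match, a partial match inside a single word, and a match that only exists across a word boundary. The function name `classify_match` and the label strings are made up for illustration.

```python
import re


def classify_match(term: str, text: str) -> str:
    """Classify how `term` occurs in `text` (case-insensitive).

    Hypothetical helper for illustration only; it mimics the three
    cases discussed in this issue, not Elasticsearch's real analysis.
    """
    term_l, text_l = term.lower(), text.lower()

    # Whole word: the term is delimited by word boundaries on both sides.
    if re.search(rf"\b{re.escape(term_l)}\b", text_l):
        return "whole_word"

    # Partial word: the term appears inside one whitespace-delimited token.
    if any(term_l in token for token in text_l.split()):
        return "partial_word"

    # Cross-boundary: the term only appears once whitespace is removed.
    if term_l in "".join(text_l.split()):
        return "cross_boundary"

    return "no_match"


if __name__ == "__main__":
    for sample in ["a dog watcher", "hotdogs", "d og watcher", "a cat sitter"]:
        print(f"{sample!r}: {classify_match('dog', sample)}")
```

The "cross_boundary" case is the one this issue proposes to disable.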

Description

Look into how to disable this, whether it's easy to do, and whether it reduces false positives (not sure how to measure this).

Alternatives

Match whole words and partial words, but not across word boundaries, and rank whole-word matches higher than partial matches (this may be the default ES behaviour).
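One possible shape for this alternative, sketched as a Python dict in ES query DSL. This is not our current query: the field name `title`, the boost value, and the helper `build_query` are all assumptions for illustration. It combines a boosted `match` clause (whole analyzed tokens) with a `wildcard` clause (partial matches). On an analyzed `text` field, a term-level query such as `wildcard` is evaluated against individual tokens, so it should not match across word boundaries.

```python
from typing import Any


def build_query(term: str, field: str = "title") -> dict[str, Any]:
    """Build a bool query that ranks whole-word matches above partial ones.

    Hypothetical sketch: `field` and the boost value are placeholders,
    not values taken from the Openverse configuration.
    """
    return {
        "bool": {
            "should": [
                # Whole-word match on analyzed tokens, boosted so it ranks higher.
                {"match": {field: {"query": term, "boost": 2.0}}},
                # Partial match within a single token. The term is lowercased
                # on the assumption that the field's analyzer lowercases tokens.
                # Wildcard characters (* and ?) in user input are not escaped
                # here, and leading wildcards can be slow on large indices.
                {"wildcard": {field: {"value": f"*{term.lower()}*"}}},
            ],
            # Require at least one of the two clauses to match.
            "minimum_should_match": 1,
        }
    }


if __name__ == "__main__":
    import json

    print(json.dumps(build_query("dog"), indent=2))
```

Whether the extra `wildcard` clause is worth its cost, compared with relying on the analyzer alone, would need testing against real queries.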

Implementation

  • [ ] 🙋 I would be interested in implementing this feature.

sarayourfriend • Aug 26 '22 19:08

Do we ever keep the search terms that people enter or a sample of them? It would be interesting to understand how often people enter typos like "d og watcher" vs "chad ogles the cat sitter".

rwidom • Aug 27 '22 12:08

We do not; I don't think we have any analytics at present. That's a great point though!

AetherUnbound • Aug 29 '22 17:08