trieve
feature: add typo tolerance to search
Description
Typo tolerance is a must-have for search. We need to implement it in a standard way.
- [ ] research approach to typo tolerance
- [ ] implement typo tolerance behind a request payload flag on the search request; the flag defaults to false
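
As a rough illustration of the opt-in flag described in the second task, here is a minimal sketch in Python. The field name `typo_tolerance`, the `SearchRequest` type, and the `parse_search_request` helper are all hypothetical and are not taken from the server's actual API; the only point is that an omitted flag resolves to false.

```python
from dataclasses import dataclass


@dataclass
class SearchRequest:
    """Hypothetical search payload; field names are assumptions."""
    query: str
    typo_tolerance: bool = False  # opt-in: off unless the caller asks for it


def parse_search_request(payload: dict) -> SearchRequest:
    """Build a SearchRequest from a decoded JSON body.

    A missing `typo_tolerance` key falls back to False, so existing
    clients keep their current behavior.
    """
    return SearchRequest(
        query=payload["query"],
        typo_tolerance=bool(payload.get("typo_tolerance", False)),
    )
```

Defaulting to false keeps the change backward compatible: callers have to send the flag explicitly before any query rewriting happens.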
Target(s)
server
Requirement to close
PR gets merged which adds typo tolerance
Community channels
Matrix is preferred. Reach out on Discord or Matrix for further assistance.
The starting point here would be a worker that backfills a pg table containing dataset <-> word mappings. This can then be used to construct a dictionary of words to spell check from.
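
To make that idea concrete, here is a small in-memory sketch in Python. It stands in for the Postgres table with a nested dict of per-dataset word counts, and it uses plain Levenshtein distance for the spell check; both choices are assumptions for illustration, not the approach the project has settled on. The names `backfill_word_counts`, `edit_distance`, and `correct` are hypothetical.

```python
import re
from collections import defaultdict


def backfill_word_counts(chunks):
    """Build dataset -> word -> count from (dataset_id, text) pairs.

    Stand-in for the worker that would fill the dataset <-> word table.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for dataset_id, text in chunks:
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            counts[dataset_id][word] += 1
    return counts


def edit_distance(a, b):
    """Levenshtein distance via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def correct(word, dictionary, max_distance=2):
    """Return the closest dictionary word, or the input if none is close.

    Ties on distance are broken by higher frequency, then alphabetically.
    """
    if word in dictionary:
        return word
    best = None
    for candidate, freq in dictionary.items():
        d = edit_distance(word, candidate)
        if d <= max_distance:
            key = (d, -freq, candidate)
            if best is None or key < best[0]:
                best = (key, candidate)
    return best[1] if best else word
```

Scanning every dictionary word per query term is linear in the dictionary size, so a real implementation would likely need an index (for example a BK-tree or n-gram lookup) instead of this brute-force loop.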
https://github.com/vtempest/ai-research-agent/blob/master/data/misspelled-typos-8k.json
I've been working on this issue for 3 weeks and compiled this list of typos from 3 well-maintained sources into a single list of 8k pairs. I've been messaging you about integrating it however you'd like. Instead you ignore and block. Sorry for the confusion, as I listed contributing to this project as open source. There is no reason to be such a petty tyrant.