feature: add typo tolerance to search

Open skeptrunedev opened this issue 1 year ago • 1 comments

Description

Typo tolerance is a must-have for search. We need to implement it in a standard way.

  • [ ] research approach to typo tolerance
  • [ ] implement typo tolerance behind a flag in the search request payload that defaults to false
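
As a rough illustration of the second item, an opt-in flag could look like the sketch below. The `SearchRequest` type, the `parse_search_request` helper, and the `typo_tolerance` field name are all hypothetical; none of them come from the existing server code.

```python
from dataclasses import dataclass


@dataclass
class SearchRequest:
    """Hypothetical search request payload, not the server's real type."""
    query: str
    # Assumed field name; off by default so existing clients see no change.
    typo_tolerance: bool = False


def parse_search_request(payload: dict) -> SearchRequest:
    """Build a request from a decoded JSON body, defaulting the flag to False."""
    return SearchRequest(
        query=payload["query"],
        typo_tolerance=bool(payload.get("typo_tolerance", False)),
    )
```

Requests that omit the flag keep today's behavior, so the feature stays strictly opt-in.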

Target(s)

server

Requirement to close

PR gets merged which adds typo tolerance

Community channels

Matrix is preferred. Reach out on Discord or Matrix for further assistance.

skeptrunedev avatar Aug 02 '24 06:08 skeptrunedev

The starting point here would be a worker that backfills a pg table containing dataset <-> word mappings. This can then be used to construct a dictionary of words to spell check from.
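
A minimal sketch of that idea, assuming an in-memory dictionary as a stand-in for the pg table and plain Levenshtein distance for matching. All names here (`dataset_words`, `backfill`, `correct_query`, the `max_distance` cutoff) are hypothetical and only meant to show the shape of the approach:

```python
import re
from collections import Counter

# Hypothetical in-memory stand-in for the pg table of dataset <-> word mappings.
dataset_words: dict[str, Counter] = {}


def backfill(dataset_id: str, chunks: list[str]) -> None:
    """Record every word seen in a dataset's chunks, with its frequency."""
    words = dataset_words.setdefault(dataset_id, Counter())
    for chunk in chunks:
        words.update(re.findall(r"[a-z]+", chunk.lower()))


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


def correct_word(dataset_id: str, word: str, max_distance: int = 2) -> str:
    """Return the closest dictionary word, or the input if none is close enough."""
    words = dataset_words.get(dataset_id, Counter())
    word = word.lower()
    if not words or word in words:
        return word
    # Prefer the smallest distance, then the most frequent word.
    best = min(words, key=lambda w: (edit_distance(word, w), -words[w]))
    return best if edit_distance(word, best) <= max_distance else word


def correct_query(dataset_id: str, query: str) -> str:
    """Spell check each whitespace-separated term against the dataset dictionary."""
    return " ".join(correct_word(dataset_id, term) for term in query.split())
```

Scanning the whole dictionary per term is only workable for small vocabularies; a real implementation would likely need an index over the words.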

aaryanpunia avatar Aug 02 '24 19:08 aaryanpunia

https://github.com/vtempest/ai-research-agent/blob/master/data/misspelled-typos-8k.json

I've been working on this issue for 3 weeks and compiled this list of typos from 3 well-updated sources into one list of 8k pairs in total. I've been messaging you about integrating it however you'd like. Instead, you ignore and block me. Sorry for the confusion, as I listed contributing to this project as open source. There is no reason to be such a petty tyrant.

vtempest avatar Aug 20 '24 03:08 vtempest