langchainrb
Ability to specify the distance threshold when calling similarity_search
Description
In Discord, someone asked whether we can specify a distance threshold when calling the Vectorsearch#ask method. The need is to return ALL records that meet a relevance-score threshold, as opposed to returning a fixed number (k:) of records.
Tasks
- Explore whether vectorsearch DBs support a distance threshold parameter. If yes, we should implement it. If no, we should not, because the filtering can then be done on the client side.
- Modify the vectorsearch#ask(), vectorsearch#similarity_search_by_vector(), and vectorsearch#similarity_search() methods to accept a distance_gte: ("distance greater than or equal") parameter to set this threshold.
Note: We might need to normalize/standardize the distance scores that various vectorsearch engines return.
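For discussion, here is a rough sketch of what client-side thresholding with score normalization could look like. Everything in it is hypothetical: the Hit struct, the normalize_score and filter_by_threshold helpers, the metric names, and the conversion formulas are illustrative assumptions and not part of langchainrb or any vector store's API. Only the distance_gte: keyword mirrors the parameter proposed above.

```ruby
# Hypothetical result record; the real return types differ per vector store.
Hit = Struct.new(:id, :score)

# Convert a raw engine value into a "higher is more relevant" score.
# The metric names and formulas are assumptions made for illustration.
def normalize_score(raw, metric:)
  case metric
  when :cosine_distance then 1.0 - raw             # distance 0 maps to score 1
  when :euclidean_distance then 1.0 / (1.0 + raw)  # distance 0 maps to score 1
  when :similarity then raw                        # already higher-is-better
  else raise ArgumentError, "unknown metric: #{metric}"
  end
end

# Keep every hit whose normalized score meets the threshold,
# instead of returning a fixed number of k results.
def filter_by_threshold(hits, metric:, distance_gte:)
  hits.select { |hit| normalize_score(hit.score, metric: metric) >= distance_gte }
end

hits = [Hit.new("a", 0.1), Hit.new("b", 0.5), Hit.new("c", 0.9)]
relevant = filter_by_threshold(hits, metric: :cosine_distance, distance_gte: 0.5)
puts relevant.map(&:id).inspect
```

Normalizing first means a single threshold keeps the same meaning whether the engine reports a distance (lower is better) or a similarity (higher is better). If an engine supports a threshold natively, the filtering would happen server-side and only the normalization step would remain relevant.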
I believe this is the same issue: https://github.com/patterns-ai-core/langchainrb/issues/249. I will probably close the earlier one, since this one has a better description.