
Define a threshold for cosineSimilarity

Open om35 opened this issue 2 years ago • 3 comments

Hello, I want to filter my results so that only those satisfying the following condition are returned: 1.0 + cosineSimilarity(params['query_vector'], 'vector') > 0.5

Is it possible to do this?

om35 avatar Jul 27 '23 11:07 om35

Hi @om35! Thanks for your interest in the project!

I think the easiest way to do score cut-offs is on the client, inside one of the search_demo_*.py scripts. Which search engine are you using?
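A client-side cut-off could look roughly like the sketch below. It assumes the usual Elasticsearch response shape (`response["hits"]["hits"]`, each hit carrying a `_score`); the helper name `filter_hits_by_score` and the sample response are made up for illustration and are not part of the demo scripts.

```python
from typing import Any, Dict, List


def filter_hits_by_score(response: Dict[str, Any], threshold: float) -> List[Dict[str, Any]]:
    """Keep only the hits whose _score is strictly above the threshold.

    Hypothetical helper: assumes the standard Elasticsearch response layout.
    """
    hits = response.get("hits", {}).get("hits", [])
    return [
        hit for hit in hits
        if hit.get("_score") is not None and hit["_score"] > threshold
    ]


# Made-up response, only to show the shape the helper expects.
sample_response = {
    "hits": {
        "hits": [
            {"_id": "a", "_score": 1.82},
            {"_id": "b", "_score": 1.41},
            {"_id": "c", "_score": 0.37},
        ]
    }
}

kept = filter_hits_by_score(sample_response, threshold=0.5)
print([hit["_id"] for hit in kept])  # ['a', 'b']
```

Keep in mind that a script returning `1.0 + cosineSimilarity(...)` produces scores between 0 and 2, so a cut-off on the raw cosine value of 0.5 corresponds to a `_score` threshold of 1.5.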

DmitryKey avatar Jul 28 '23 07:07 DmitryKey

Hello @DmitryKey, thank you for your response. I use search_demo_elastic.py with search_method == 'es-vanilla':

query["bool"]["should"].append({ "script_score": { "query": {"match_all": {}}, "script": { "source": "(1.0 + cosineSimilarity(params['query_vector'], 'vector'))", "params": {"query_vector": vector_field} } } })

However, the issue is that it returns a set of documents ranked by relevance, and the objective is to eliminate articles with a similarity score less than 0.5

om35 avatar Jul 28 '23 07:07 om35

Thanks for the context. I think this question is more about the capabilities of Elasticsearch. I'm trying to figure it out in Kibana and have found some pointers, such as https://stackoverflow.com/questions/39106243/is-there-are-a-way-to-filter-by-score-in-elasticsearch
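One Elasticsearch capability that seems to fit is the top-level `min_score` parameter of the search request, which drops hits whose `_score` falls below the given value. Below is a minimal sketch of a request body that uses it, assuming the script is the only scoring clause (inside a `bool`/`should` with other clauses the scores add up, so the threshold would apply to the sum). The function name `build_thresholded_body` is hypothetical, and this has not been tested against this project's index.

```python
from typing import Any, Dict, List


def build_thresholded_body(query_vector: List[float], min_score: float, size: int = 10) -> Dict[str, Any]:
    """Build a search body that scores by cosine similarity and drops low scores.

    Hypothetical helper: the 'vector' field name is taken from the snippet in
    this thread; min_score is compared against the final _score, which here
    is 1.0 + cosine similarity (range 0 to 2).
    """
    return {
        "size": size,
        "min_score": min_score,  # hits scoring below this are not returned
        "query": {
            "script_score": {
                "query": {"match_all": {}},
                "script": {
                    "source": "1.0 + cosineSimilarity(params['query_vector'], 'vector')",
                    "params": {"query_vector": query_vector},
                },
            }
        },
    }


# A cosine similarity of at least 0.5 corresponds to a _score of at least 1.5.
body = build_thresholded_body([0.1, 0.2, 0.3], min_score=1.5)
print(body["min_score"])
```

The resulting dictionary would be passed as the search body in place of the query that search_demo_elastic.py currently builds.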

DmitryKey avatar Jul 31 '23 14:07 DmitryKey