Add get_top_k_cosine_similarity method to get the top-k scores and indices

Open hwaking opened this issue 1 year ago • 0 comments

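Compute the row-wise cosine similarity between two equal-width matrices and, for each row of the first matrix, return the top_k highest scores together with the indices of the matching rows in the second matrix. Only scores greater than threshold_score are returned. @vowelparrot @dev2049 @hwchase17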

This is useful when we want the top-k scores and their indices right after computing the similarity, as in the following example:

input example

x = [[1, 2, 3, 4], [1, 2, 2, 2]]
y = [[1, 2, 3, 5], [1, 2, 9, 5], [2, 2, 3, 5]]
index_score_list = get_top_k_cosine_similarity(x, y, top_k=2, threshold_score=0.94)
print('index_score_list:', index_score_list)

output result

index_score_list: [[(0, 0.9939990885479664), (2, 0.9860132971832692)], [(2, 0.9415130835240085)]]
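For reference, here is a minimal sketch of how such a function could behave, assuming NumPy is available. It is not the code proposed for langchain: the default values for top_k and threshold_score, the treatment of zero-norm rows (scored as 0.0), and the strict "greater than" comparison are my assumptions based on the description above.

```python
import numpy as np


def get_top_k_cosine_similarity(x, y, top_k=5, threshold_score=None):
    """Sketch only: for each row of x, return up to top_k (index, score)
    pairs for the most similar rows of y, best first.

    Defaults and zero-norm handling are assumptions, not the issue's spec.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 2 or y.ndim != 2 or x.shape[1] != y.shape[1]:
        raise ValueError("x and y must be 2-D with the same number of columns")

    # Pairwise products of row norms, shape (len(x), len(y)).
    x_norm = np.linalg.norm(x, axis=1, keepdims=True)
    y_norm = np.linalg.norm(y, axis=1, keepdims=True)
    denom = x_norm * y_norm.T

    # Cosine similarity; pairs involving a zero-norm row get a score of 0.0.
    sim = np.divide(x @ y.T, denom, out=np.zeros_like(denom), where=denom != 0)

    results = []
    for row in sim:
        # Indices of the top_k largest scores, in descending score order.
        order = np.argsort(-row)[:top_k]
        results.append(
            [
                (int(j), float(row[j]))
                for j in order
                if threshold_score is None or row[j] > threshold_score
            ]
        )
    return results
```

With the x and y from the input example, top_k=2 and threshold_score=0.94, this sketch selects indices 0 and 2 for the first row and index 2 for the second, which agrees with the output above up to floating-point rounding.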

hwaking, May 21 '23 14:05