Add cosine_simirarity_v1, which is faster than the original cosine_similarity method.

Open hwaking opened this issue 2 years ago • 3 comments

Add a cosine_simirarity_v1 method that is about 5 times faster than the original cosine_similarity method.

@vowelparrot Updated math_utils.py to add cosine_simirarity_v1, which is faster than the original cosine_similarity method. The timing test looks like this:

'''
x = [[1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4]] * 2000
y = [[1, 2, 3, 5, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4],
     [1, 2, 2, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4]] * 30000
resp = cosine_similarity(x, y)
resp1 = cosine_simirarity_v1(x, y)
print('resp: ', resp)
print('score_list: ', resp1, type(resp1))

output:
Function cosine_similarity Took 1.8206 seconds
Function cosine_simirarity_v1 Took 0.3706 seconds

resp: [[0.99663517 0.99597014 0.99663517 ... 0.99597014 0.99663517 0.99597014]
 [0.99663517 0.99597014 0.99663517 ... 0.99597014 0.99663517 0.99597014]
 [0.99663517 0.99597014 0.99663517 ... 0.99597014 0.99663517 0.99597014]
 ...
'''

hwaking avatar May 21 '23 12:05 hwaking

@vowelparrot

hwaking avatar May 21 '23 12:05 hwaking

Thanks for the PR! The reason we didn't use sklearn is we don't want to add it as a required dependency. We could add an import check at the top and use sklearn if it's available though.

vowelparrot avatar May 22 '23 02:05 vowelparrot

if folks wanted to use this could they just import sklearn directly wherever it's actually being used? not sure how much value the wrapper adds

dev2049 avatar May 22 '23 18:05 dev2049

@hwaking Hi, could you please resolve the merge conflicts and address the latest comments (if needed)? After that, ping me and I'll push this PR for review. Thanks! Please let me know if you need help! The example of the "on-demand import" is here.
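For readers without access to the linked example, an "on-demand import" usually means importing the optional package inside the function that needs it and raising a helpful error if it is missing. The sketch below is illustrative only: the function name `cosine_similarity_sklearn` is hypothetical and is not taken from the linked example or from the PR.

```python
def cosine_similarity_sklearn(x, y):
    """Hypothetical wrapper that imports scikit-learn only when it is called."""
    try:
        # Imported here, not at module level, so scikit-learn stays optional.
        from sklearn.metrics.pairwise import cosine_similarity
    except ImportError as exc:
        raise ImportError(
            "scikit-learn is required for cosine_similarity_sklearn. "
            "Install it with `pip install scikit-learn`."
        ) from exc
    return cosine_similarity(x, y)
```

Importing the module that defines this function never fails; only calling it without scikit-learn installed raises the error.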

leo-gan avatar Sep 13 '23 20:09 leo-gan

Closing because the PR wouldn't line up with the current directory structure of the library (would need to be in /libs/langchain/langchain instead of /langchain). Feel free to reopen against the current head if it's still relevant!

efriis avatar Nov 07 '23 04:11 efriis