Vectorize similarity calculations

Open tornikeo opened this issue 9 months ago • 0 comments

Is your feature request related to a problem? Please describe.

Using NumPy matmul instead of @njit for some vector similarity functions can speed up processing from:

CPU times: user 2.05 s, sys: 19.4 ms, total: 2.07 s
Wall time: 2.41 s

to:

CPU times: user 47.4 ms, sys: 8.95 ms, total: 56.4 ms
Wall time: 62.7 ms

This was measured on 1k x 1k spectra, each of size 1k, purely on CPU.

Here's a colab notebook to replicate the speedup.
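
For illustration only, here is a minimal sketch of the kind of vectorization meant here. It is not the notebook code and not an existing matchms function: it assumes each spectrum has already been turned into a fixed-length intensity vector (one row per spectrum), and the name `cosine_similarity_matrix` is hypothetical.

```python
import numpy as np


def cosine_similarity_matrix(references, queries):
    """Pairwise cosine similarity between the rows of two 2-D arrays.

    Hypothetical helper: assumes every spectrum is already represented
    as a fixed-length intensity vector (one row per spectrum).
    """
    references = np.asarray(references, dtype=np.float64)
    queries = np.asarray(queries, dtype=np.float64)

    # Row-wise Euclidean norms, kept as column vectors for broadcasting.
    ref_norms = np.linalg.norm(references, axis=1, keepdims=True)
    query_norms = np.linalg.norm(queries, axis=1, keepdims=True)

    # All-zero rows would divide by zero; give them a norm of 1 so
    # their similarity to everything comes out as 0.
    ref_norms[ref_norms == 0] = 1.0
    query_norms[query_norms == 0] = 1.0

    # A single matmul replaces the explicit double loop over pairs.
    return (references / ref_norms) @ (queries / query_norms).T


# Random stand-in data: 1k reference and 1k query vectors of length 1k.
rng = np.random.default_rng(0)
refs = rng.random((1000, 1000))
qs = rng.random((1000, 1000))
scores = cosine_similarity_matrix(refs, qs)  # shape (1000, 1000)
```

The whole score matrix comes out of one BLAS-backed matrix product, which is where the speedup over a per-pair loop would come from. Scores that need peak matching within a tolerance do not reduce to a plain dot product like this.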

Describe the solution you'd like

I already have some functions vectorized over at cudams. I'd like to move some of them to the base matchms repo.

If that proposal sounds fine, I'll open a PR for it in a bit. Maybe there's some reason to avoid adding a vectorized version (lower readability, perhaps?)

tornikeo, May 03 '24 11:05