matchms
Vectorize similarity calculations
Is your feature request related to a problem? Please describe.
Using NumPy matmul for some vector similarity functions, instead of @njit, can speed up processing from
CPU times: user 2.05 s, sys: 19.4 ms, total: 2.07 s
Wall time: 2.41 s
to:
CPU times: user 47.4 ms, sys: 8.95 ms, total: 56.4 ms
Wall time: 62.7 ms
This was measured on 1k x 1k spectra, each of size 1k, purely on CPU.
Here's a colab notebook to replicate the speedup.
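To illustrate the idea, here is a minimal sketch of what "vectorizing" a pairwise score could look like. It is not the code from the notebook or from cudams, and it does not use the matchms API. It assumes each spectrum is already a dense, fixed-length vector (for example, binned intensities), and the function names `cosine_pairwise_loop` and `cosine_pairwise_matmul` are hypothetical. The loop version stands in for a per-pair implementation; the matmul version computes the same scores with one matrix product.

```python
import numpy as np


def cosine_pairwise_loop(refs, queries):
    """Baseline: cosine score for every (reference, query) pair, one pair at a time."""
    scores = np.zeros((refs.shape[0], queries.shape[0]))
    for i, r in enumerate(refs):
        for j, q in enumerate(queries):
            denom = np.linalg.norm(r) * np.linalg.norm(q)
            # All-zero vectors get a score of 0 instead of dividing by zero.
            scores[i, j] = (r @ q / denom) if denom > 0 else 0.0
    return scores


def cosine_pairwise_matmul(refs, queries):
    """Same scores, computed with a single matrix product."""
    ref_norms = np.linalg.norm(refs, axis=1, keepdims=True)
    query_norms = np.linalg.norm(queries, axis=1, keepdims=True)
    # Normalize each row to unit length; all-zero rows stay zero.
    refs_unit = np.divide(refs, ref_norms,
                          out=np.zeros_like(refs), where=ref_norms > 0)
    queries_unit = np.divide(queries, query_norms,
                             out=np.zeros_like(queries), where=query_norms > 0)
    # Entry (i, j) is the dot product of unit vectors i and j, i.e. their cosine.
    return refs_unit @ queries_unit.T


# Assumed toy data: 50 reference and 40 query vectors of length 100.
rng = np.random.default_rng(0)
refs = rng.random((50, 100))
queries = rng.random((40, 100))

assert np.allclose(cosine_pairwise_loop(refs, queries),
                   cosine_pairwise_matmul(refs, queries))
```

The matmul form only applies to scores that can be written as dot products of fixed-length vectors, which is why this proposal covers some similarity functions rather than all of them.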
Describe the solution you'd like
I already have some functions vectorized over at cudams. I'd like to move some of them to the base matchms repo.
If that proposal sounds fine, I'll open a PR for it shortly. Is there any reason to avoid adding vectorized versions (for example, lower readability)?