Govind
@ashvardanian, not yet. I am trying to figure out what the ideal benchmark would be (and also learning how to use TF-IDF). In scikit-learn, the `TfidfVectorizer` creates a 2D...
Will do. Thanks for the pointers! I am also reading through the scikit-learn implementation to see how I could do this :)
Ah, I see. Now it makes sense ``` SIMSIMD_INTERNAL simsimd_distance_t _simsimd_cos_normalize_f64_neon(simsimd_f64_t ab, simsimd_f64_t a2, simsimd_f64_t b2) { if (a2 == 0 && b2 == 0) return 0; if (ab ==...
I was able to benchmark a cosine search of a query against a database, and the SimSIMD version runs up to 5x faster than the plain Rust...
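For context, here is a minimal sketch of what a plain Rust baseline for such a search might look like. It is not the actual benchmark code: the names `cosine_distance` and `nearest_by_cosine` are hypothetical, and the zero-vector handling is an assumption of mine (distance 0 when both vectors are all zeros, 1 when only one is).

```rust
/// Cosine distance (1 - cosine similarity) between two dense vectors.
/// Hypothetical helper for illustration, not part of SimSIMD.
fn cosine_distance(a: &[f64], b: &[f64]) -> f64 {
    assert_eq!(a.len(), b.len(), "vectors must have equal length");
    let (mut ab, mut a2, mut b2) = (0.0_f64, 0.0_f64, 0.0_f64);
    // Accumulate the dot product and both squared norms in a single pass.
    for (x, y) in a.iter().zip(b.iter()) {
        ab += x * y;
        a2 += x * x;
        b2 += y * y;
    }
    // Assumed conventions for degenerate inputs.
    if a2 == 0.0 && b2 == 0.0 {
        return 0.0;
    }
    if a2 == 0.0 || b2 == 0.0 {
        return 1.0;
    }
    1.0 - ab / (a2.sqrt() * b2.sqrt())
}

/// Linear scan: index and distance of the database row closest to `query`.
/// Returns `None` for an empty database.
fn nearest_by_cosine(database: &[Vec<f64>], query: &[f64]) -> Option<(usize, f64)> {
    database
        .iter()
        .enumerate()
        .map(|(i, row)| (i, cosine_distance(row, query)))
        .min_by(|l, r| l.1.total_cmp(&r.1))
}
```

A SIMD-backed version would keep the scan the same and swap only the inner distance call, which makes the two variants straightforward to compare in a benchmark.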
> you can directly compute the weighted dot product of TF-IDF vectors without normalizing them, as required for cosine similarity. This is both faster and aligns naturally with the sparse...
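
To make the quoted idea concrete, here is a rough sketch of a weighted sparse dot product in plain Rust. It assumes each vector is stored as a sorted list of unique `u16` term indices with a parallel list of `f32` TF-IDF weights; that layout and the name `sparse_weighted_dot` are my assumptions for illustration, not SimSIMD's API.

```rust
use std::cmp::Ordering;

/// Dot product of two sparse vectors given as (sorted unique indices, weights).
/// Hypothetical helper for illustration only.
fn sparse_weighted_dot(a_idx: &[u16], a_w: &[f32], b_idx: &[u16], b_w: &[f32]) -> f32 {
    assert_eq!(a_idx.len(), a_w.len(), "each index needs a weight");
    assert_eq!(b_idx.len(), b_w.len(), "each index needs a weight");
    let (mut i, mut j) = (0usize, 0usize);
    let mut sum = 0.0_f32;
    // Merge-style intersection: advance whichever side has the smaller index.
    while i < a_idx.len() && j < b_idx.len() {
        match a_idx[i].cmp(&b_idx[j]) {
            Ordering::Less => i += 1,
            Ordering::Greater => j += 1,
            Ordering::Equal => {
                // Shared term: multiply the two weights and accumulate.
                sum += a_w[i] * b_w[j];
                i += 1;
                j += 1;
            }
        }
    }
    sum
}
```

Only terms present in both vectors contribute, so the cost scales with the number of non-zero entries rather than with the vocabulary size.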
Sure, I will move the benchmark code to the linked repo!
Hi @ashvardanian, I noticed that there isn't a NEON implementation for `simsimd_spdot_weights_u16` (or for `simsimd_spdot_counts_u16`). I think it might make sense to add it as part of the benchmark...
Hi @ashvardanian, Happy New Year! I haven't had a lot of time to work on this, but I managed to squeeze out some time during the holidays to finish up...