golang: separate c functions and cache simsimd_metric_punned

Open corani opened this issue 1 year ago • 6 comments

The changes move the inline C functions from simsimd.go to a new file, simsimd.c. This separation enhances code organization and readability. It also allows for better management of the C code, which can now be modified independently from the Go code.

More importantly, we now cache the results from simsimd_metric_punned instead of determining the capabilities for each call. This improves the benchmark from 1940ns/op to 1320ns/op on my system.
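For readers unfamiliar with the idea, here is a minimal sketch in C of what caching the resolved kernel can look like. It is not the code from this PR and not SimSIMD's API: `metric_fn`, `resolve_sqeuclidean`, `resolve_calls`, and `sqeuclidean` are hypothetical names, and the "resolution" step is a stand-in for the capability detection that `simsimd_metric_punned` performs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel signature for illustration; not SimSIMD's actual API. */
typedef float (*metric_fn)(const float *a, const float *b, size_t n);

/* Counts how often the (normally expensive) resolution step runs. */
int resolve_calls = 0;

/* Portable fallback kernel: squared Euclidean distance. */
static float sqeuclidean_serial(const float *a, const float *b, size_t n) {
    float sum = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

/* Stand-in for capability detection plus kernel lookup. */
static metric_fn resolve_sqeuclidean(void) {
    ++resolve_calls;
    return sqeuclidean_serial;
}

/* Resolves the kernel on the first call and reuses the pointer afterwards.
 * This simple form is not thread-safe. */
float sqeuclidean(const float *a, const float *b, size_t n) {
    static metric_fn cached = NULL;
    assert(a != NULL && b != NULL);
    if (cached == NULL) cached = resolve_sqeuclidean();
    return cached(a, b, n);
}
```

Every call after the first skips the lookup and pays only for an indirect call, which is where the saving in a per-call benchmark would come from.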

corani avatar Jan 25 '24 03:01 corani

Note: this is still 4x slower than the native Go implementation, but that's better than 6x 🤣

corani avatar Jan 25 '24 03:01 corani

Hi, @corani! You are right to evaluate the dynamic dispatch just once. I think we should generalize it and implement it the same way I did in StringZilla. That is more laborious, but can be reused across different languages.

@pplanel has recently pushed Rust bindings, but they are slower than native Rust code, because he doesn't cache the pointer in any way. In case any of you guys want to implement it, I'm happy to provide guidance, but won't be able to work on it actively in the coming weeks 🤗

ashvardanian avatar Jan 28 '24 03:01 ashvardanian

Hey @ashvardanian, I'd like to know more about this benchmark and how the pointer caching can be done.

The Rust binding benchmarks compare cosine and sqeuclidean against their respective implementations in SimSIMD.

And I'm seeing these results:

[image: Cosine]

[image: SqEuclidean]

pplanel avatar Jan 28 '24 03:01 pplanel

This is interesting, @pplanel. I must have misread the timings in the console.

The common approach is to have a static structure of function pointers that is populated when the shared library is loaded. Then, all the function calls go through that lookup table. The StringZilla snippet is a pretty good example, I believe.
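A minimal sketch of that pattern, assuming GCC or Clang (the `__attribute__((constructor))` extension runs a function when the library is loaded). This is not the StringZilla code referred to above; `dispatch_table_t`, `dispatch_table_init`, `metric_dot`, and `metric_sqeuclidean` are hypothetical names, and the serial kernels stand in for whatever SIMD variants a real library would select.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel signature for illustration. */
typedef float (*metric_fn)(const float *a, const float *b, size_t n);

/* Lookup table with one slot per metric. */
typedef struct {
    metric_fn dot;
    metric_fn sqeuclidean;
} dispatch_table_t;

static dispatch_table_t dispatch_table;

/* Portable fallback kernels. */
static float dot_serial(const float *a, const float *b, size_t n) {
    float sum = 0.0f;
    for (size_t i = 0; i < n; ++i) sum += a[i] * b[i];
    return sum;
}

static float sqeuclidean_serial(const float *a, const float *b, size_t n) {
    float sum = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

/* Runs once at load time, before main() or right after dlopen()
 * (GCC/Clang extension, assumed here). */
__attribute__((constructor)) static void dispatch_table_init(void) {
    /* A real library would query CPU capabilities here and pick SIMD kernels. */
    dispatch_table.dot = dot_serial;
    dispatch_table.sqeuclidean = sqeuclidean_serial;
}

/* Public entry points: a single indirect call through the table. */
float metric_dot(const float *a, const float *b, size_t n) {
    assert(dispatch_table.dot != NULL);
    return dispatch_table.dot(a, b, n);
}

float metric_sqeuclidean(const float *a, const float *b, size_t n) {
    assert(dispatch_table.sqeuclidean != NULL);
    return dispatch_table.sqeuclidean(a, b, n);
}
```

Because the table lives on the C side, bindings for Go, Rust, or any other language could call the same entry points without each having to cache pointers on its own.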

ashvardanian avatar Jan 28 '24 04:01 ashvardanian

I'm unable to update the PR to resolve the conflict:

 ! [remote rejected] corani/perf -> corani/perf (refusing to allow a Personal Access Token to create or update workflow `.github/workflows/prerelease.yml` without `workflow` scope)

corani avatar Jan 29 '24 02:01 corani

> Hi, @corani! You are right to evaluate the dynamic dispatch just once. I think we should generalize it and implement it the same way I did in StringZilla. That is more laborious, but can be reused across different languages.

That'll have to be done by someone with actual experience writing C code 😉

corani avatar Jan 29 '24 02:01 corani