SimSIMD
Feature: How is WASM compiled? Are you using wasm_simd128 and -msimd128?
Describe what you are looking for
WASM support for SIMD:
- Discussion: https://github.com/emscripten-core/emscripten/issues/12714
- Emscripten SIMD docs: https://emscripten.org/docs/porting/simd.html
- Benchmark of WASM vs. WASM with -msimd128: https://jeromewu.github.io/improving-performance-using-webassembly-simd-intrinsics/
- Demo code using wasm_simd128 intrinsics: https://github.com/jeromewu/wasm-perf/blob/main/mul_mats_intrin.c
Here's where the NGT algorithm, which is faster than HNSW, uses SIMD to optimize internally: https://github.com/yahoojapan/NGT/blob/1e44fffc2b95b211ff29ee693abb4a25057042d4/lib/NGT/Clustering.h#L224
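For context, here is a minimal sketch of what a kernel written against `wasm_simd128.h` could look like. It is not SimSIMD code: the function name `dot_f32` is hypothetical, and the SIMD path is only compiled when the compiler defines `__wasm_simd128__` (as clang does when `-msimd128` is passed), with a scalar loop as fallback.

```c
#include <stddef.h>

#if defined(__wasm_simd128__)
#include <wasm_simd128.h>
#endif

// Hypothetical example kernel: dot product of two f32 vectors of length n.
float dot_f32(const float *a, const float *b, size_t n) {
    float sum = 0.0f;
    size_t i = 0;
#if defined(__wasm_simd128__)
    // Accumulate four lanes at a time using WASM SIMD intrinsics.
    v128_t acc = wasm_f32x4_splat(0.0f);
    for (; i + 4 <= n; i += 4) {
        v128_t va = wasm_v128_load(a + i);
        v128_t vb = wasm_v128_load(b + i);
        acc = wasm_f32x4_add(acc, wasm_f32x4_mul(va, vb));
    }
    // Horizontal sum of the four accumulator lanes.
    sum = wasm_f32x4_extract_lane(acc, 0) + wasm_f32x4_extract_lane(acc, 1) +
          wasm_f32x4_extract_lane(acc, 2) + wasm_f32x4_extract_lane(acc, 3);
#endif
    // Scalar tail; also the full loop when SIMD is not enabled.
    for (; i < n; ++i) sum += a[i] * b[i];
    return sum;
}
```

With Emscripten this would be built with something like `emcc -O3 -msimd128`; without the flag, the same file still compiles and takes the scalar path.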
Can you contribute to the implementation?
- [X] I can contribute
Is your feature request specific to a certain interface?
It applies to everything
Contact Details
No response
Is there an existing issue for this?
- [X] I have searched the existing issues
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
Hi @vtempest! We don't currently compile to WASM, but it should be compatible with SimSIMD NEON kernels, I believe. What exactly are you looking for?
I would like to help build the fastest SIMD-accelerated vector search. If USearch uses SimSIMD and compiles to WASM, I'd love to improve upon it and build my vsearch fork with RAM-limited clusters.
Yes, USearch compiles to WASM, but the whole ecosystem is currently fragmented, and it's not clear how to ship library dependencies for WASM. Still, it shouldn't be hard to integrate USearch directly into an arbitrary project that uses WASM, and then compile together as a monorepo. Have you tried that?
Yes, I was working on top of the original hnswlib ported to WASM here: https://github.com/kaiobarb/hnswlib-wasm?tab=readme-ov-file I am wondering how to integrate USearch as the base library instead and then add cluster splitting for RAM limits on top of it.
https://github.com/yahoojapan/NGT/issues/168#issuecomment-2363499081 The NGT inventor says my RAM-limited clusters approach looks very promising. I'd like to integrate it into HNSW and USearch v3.
@vtempest, have you tried compiling SimSIMD into WASM already? I think the NEON backend should be compatible and can provide a huge boost for i8/u8/f16/bf16 vectors in any search engine, be it USearch or HNSW lib 🤗
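As an illustration of the NEON route, here is a rough sketch of an i8 dot product written with NEON intrinsics. It is not taken from SimSIMD: `dot_i8` is a hypothetical name, and the `__ARM_NEON` guard is the macro native ARM compilers define. Emscripten documents a `-mfpu=neon` option (used together with `-msimd128`) for compiling NEON code, but whether it defines this macro and covers every intrinsic used here has not been verified, so a scalar fallback is kept.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

// Hypothetical example kernel: dot product of two i8 vectors, accumulated in i32.
int32_t dot_i8(const int8_t *a, const int8_t *b, size_t n) {
    int32_t sum = 0;
    size_t i = 0;
#if defined(__ARM_NEON)
    int32x4_t acc = vdupq_n_s32(0);
    for (; i + 8 <= n; i += 8) {
        // Widening multiply: 8 x i8 * 8 x i8 -> 8 x i16 (no overflow possible).
        int16x8_t prod = vmull_s8(vld1_s8(a + i), vld1_s8(b + i));
        // Pairwise add adjacent i16 products and accumulate into 4 x i32.
        acc = vpadalq_s16(acc, prod);
    }
    sum = vgetq_lane_s32(acc, 0) + vgetq_lane_s32(acc, 1) +
          vgetq_lane_s32(acc, 2) + vgetq_lane_s32(acc, 3);
#endif
    // Scalar tail; also the full loop when NEON is unavailable.
    for (; i < n; ++i) sum += (int32_t)a[i] * (int32_t)b[i];
    return sum;
}
```

The widening to i16 and then i32 is the part that matters for small integer types: the products of two i8 values always fit in i16, and accumulating in i32 avoids overflow for typical embedding lengths.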
Is a WASM build still needed? I can take a look if so.
@Sero1000, wouldn’t hurt, I assume. But it can be implemented in multiple ways - compiling the Rust SDK to WASM, JS, or Python through cibuildwheel + Pyodide. But before all that, we need the core C library to pass compilation 🤗
Would that approach (Pyodide, etc.) work in Cloudflare Workers?
Check out my USearch / vsearch demo. It has getEmbeddingModel, convertTextToEmbedding, and addEmbeddingVectorsToIndex, and it exports/imports the vector binary for a specific file as a base64 string saved with that file. This avoids the large RAM requirement when scaling. There is a demo with common quotes and inference on a query: https://github.com/vtempest/ai-research-agent/blob/53c952c885d5e34f0cab0baa638c78d9ad2e6f14/src/similarity/usearch.js#L9
I would like to make it work without needing the fs for the load/save functions. There should be an export/importBase64String function.
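To make the request concrete, here is a minimal sketch of the export half, assuming the index has already been serialized into an in-memory byte buffer (how a given binding exposes that buffer is not covered here). The function name `base64_encode` is hypothetical and is not part of USearch or SimSIMD; it is standard base64 with `=` padding.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: encodes n bytes from `in` as standard base64.
// `out` must have room for 4 * ((n + 2) / 3) + 1 bytes (including the NUL).
// Returns the number of characters written, excluding the terminating NUL.
size_t base64_encode(const uint8_t *in, size_t n, char *out) {
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    size_t o = 0;
    for (size_t i = 0; i < n; i += 3) {
        // Pack up to three input bytes into a 24-bit chunk.
        uint32_t chunk = (uint32_t)in[i] << 16;
        if (i + 1 < n) chunk |= (uint32_t)in[i + 1] << 8;
        if (i + 2 < n) chunk |= (uint32_t)in[i + 2];
        // Emit four 6-bit symbols, padding with '=' where input ran out.
        out[o++] = tbl[(chunk >> 18) & 63];
        out[o++] = tbl[(chunk >> 12) & 63];
        out[o++] = (i + 1 < n) ? tbl[(chunk >> 6) & 63] : '=';
        out[o++] = (i + 2 < n) ? tbl[chunk & 63] : '=';
    }
    out[o] = '\0';
    return o;
}
```

The import half would be the inverse decode followed by loading the index from the resulting buffer, so no filesystem access is involved on either side.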
> Would that approach (Pyodide, etc.) work in Cloudflare Workers?
@vtempest, I am not familiar and wouldn't have the time to investigate.