Ash Vardanian

Results 72 issues of Ash Vardanian

There is an official C SDK for that: https://github.com/kubernetes-client/c

The benchmark will be receiving a directory of Parquet, CSV, Avro or Orca files and exporting them simultaneously into graph, document, and, potentially, path collections. The latter can also be...

good first issue

See 7a91352 for interface placeholder. Library should be implemented using Apache Arrow datasets functionality. The CLI interface will be added later.

CuGraph should be used to accelerate PageRank, Louvain, WCC and Force-based Layout

enhancement

This PR replaces the native Python implementation of the Levenshtein distance with a SIMD-accelerated version from the StringZilla library. On ~5 letter words from a typical English corpus - StringZilla...

metrics

My commit messages look like these: ```sh commit d2b67fb4df02293e0a6478ffa6b61a1d6c502cff (origin/main-dev) Author: Ash Vardanian Date: Mon May 6 00:55:26 2024 +0000 Fix: Build graph dependency commit 4168fcda53a75eb50bde9cd342aa42eebf253422 (main-dev) Author: Ash Vardanian...

All existing metrics imply dense vector representations. Dealing with very high-dimensional vectors, sparse representations may provide huge space-efficiency gains. The only operation that needs to be implemented for Jaccard, Hamming,...

Covariance isn't a distance function, as it can be negative. It however, is often used for similarity search over time-series and should be implemented in SimSIMD.

good first issue

Initial [implementation](https://github.com/ashvardanian/SimSIMD/commit/7461e8acd2c71afefd6054b7034a4d21709db13f) is very slow, due to the costs of cGo calls.

low priority

Both Intel and Apple now have specialized AMX tiled matrix multiplication extensions. Both are tricky to use, but may result in substantial performance improvements. Potentially even for single vector dot-products...