
JVector: the most advanced embedded vector search engine


The DiskANN paper describes building larger-than-memory indexes by partitioning the dataset and then adding vectors to multiple partitions, then combining the graphs. This is 2.5x slower than building the graph...

Hi all, most of the vectorized code in [SimdOps.java](https://github.com/jbellis/jvector/blob/main/jvector-twenty/src/main/java/io/github/jbellis/jvector/vector/SimdOps.java) uses the `fromArray` API to load the contents into a vector. With JDK 20+, the Vector API added support for loading and...

The `GraphIndexBuilder` API can be used in two ways: for live indexing or bulk indexing. We should enforce checks in the API such that users don't call it incorrectly or...

At this point, profiles of our PQ look like the time is spent almost entirely on distance work. Barring large parameter changes or a paradigm shift in how we quantize, it seems like...

We should automate the release process once we're comfortable with the results. I'd prefer a workflow that runs when an appropriate tag is pushed, but I'm open to other options.

First off, great work! --- It'd be very helpful if there were general documentation which helped map the theory and concepts to the class hierarchy or the main facades. That...

CAGRA performs compression in two stages: **Vector Quantization (VQ)**, where kmeans is applied to the full-dimensional vectors to create a codebook of coarse cluster centers; and **Product Quantization (PQ)**, where the...

The JVector jar available through Maven Central packages JVector 11 on the regular class path, with jvector-twenty and jvector-native adding additional classes through multi-release JAR support. This means that users...

With ANN search, we accept giving up accuracy for speed. Since most of the code in [jvector_simd.c](https://github.com/jbellis/jvector/blob/main/jvector-native/src/main/c/jvector_simd.c) deals in floating-point computations, it may make sense to pass `--fp-model=fast` to the GCC...
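As a rough illustration of the accuracy side of that trade-off: relaxed floating-point models generally let the compiler reorder arithmetic, and float addition is not associative. The sketch below is plain Java rather than the C code in question, and the class and method names are made up for the example; it only shows that regrouping the same three additions can change the result.

```java
public class FloatReassociation {
    // Sum grouped as (a + b) + c.
    static float leftToRight(float a, float b, float c) {
        return (a + b) + c;
    }

    // Same operands, grouped as a + (b + c).
    static float rightToLeft(float a, float b, float c) {
        return a + (b + c);
    }

    public static void main(String[] args) {
        float a = 1e8f, b = -1e8f, c = 1f;
        // The large terms cancel first, so the small term survives.
        System.out.println(leftToRight(a, b, c)); // 1.0
        // The small term is absorbed into -1e8f before the cancellation.
        System.out.println(rightToLeft(a, b, c)); // 0.0
    }
}
```

For similarity scores that are only used to rank candidates, differences of this kind may be acceptable, which is the premise of the suggestion above.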

In 97e523c306ae42c3e963484e320fa1c7432b5250, the `approximateCentroid()` implementation for the `BuildScoreProvider` returned from `BuildScoreProvider.randomAccessScoreProvider()` was updated to allow for non-sequential node IDs. However, the iteration only takes into account nodes with ID < `ravv.size()`....
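The general shape of this kind of problem can be sketched without JVector itself. In the example below, a `Map<Integer, float[]>` is a hypothetical stand-in for a vector store with non-sequential node IDs, and both method names are invented for illustration; none of this is the library's actual code. Looping over `0..size-1` silently skips any node whose ID is at or above the element count.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SparseNodeIds {
    // Visits only IDs in [0, size), which misses live nodes with larger IDs.
    static List<Integer> visitedByIdBelowSize(Map<Integer, float[]> vectors) {
        List<Integer> visited = new ArrayList<>();
        int size = vectors.size();
        for (int id = 0; id < size; id++) {
            if (vectors.containsKey(id)) {
                visited.add(id);
            }
        }
        return visited;
    }

    // Visits every ID that is actually present, regardless of gaps.
    static List<Integer> visitedByLiveIds(Map<Integer, float[]> vectors) {
        return new ArrayList<>(vectors.keySet());
    }

    public static void main(String[] args) {
        Map<Integer, float[]> vectors = new TreeMap<>();
        vectors.put(0, new float[] {1f, 0f});
        vectors.put(1, new float[] {0f, 1f});
        vectors.put(7, new float[] {1f, 1f}); // ID 7 >= size 3

        System.out.println(visitedByIdBelowSize(vectors)); // [0, 1]
        System.out.println(visitedByLiveIds(vectors));     // [0, 1, 7]
    }
}
```

A centroid computed from the first loop would leave out node 7 entirely, even though it is part of the data set.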