Tamas Bela Feher

46 issues by Tamas Bela Feher

PR #2157 enables adding vectors to a CAGRA index. Some issues that can come up with large datasets or graphs:
- A graph in host memory is not supported by extend https://github.com/rapidsai/raft/pull/2157/files#r1561657193...

feature request
Vector Search

**Describe the bug** Similar to https://github.com/rapidsai/raft/pull/2183, the mean and stddev kernels also have potential out-of-bounds accesses: https://github.com/rapidsai/raft/blob/branch-24.04/cpp/include/raft/stats/detail/stddev.cuh#L48 https://github.com/rapidsai/raft/blob/branch-24.04/cpp/include/raft/stats/detail/mean.cuh#L46 Additionally, the [minmax](https://github.com/rapidsai/raft/blob/67893676f3d9b90e572f78b969172f840115b22f/cpp/include/raft/stats/detail/minmax.cuh#L159minmax) kernel should also be checked, whether it...

bug

**Describe the bug** The [sum kernel](https://github.com/rapidsai/raft/blob/67893676f3d9b90e572f78b969172f840115b22f/cpp/include/raft/stats/detail/sum.cuh#L65) does not handle underflows correctly, and that leads to inaccurate results. **Steps/Code to reproduce bug** As reported by @lijinf2: > We also did an...

bug
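The snippet above is truncated, so the exact failure mode in the sum kernel is not visible here. As a general illustration only, one well-known way a single-precision reduction loses accuracy is that small addends are rounded away once the accumulator is large. The sketch below emulates float32 with the standard library and compares a naive running sum with Kahan (compensated) summation. The helper names `f32`, `naive_sum_f32`, and `kahan_sum_f32` are hypothetical, and compensated summation is a swapped-in technique for illustration, not the fix adopted in RAFT.

```python
import struct


def f32(x):
    """Round a Python float to the nearest float32 value (emulation helper)."""
    return struct.unpack("f", struct.pack("f", x))[0]


def naive_sum_f32(values):
    """Running sum where every intermediate result is rounded to float32."""
    acc = 0.0
    for v in values:
        acc = f32(acc + f32(v))
    return acc


def kahan_sum_f32(values):
    """Kahan compensated summation with float32 rounding at every step."""
    acc = 0.0
    comp = 0.0  # running estimate of the rounding error lost so far
    for v in values:
        y = f32(f32(v) - comp)   # re-inject the previously lost low-order part
        t = f32(acc + y)         # low-order bits of y may be rounded away here
        comp = f32(f32(t - acc) - y)  # recover what was rounded away
        acc = t
    return acc


# One large value followed by many small ones: each 1.0 is below half the
# float32 spacing at 1e8, so the naive sum never moves.
values = [1e8] + [1.0] * 1000
naive = naive_sum_f32(values)
kahan = kahan_sum_f32(values)
```

With this input the naive sum stays at the large value, while the compensated sum lands within one float32 spacing of the exact total. On a GPU, other options such as accumulating in a wider type or using a tree reduction may be preferable; which one suits the kernel in question is not something this sketch settles.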

**Is your feature request related to a problem? Please describe.** During IVF-Flat search a query vector is compared to all the vectors from `n_probes` clusters, and we have `n_queries *...

feature request

**Describe the bug** `knn_merge_parts` is only implemented for `k <= 1024`: https://github.com/rapidsai/raft/blob/eb6fdef68d19357e9f44494653ecd4206340ff6b/cpp/include/raft/neighbors/detail/knn_merge_parts.cuh#L149-L171 `knn_merge_parts` is used during brute force search if:
- an offset index needs to be added to the indices. This...

bug
Vector Search
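To make the merge step concrete, here is a minimal CPU sketch of what merging per-partition k-NN results involves: each partition contributes a list of distances sorted in ascending order with partition-local indices, an offset is added to turn local indices into global ones, and the k smallest entries overall are kept. The function name `merge_parts` and its argument layout are hypothetical and do not mirror the RAFT signature; the sketch has no limit on `k`, unlike the GPU implementation described in the issue.

```python
import heapq
import itertools


def merge_parts(part_dists, part_ids, offsets, k):
    """Merge sorted per-partition k-NN lists into one global top-k list.

    part_dists[p] and part_ids[p] hold the neighbors found in partition p,
    sorted by ascending distance. offsets[p] is added to each local index
    so that the returned indices refer to the full dataset.
    """
    def stream(dists, ids, offset):
        # Yield (distance, global_index) pairs in ascending distance order.
        for d, i in zip(dists, ids):
            yield (d, i + offset)

    streams = [stream(d, i, o) for d, i, o in zip(part_dists, part_ids, offsets)]
    # heapq.merge lazily merges already-sorted inputs; islice stops after k.
    top = list(itertools.islice(heapq.merge(*streams), k))
    return [d for d, _ in top], [i for _, i in top]


# Two partitions of 3 neighbors each; the second partition starts at row 100.
dists, ids = merge_parts(
    part_dists=[[0.1, 0.4, 0.9], [0.2, 0.3, 0.8]],
    part_ids=[[2, 0, 1], [1, 2, 0]],
    offsets=[0, 100],
    k=4,
)
```

A lazy k-way merge only touches about `k` elements plus one per partition, which is why the inputs are assumed to be pre-sorted.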

Currently, CAGRA+HNSW benchmarks with raft_ann_bench require a GPU to run. While a GPU is essential for building the index with CAGRA, it would be useful to be able to compile and run...

feature request
Vector Search

Currently, [neighbors::detail::utils::subsample](https://github.com/rapidsai/raft/pull/2077/files#diff-f4662666209658cc0fc710aae66eb045de253eff2c46339a36daf87e29eaf6e8R612) takes the dataset `input` as a plain pointer. The `input` should be replaced with an mdspan. This is not done in #2077, because the following question needs to be...

feature request

**Describe the bug** When `raft::copy` is used to copy data between two mdspans, the copy is very slow. **Steps/Code to reproduce bug** Compare the execution time of these loops:...

bug

**Is your feature request related to a problem? Please describe.** For IVF-Flat and IVF-PQ index building, large datasets are provided in host memory or as an `mmap`-ed file. After the cluster...

feature request
Vector Search

In IVF-Flat and IVF-PQ, we generate random indices and shuffle or subsample the dataset using these indices before training. Currently a fixed seed is used to generate random indices. This...

feature request
Vector Search
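The request above is cut off, but the mechanism it describes (drawing random row indices from a seed and using them to subsample the training set) can be sketched in a few lines. The function name `subsample_indices` and the idea of exposing `seed` as a caller-supplied parameter are assumptions for illustration; this is not the RAFT API.

```python
import random


def subsample_indices(n_rows, n_samples, seed):
    """Pick n_samples distinct row indices from range(n_rows).

    The same seed always yields the same indices, so a run is reproducible;
    passing a different seed yields a different training subset.
    """
    rng = random.Random(seed)  # private generator, no global state touched
    return rng.sample(range(n_rows), n_samples)


# Same seed: identical subsets. Different seed: a different subset.
a = subsample_indices(n_rows=1000, n_samples=100, seed=0)
b = subsample_indices(n_rows=1000, n_samples=100, seed=0)
c = subsample_indices(n_rows=1000, n_samples=100, seed=1)
```

Keeping the generator local to the call means the choice of seed is the only thing that determines the subset, which is the property a user-configurable seed would rely on.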