cuvs
[FEA] Enable NN Descent for larger data dimensions
Description
Currently, GNND::build() launches a preprocess_data_kernel, which requires a shared memory size of:
sizeof(Data_t) * ceildiv(build_config_.dataset_dim, static_cast<size_t>(raft::warp_size())) * raft::warp_size()
With sizeof(Data_t) = 4 for the standard fp32 case and a shared memory limit of 48 KB (49,152 bytes), this restricts the dataset dimension to at most 12,288.
It would be nice to support datasets with larger dimensions (> 20000), as is the case for single-cell RNA datasets (e.g. datasets here).
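To make the limit concrete, here is a minimal host-side sketch of the arithmetic. It does not use cuVS or RAFT: `kWarpSize`, `kSmemLimitBytes`, `preprocess_smem_bytes`, and `max_dim_for_limit` are hypothetical names introduced for illustration, and it assumes a warp size of 32 and a 48 KB per-block shared memory limit.

```cpp
#include <cstddef>

// Assumption: 32 threads per warp, standing in for raft::warp_size().
constexpr std::size_t kWarpSize = 32;
// Assumption: 48 KB per-block shared memory limit, as stated in the issue.
constexpr std::size_t kSmemLimitBytes = 48 * 1024;

// Integer ceiling division, standing in for the ceildiv used in the issue.
constexpr std::size_t ceildiv(std::size_t a, std::size_t b) { return (a + b - 1) / b; }

// Shared memory bytes following the expression quoted in the issue:
// the dimension is rounded up to a multiple of the warp size.
constexpr std::size_t preprocess_smem_bytes(std::size_t dataset_dim, std::size_t elem_size)
{
  return elem_size * ceildiv(dataset_dim, kWarpSize) * kWarpSize;
}

// Largest dataset dimension whose padded row still fits in the limit.
constexpr std::size_t max_dim_for_limit(std::size_t elem_size)
{
  return (kSmemLimitBytes / elem_size) / kWarpSize * kWarpSize;
}

// fp32 rows of dimension 12288 use exactly 48 KB; one more element does not fit.
static_assert(preprocess_smem_bytes(12288, 4) == kSmemLimitBytes, "12288 fp32 fits exactly");
static_assert(preprocess_smem_bytes(12289, 4) > kSmemLimitBytes, "12289 fp32 exceeds 48 KB");
// A 20000-dimensional fp32 row would need 80000 bytes.
static_assert(preprocess_smem_bytes(20000, 4) == 80000, "20000 fp32 needs 80000 bytes");
```

Under these assumptions, a 20000-dimensional fp32 dataset needs roughly 78 KB per block, well above the 48 KB limit.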