Low-memory support

Open bittremieux opened this issue 3 years ago • 0 comments

Running ANN-SoLo can lead to excessive memory requirements:

  • [ ] The candidate mask takes up O(num_candidates × num_library_spectra) memory. With the default batch size of 16,384 and a spectral library of 4 million spectra, this requires more than 8 GB even in the best case of 1-bit booleans (16,384 × 4,000,000 bits ≈ 8.2 GB). The ANN mask duplicates this memory requirement. A potential solution would be to iterate over batches of library candidates as well.

  • [ ] The ANN index needs to fit into GPU memory, which will be problematic for large spectral libraries or low-memory GPUs. Potential solution: shard the index. Sharding has the additional benefit that the shards can be processed on multiple GPUs.
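
To make the first item concrete, here is a minimal sketch of the memory arithmetic and of batching over the library side as well. It is not ANN-SoLo code: `mask_bytes`, `library_batches`, and `count_candidates` are hypothetical helpers, and the precursor m/z tolerance check is only an assumed stand-in for whatever the real candidate filter does.

```python
from typing import Iterator, List, Sequence, Tuple


def mask_bytes(num_queries: int, num_library: int, bits_per_entry: int = 1) -> int:
    """Bytes needed for a dense num_queries x num_library boolean mask."""
    return num_queries * num_library * bits_per_entry // 8


def library_batches(num_library: int, batch_size: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open (start, stop) index ranges that cover the library."""
    for start in range(0, num_library, batch_size):
        yield start, min(start + batch_size, num_library)


def count_candidates(
    query_mz: Sequence[float],
    library_mz: Sequence[float],
    tol: float,
    library_batch_size: int,
) -> List[int]:
    """Count library candidates per query, one library batch at a time.

    Hypothetical filter: a library spectrum is a candidate when its
    precursor m/z lies within `tol` of the query precursor m/z.
    """
    counts = [0] * len(query_mz)
    for start, stop in library_batches(len(library_mz), library_batch_size):
        # Only a len(query_mz) x (stop - start) mask exists at any time,
        # instead of the full len(query_mz) x len(library_mz) mask.
        mask = [
            [abs(q - library_mz[j]) <= tol for j in range(start, stop)]
            for q in query_mz
        ]
        for i, row in enumerate(mask):
            counts[i] += sum(row)
    return counts


# Full mask for the numbers quoted above, at 1 bit and at 1 byte per entry.
one_bit = mask_bytes(16_384, 4_000_000)
one_byte = mask_bytes(16_384, 4_000_000, bits_per_entry=8)
```

With a library batch of, say, 100,000 spectra, the same 16,384 queries would need a mask of about 205 MB at 1 bit per entry, at the cost of one extra loop over the library.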

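The sharding idea in the second item can be sketched in the same spirit. This is an illustration under stated assumptions, not the ANN-SoLo or any GPU library API: each shard is searched by exact brute-force dot product as a stand-in for an ANN index, the shards are searched sequentially on the CPU, and `make_shards`, `search_shard`, and `search_sharded` are hypothetical names.

```python
import heapq
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Shard = Tuple[int, List[Vector]]  # (global offset of first vector, vectors)


def make_shards(vectors: List[Vector], num_shards: int) -> List[Shard]:
    """Split the library into contiguous shards, remembering each offset."""
    size = max(1, -(-len(vectors) // num_shards))  # ceiling division
    return [(off, vectors[off:off + size]) for off in range(0, len(vectors), size)]


def search_shard(shard: Shard, query: Vector, k: int) -> List[Tuple[float, int]]:
    """Top-k (score, global_id) pairs within one shard (brute-force stand-in)."""
    offset, vectors = shard
    scored = (
        (sum(q * v for q, v in zip(query, vec)), offset + i)
        for i, vec in enumerate(vectors)
    )
    return heapq.nlargest(k, scored)


def search_sharded(shards: List[Shard], query: Vector, k: int) -> List[int]:
    """Search each shard independently, then merge the per-shard top-k."""
    hits: List[Tuple[float, int]] = []
    for shard in shards:
        # Only one shard has to be resident at a time; on a multi-GPU
        # machine each shard could instead be searched on its own device.
        hits.extend(search_shard(shard, query, k))
    return [idx for _, idx in heapq.nlargest(k, hits)]


library = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.5, 0.5], [0.8, 0.2]]
top2 = search_sharded(make_shards(library, num_shards=2), [1.0, 0.0], k=2)
```

Because every shard returns its own top k, merging the per-shard hits reproduces the global top k for an exact search. With an approximate index, the merged result is only as good as the recall of each shard.
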
bittremieux · Mar 15 '21 18:03