rapids_singlecell

Multi-GPU support with dask

Open Intron7 opened this issue 1 year ago • 1 comments

This adds dask support

Functions to add:

  • [x] calculate_qc_metrics
  • [x] normalize_total
  • [x] log1p
  • [x] highly_variable_genes with seurat and cell_ranger
  • [x] scale
  • [x] PCA
  • [ ] neighbors

Intron7 avatar Apr 25 '24 13:04 Intron7

There will be a separate PR for the update of the docstrings and a tutorial.

Intron7 avatar Oct 01 '24 08:10 Intron7

I renamed the functions for QC and renamed some of the variables so it's a bit clearer what's happening.

Intron7 avatar Nov 13 '24 11:11 Intron7

https://github.com/scverse/rapids_singlecell/pull/179/files#r1838498091 is not done and from what I can tell #179 (review) has not been addressed. What happens if you pass a csc dask array to pca?

That will just error and tell the user to pass dense or CSR as meta. I updated _check_gpu_X to reflect that.
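A minimal sketch of the kind of check described above, not the actual `_check_gpu_X` implementation. It assumes that a Dask array exposes its chunk type through `_meta` and that sparse chunk types carry a `format` string (as SciPy and CuPy sparse matrices do). The helper name `check_dask_meta` and the stand-in objects are hypothetical and only there so the example runs without a GPU:

```python
from types import SimpleNamespace

# Assumption taken from the comment above: only dense and CSR chunks are accepted.
SUPPORTED_SPARSE_FORMATS = {"csr"}


def check_dask_meta(X):
    """Hypothetical helper: reject arrays whose chunks are neither dense nor CSR."""
    # Dask arrays describe their chunk type via `_meta`; otherwise inspect X itself.
    meta = getattr(X, "_meta", X)
    # Sparse matrices carry a `format` string such as "csr" or "csc"; dense arrays do not.
    fmt = getattr(meta, "format", None)
    if fmt is not None and fmt not in SUPPORTED_SPARSE_FORMATS:
        raise TypeError(
            f"Unsupported chunk format {fmt!r}; please pass dense or CSR chunks."
        )
    return "dense" if fmt is None else fmt


# Stand-ins that mimic the two attributes the check relies on.
dense_like = SimpleNamespace(_meta=SimpleNamespace())
csr_like = SimpleNamespace(_meta=SimpleNamespace(format="csr"))
csc_like = SimpleNamespace(_meta=SimpleNamespace(format="csc"))

print(check_dask_meta(dense_like))
print(check_dask_meta(csr_like))
try:
    check_dask_meta(csc_like)
except TypeError as err:
    print(err)
```

Failing early on the meta means a CSC-backed array is rejected before any chunk is computed, which matches the "just error" behaviour described here.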

I'll test the median today.

Intron7 avatar Nov 14 '24 11:11 Intron7

We should look into the cost of allocating ahead of time for all operations that are currently in-place.

ilan-gold avatar Nov 14 '24 17:11 ilan-gold

Computing the median out of core is a bad choice. It uses way more memory and is slower. Lose-lose.

Intron7 avatar Nov 21 '24 12:11 Intron7