
Enhancement: Optimize the memory usage for calculating gene score (RP) for scATAC-seq with a large number of cells

Open taoliu opened this issue 6 years ago • 2 comments

I have to process a scATAC-seq dataset with over 48,000 cells after merging three conditions with two replicates each. I kept the top 200,000 peaks. scATAC_cellranger_count.py finishes successfully, but scATAC_genescore.py keeps getting killed, even on our HPC's high-memory node (260 GB of memory). We need to optimize the memory usage of this script. This issue will track the progress of the optimization.
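For a sense of scale, here is a rough sketch of why sizes like these are hard on memory, and of one generic way to bound the peak footprint. It is not MAESTRO's actual code: it assumes the gene score can be written as a sparse gene-by-peak weight matrix multiplied by a sparse peak-by-cell count matrix, and the names `dense_bytes` and `gene_scores_chunked` are hypothetical, introduced only for illustration.

```python
import numpy as np
from scipy import sparse


def dense_bytes(n_rows, n_cols, itemsize=8):
    """Bytes needed to hold a dense n_rows x n_cols array (float64 by default)."""
    return n_rows * n_cols * itemsize


def gene_scores_chunked(weights, counts, chunk_size=1000):
    """Hypothetical helper: compute weights @ counts one block of cells at a time.

    weights: gene-by-peak matrix (assumed form of the RP weights)
    counts:  peak-by-cell matrix
    Inputs and outputs stay sparse, so only one block of cells is
    multiplied at any moment.
    """
    weights = sparse.csr_matrix(weights)
    counts = sparse.csc_matrix(counts)  # CSC makes column (cell) slicing cheap
    blocks = []
    for start in range(0, counts.shape[1], chunk_size):
        block = weights @ counts[:, start:start + chunk_size]
        blocks.append(block.tocsr())
    # Stitch the per-chunk results back into one gene-by-cell matrix.
    return sparse.hstack(blocks, format="csr")


# A dense float64 peak-by-cell matrix at the sizes reported in this issue.
peak_by_cell_bytes = dense_bytes(200_000, 48_000)
```

At 200,000 peaks by 48,000 cells, a single dense float64 copy of the count matrix comes to 76.8 GB, so a few intermediate dense copies would be enough to exceed a 260 GB node. Whether densification is what actually happens inside scATAC_genescore.py is an assumption here, not something confirmed in this thread.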

taoliu, Feb 18 '20 21:02

Thanks, Tao! We have updated the code to improve the memory efficiency of MAESTRO. Please update and let us know if you are still encountering memory issues. We are also currently working on supporting multiple samples; I will let you know once we have finished.

chenfeiwang, Apr 14 '20 21:04

How is the memory usage now? I have not tested it with such a large number of cells.

crazyhottommy, Oct 07 '20 02:10