PrepSCTFindMarkers() is slow: "Recorrecting SCT counts using minimum median counts"

Open lwtan90 opened this issue 1 year ago • 3 comments

Hi,

I love using Seurat for all my single-cell analysis, but I can't help giving some user feedback on the PrepSCTFindMarkers() function. It is extremely slow even for an 8,000-cell dataset. Is there any way to speed this up? Thank you.

Wilson

lwtan90 avatar Aug 15 '22 23:08 lwtan90

Same issue here. FindAllMarkers takes about half an hour on a dataset of similar size, but three hours have now passed with no output other than "Recorrecting SCT counts using minimum median counts: xxxx".

Ruismart avatar Aug 16 '22 09:08 Ruismart

Thanks for the report. We are aware of this and will work on improving the speed. It can be particularly slow if there are multiple large datasets, since the model needs to regenerate corrected counts for each of them separately. One way to avoid this is to pre-calculate the minimum of the median UMI counts across datasets and pass it to SCTransform, something like SCTransform(scale_factor=5159), where 5159 is the minimum of median(object$nCount_RNA) across datasets. This way you can skip the PrepSCTFindMarkers() step.
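A minimal sketch of that pre-calculation in base R, to make the arithmetic concrete. The dataset names and UMI values here are made up for illustration; with real Seurat objects, each vector would be that object's `nCount_RNA` column. The commented-out call at the end assumes a list of Seurat objects named `objs`, which is a hypothetical name and not something from this thread.

```r
# Hypothetical total UMI counts per cell for three datasets.
# With real Seurat objects these would come from object$nCount_RNA.
umi_per_dataset <- list(
  sample_a = c(4200, 5100, 6100, 7000),
  sample_b = c(3900, 5159, 8000),
  sample_c = c(6000, 6500, 7200)
)

# Median UMI count within each dataset.
median_umi <- vapply(umi_per_dataset, median, numeric(1))

# Minimum of those medians across all datasets.
min_median_umi <- min(median_umi)
print(min_median_umi)

# Assumed usage, following the suggestion above (requires Seurat and a
# list of Seurat objects, here called `objs`):
# objs <- lapply(objs, SCTransform, scale_factor = min_median_umi)
```

Computing the value once up front means every dataset is normalized against the same scale factor from the start, which is why the later recorrection step would no longer be needed.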

saketkc avatar Aug 26 '22 15:08 saketkc

I am crying. I have got 98 datasets and 2M cells to run. I am desperate now.

realzehuali avatar Sep 05 '22 07:09 realzehuali