
GSVA calculation takes extremely long

Open jonrot1906 opened this issue 8 months ago • 3 comments

Dear @guokai8,

thanks for your great package. I am currently struggling a little to use it on my dataset, as the GSVA calculation takes extremely long. I am using a custom gene set in this structure:

GeneID | Annot
PTGS2 | Ferroptosis

And I am running these commands:

gene_set <- read.csv("gene_set.csv")
res <- scgsva(nft_ad, annot = gene_set, method = "gsva", useTerm = FALSE)

This produces the following console messages (which look fine in my opinion):

Setting parallel calculations through a MulticoreParam back-end
with workers=4 and tasks=100.
Estimating GSVA scores for 1 gene sets.
Estimating ECDFs with Poisson kernels
Estimating ECDFs in parallel on 4 cores

About 21 iterations (I assume cells) took around 12 hours. I am running this on an M1 Pro MacBook with 32 GB RAM. Do you think it will be faster once I switch to a computer with better specifications? I want to run the GSVA analysis on around 100,000 cells, which would take ages at this rate.

I am keen to get your recommendations! Thanks and best regards, Jonas

jonrot1906, Oct 25 '23 09:10

Hi @jonrot1906, I am working on the new version now and will fix this issue soon. Thanks! K

guokai8, Nov 17 '23 16:11

Hi @jonrot1906, I am now testing two approaches: (1) batch methods and (2) sampling methods. I may release the new version in a few days. Best, K

guokai8, Nov 22 '23 19:11
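
For readers following along, the two ideas mentioned above (batching and sampling) can be sketched in base R. This is only an illustration under stated assumptions, not the scGSVA implementation: the toy matrix, the gene names, and the helpers `score_cells`, `score_in_batches`, and `score_sample` are all hypothetical, and the placeholder score (mean expression of the gene set) is not GSVA.

```r
# Toy expression matrix: genes in rows, cells in columns (simulated counts).
set.seed(1)
expr <- matrix(rpois(20 * 1000, lambda = 2), nrow = 20,
               dimnames = list(paste0("gene", 1:20), paste0("cell", 1:1000)))
gene_set <- c("gene1", "gene5", "gene9")  # hypothetical gene set

# Placeholder per-cell score: mean expression of the gene set.
# It stands in for the real scoring method and is NOT GSVA.
score_cells <- function(mat, genes) {
  colMeans(mat[intersect(genes, rownames(mat)), , drop = FALSE])
}

# Approach 1: batching. Split the cell indices into fixed-size chunks,
# score each chunk separately, then concatenate the results in order.
score_in_batches <- function(mat, genes, batch_size = 200) {
  idx <- seq_len(ncol(mat))
  batches <- split(idx, ceiling(idx / batch_size))
  scores <- lapply(batches, function(cols) {
    score_cells(mat[, cols, drop = FALSE], genes)
  })
  unlist(unname(scores))
}

# Approach 2: sampling. Score only a random subset of the cells.
score_sample <- function(mat, genes, n = 100) {
  cols <- sample(ncol(mat), min(n, ncol(mat)))
  score_cells(mat[, cols, drop = FALSE], genes)
}

batched <- score_in_batches(expr, gene_set)
sampled <- score_sample(expr, gene_set, n = 100)
```

With this placeholder, batching returns the same scores as a single pass because each cell is scored independently. That need not hold for a method that relies on statistics estimated across cells, so batch-wise results from such a method should be checked against a full run on a small dataset.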

Hi @jonrot1906, the batch method is available now. You can also calculate UCell scores by setting method="UCell". I am now working on the sampling methods. K

guokai8, Nov 28 '23 21:11