scGSVA
GSVA calculation takes extremely long
Dear @guokai8,
Thanks for your great package. I am struggling to use it on my dataset because the GSVA calculation takes an extremely long time. I am using a custom gene set with this structure:
GeneID | Annot
PTGS2 | Ferroptosis
And I am running these commands:
gene_set <- read.csv("gene_set.csv")
res<-scgsva(nft_ad,annot=gene_set,method="gsva",useTerm = F)
This produces the following console messages (which look fine in my opinion):
Setting parallel calculations through a MulticoreParam back-end
with workers=4 and tasks=100.
Estimating GSVA scores for 1 gene sets.
Estimating ECDFs with Poisson kernels
Estimating ECDFs in parallel on 4 cores
About 21 iterations (I assume cells) took around 12 hours. I am running this on an M1 Pro MacBook with 32 GB RAM. Do you think it will be faster once I switch to a computer with better specifications? I want to run the GSVA analysis on around 100,000 cells, and at this rate that would take ages.
I am keen to get your recommendations! Thanks and best regards, Jonas
Hi @jonrot1906, I am working on the new version now and will fix this issue soon. Thanks! K
Hi @jonrot1906, I am now testing two approaches: (1) batch methods and (2) sampling methods. I may release the new version in a few days. Best, K
Hi @jonrot1906, the batch method is available now. You can also calculate UCell scores by setting method="UCell". I am now working on the sampling methods. K
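For readers wondering what batching means here, the general idea can be sketched in base R as below. This is only an illustration under stated assumptions, not scGSVA's actual implementation: the thread does not name the argument that enables batching in scgsva(), so none is shown. The helper score_in_batches() and the stand-in scorer toy_score() are hypothetical names made up for this example, and the toy scorer is a plain per-cell mean, not GSVA or UCell.

```r
# Hypothetical helper (not part of scGSVA): score cells in column-wise batches.
# `score_fun` stands in for any scoring call that takes a genes x cells matrix
# and returns a gene-sets x cells matrix.
score_in_batches <- function(expr, score_fun, batch_size = 1000) {
  n_cells <- ncol(expr)
  # Assign each cell (column) to a consecutive batch of at most batch_size cells.
  batch_id <- ceiling(seq_len(n_cells) / batch_size)
  batches <- split(seq_len(n_cells), batch_id)
  # Score each batch separately, so only one batch is processed at a time.
  scores <- lapply(batches, function(idx) {
    score_fun(expr[, idx, drop = FALSE])
  })
  # Re-assemble the per-batch results in the original cell order.
  do.call(cbind, unname(scores))
}

# Toy data (made up for illustration): 50 genes x 2500 cells of Poisson counts.
set.seed(1)
expr <- matrix(
  rpois(50 * 2500, lambda = 2),
  nrow = 50,
  dimnames = list(paste0("gene", 1:50), paste0("cell", 1:2500))
)

# Toy scorer (NOT GSVA or UCell): mean expression of a small gene set per cell.
gene_set <- c("gene1", "gene2", "gene3")
toy_score <- function(m) {
  matrix(
    colMeans(m[gene_set, , drop = FALSE]),
    nrow = 1,
    dimnames = list("toy_set", colnames(m))
  )
}

batched <- score_in_batches(expr, toy_score, batch_size = 1000)
dim(batched)
```

One caveat: for a scorer that looks at each cell independently, as the toy scorer does and as UCell's per-cell rank-based score does, batching leaves the result unchanged. GSVA estimates per-gene distributions across the samples it is given, so GSVA scores computed per batch can differ from scores computed on all cells at once.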