
Rvtests SKAT,SKATO keeps on running after removing one gene from '--gene' list

Open ChakrabortyShreya opened this issue 2 years ago • 0 comments

Hi,

I am trying to run SKAT and SKATO for 20 genes with 105977 samples using the following syntax:

rvtest --inVcf input.vcf.gz --pheno All_pheno.ped --pheno-name A --covar All_pheno.ped --covar-name B,C,D,E --inverseNormal --useResidualAsPhenotype --impute hwe --geneFile ../../refFlat_hg38.txt.gz --gene GENE1,GENE2,...,GENE20 --kernel skat,skato --out Test

Previously, when I ran the test on 21 genes, the analysis completed in 963 seconds. I then removed one of the 21 genes from the comma-separated gene list and, keeping all other inputs the same, reran the analysis. This time the program has been running for more than 7 days on our HPC, and the output files are completely empty. I am running the code on a node with no time limit, 72 cores with multi-threading, 187 GB RAM, and 99 GB of swap memory. I can see that the program is using one core almost entirely (99-100% CPU).

The terminal shows this:

[INFO] DONE: Fit model [ phenotype ~ 1 + covariates ] and model residuals will be used as responses
[INFO] Now applying inverse normalization transformation
[INFO] DONE: inverse normalization transformation finished
[INFO] Analysis begins with [ 105977 ] samples...
[INFO] SKAT test significance will be evaluated using 10000 permutations at alpha = 0.05 weight = Beta[beta1 = 1.00, beta2 = 25.00]
[INFO] SKAT-O test significance will be evaluated using weight = Beta[beta1 = 1.00, beta2 = 25.00]
[INFO] Loaded [ 20 ] genes.
[INFO] Impute missing genotype by HWE
[INFO] Analysis started

Can you please suggest what could cause this issue when it ran fine for 21 genes?
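
One way to narrow this down might be to run each gene on its own under a time limit and see which one stalls. The following is only a minimal sketch, not a confirmed fix: it reuses the options from the command above (written with the double-dash `--inVcf` form), and the `timeout` utility, the 1-hour limit, the `Test_<gene>` output prefix, the `DRY_RUN` switch, and the placeholder gene names are all assumptions added for illustration.

```shell
#!/usr/bin/env bash
# Sketch: run rvtest once per gene so a stalling gene can be identified.
# Assumptions (not from the original report): GNU `timeout` is installed,
# the 1h limit, the per-gene output prefix, and the DRY_RUN switch.

GENES="GENE1,GENE2"        # placeholder names; substitute the real comma-separated list
LIMIT="1h"                 # hypothetical per-gene time limit
DRY_RUN="${DRY_RUN:-1}"    # 1 = only print the commands, 0 = actually run them

# Print the rvtest command line for a single gene.
build_cmd() {
  local gene="$1"
  echo rvtest --inVcf input.vcf.gz --pheno All_pheno.ped --pheno-name A \
       --covar All_pheno.ped --covar-name B,C,D,E --inverseNormal \
       --useResidualAsPhenotype --impute hwe \
       --geneFile ../../refFlat_hg38.txt.gz --gene "$gene" \
       --kernel skat,skato --out "Test_${gene}"
}

# Loop over the genes, printing or running one command per gene.
run_all() {
  local genes gene cmd
  IFS=',' read -r -a genes <<< "$GENES"
  for gene in "${genes[@]}"; do
    if [ "$DRY_RUN" = "1" ]; then
      build_cmd "$gene"
    else
      # Split the printed command on whitespace into an argument array.
      read -r -a cmd <<< "$(build_cmd "$gene")"
      # GNU timeout exits with status 124 when the limit is reached.
      timeout "$LIMIT" "${cmd[@]}" || echo "gene ${gene}: exit status $?"
    fi
  done
}

run_all
```

With `DRY_RUN=1` (the default here) the script only prints one command per gene, so the command lines can be checked before anything is submitted to the cluster. Setting `DRY_RUN=0` runs them one after another; a gene reported with exit status 124 would be the one that hit the time limit.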

ChakrabortyShreya, Mar 25 '22 07:03