rvtests

rare variant analysis

Open sbaheti opened this issue 5 years ago • 14 comments

hi

I am having difficulty running the tool and getting any output. The analysis skips all the variants for a gene and doesn't produce any results. Do you know what is wrong with the input?

Thanks !

log file:

[INFO] Program version: 20190205
[INFO] Git Version: c86e589efef15382603300dc7f4c3394c82d69b8
[INFO] Parameters BEGIN
ParameterList created by m078940 on mforgehn2 at Thu Mar 7 13:05:43 2019
--inVcf "variants.vcf.gz" --out "out" --pheno "../../../covariate.file.tsv" --burden "cmc" --geneFile "Homo_sapiens.GRCh38.78.mod.GGPS.refFlat" --gene "WASH7P"
[INFO] Parameters END
[INFO] Analysis started at: Thu Mar 7 13:05:43 2019
[INFO] Loaded [ 26 ] samples from genotype files
[INFO] Loaded [ 26 ] sample phenotypes
[INFO] Loaded 0 male, 0 female and 26 sex-unknown samples from ../../../covariate.file.tsv
[INFO] Loaded 18 cases, 8 controls, and 0 missing phenotypes
[WARN] -- Enabling binary phenotype mode --
[INFO] Analysis begins with [ 26 ] samples...
[INFO] Loaded [ 1 ] genes.
[INFO] Impute missing genotype to mean (by default)
[INFO] Analysis started
[INFO] Gene WASH7P has 0 variants, skipping
[INFO] Analyzed [ 0 ] variants from [ 1 ] genes/regions
[INFO] Analysis ends at: Thu Mar 7 13:05:44 2019
[INFO] Analysis took 1 seconds

sbaheti avatar Mar 07 '19 19:03 sbaheti

Can you verify that the variants of the gene exist in the vcf file?
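One way to run that check outside of rvtests is sketched below. It is only a rough text-level comparison, not how rvtests itself matches variants to genes: it assumes a whitespace-separated refFlat file with the gene name in column 1, the chromosome in column 3 and txStart/txEnd in columns 5 and 6, and it assumes the UCSC convention of a 0-based start. The function names `gene_ranges` and `count_variants_in_gene` are made up for this example.

```python
import gzip


def gene_ranges(refflat_path, gene):
    """Return (chrom, txStart, txEnd) for every refFlat row of `gene`."""
    ranges = []
    with open(refflat_path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) >= 6 and fields[0] == gene:
                ranges.append((fields[2], int(fields[4]), int(fields[5])))
    return ranges


def count_variants_in_gene(vcf_gz_path, ranges):
    """Count VCF records whose CHROM and POS fall inside any range.

    Chromosome names are compared as plain strings, so "chr3" and "3"
    do not match. Assumes a 0-based txStart (UCSC convention), hence
    the strict "<" on the left side.
    """
    count = 0
    with gzip.open(vcf_gz_path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip meta and header lines
            chrom, pos = line.split("\t", 2)[:2]
            pos = int(pos)
            if any(c == chrom and start < pos <= end
                   for c, start, end in ranges):
                count += 1
    return count


if __name__ == "__main__":
    # File and gene names taken from the reproduction example in this thread.
    ranges = gene_ranges("refFlat.txt", "SLMAP")
    print(ranges)
    print(count_variants_in_gene("test.vcf.gz", ranges))
```

If this prints a non-zero count while rvtests still reports 0 variants for the gene, the positions themselves are probably fine and the problem lies in how the files are read.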

zhanxw avatar Mar 07 '19 23:03 zhanxw

Yes, the variants exist. I also tried it without specifying any gene, and the log file records that all genes are skipped because they have no variants. I'm not sure what the issue is.

sbaheti avatar Mar 08 '19 15:03 sbaheti

Do you have any other suggestions I can try to make it work?

sbaheti avatar Mar 25 '19 13:03 sbaheti

I tried it again with a different test statistic, but I get the same result: all the variants are skipped, and this time I didn't specify any gene. Am I missing a parameter?

[INFO] Program version: 20190205
[INFO] Git Version: c86e589efef15382603300dc7f4c3394c82d69b8
[INFO] Parameters BEGIN
ParameterList Mon Mar 25 09:45:20 2019
--inVcf "variants.vcf.gz" --out "out9" --pheno "covariate.file.tsv" --burden "zeggini" --geneFile "Homo_sapiens.GRCh38.78.mod.GGPS.refFlat"
[INFO] Parameters END
[INFO] Analysis started at: Mon Mar 25 09:45:20 2019
[INFO] Loaded [ 26 ] samples from genotype files
[INFO] Loaded [ 26 ] sample phenotypes
[INFO] Loaded 0 male, 0 female and 26 sex-unknown samples from covariate.file.tsv
[INFO] Loaded 18 cases, 8 controls, and 0 missing phenotypes
[WARN] -- Enabling binary phenotype mode --
[INFO] Analysis begins with [ 26 ] samples...
[INFO] Loaded [ 58157 ] genes.
[INFO] Impute missing genotype to mean (by default)
[INFO] Analysis started
[INFO] Gene DDX11L1 has 0 variants, skipping
[INFO] Gene WASH7P has 0 variants, skipping
......

sbaheti avatar Mar 25 '19 14:03 sbaheti

Hello @zhanxw, I am getting exactly the same error as @sbaheti. Here is an example to reproduce the error:

refFlat.txt

SLMAP	NM_007159	chr3	+	57757255	57930016	57757651	57927388	21	57757255,57831382,57841298,57847196,57849753,57857732,57858087,57860698,57861948,57864547,57871635,57890040,57896510,57896872,57907883,57909075,57912380,57913157,57916905,57922888,57927295,	57757849,57831530,57841371,57847233,57849816,57857828,57858159,57860839,57862086,57864716,57871698,57890100,57896591,57896932,57908006,57909150,57912701,57913275,57917077,57923023,57930016,
SLMAP	NM_001304422	chr3	+	57889913	57930016	57896880	57927388	10	57889913,57896510,57896872,57907883,57909075,57912380,57913157,57916905,57922888,57927295,	57890100,57896591,57896932,57908006,57909150,57912701,57913275,57917077,57923023,57930016,
SLMAP	NM_001304421	chr3	+	57757255	57930016	57757651	57927388	20	57757255,57831382,57841298,57847196,57849753,57857732,57858087,57860698,57861948,57864547,57890040,57896510,57896872,57907883,57909075,57912380,57913157,57916905,57922888,57927295,	57757849,57831530,57841371,57847233,57849816,57857828,57858159,57860839,57862086,57864716,57890100,57896591,57896932,57908006,57909150,57912701,57913275,57917077,57923023,57930016,
SLMAP	NM_001304423	chr3	+	57889913	57930016	57896880	57927388	8	57889913,57896510,57896872,57912380,57913157,57916905,57922888,57927295,	57890100,57896591,57896932,57912701,57913275,57917077,57923023,57930016,
SLMAP	NM_001304420	chr3	+	57757255	57930016	57757651	57927388	22	57757255,57831382,57841298,57847196,57849753,57857732,57858087,57860698,57861948,57864547,57864806,57871635,57890040,57896510,57896872,57907883,57909075,57912380,57913157,57916905,57922888,57927295,	57757849,57831530,57841371,57847233,57849816,57857828,57858159,57860839,57862086,57864716,57864857,57871698,57890100,57896591,57896932,57908006,57909150,57912701,57913275,57917077,57923023,57930016,
SLMAP	NM_001311178	chr3	+	57889913	57930016	57896880	57925928	11	57889913,57896510,57896872,57907883,57909075,57912380,57913157,57916905,57922888,57925844,57927295,	57890100,57896591,57896932,57908006,57909150,57912701,57913275,57917077,57923023,57925934,57930016,
SLMAP	NM_001311179	chr3	+	57889913	57918482	57896880	57917155	7	57889913,57896510,57896872,57909075,57912380,57913157,57916905,	57890100,57896591,57896932,57909150,57912701,57913275,57918482,

test.vcf.gz

#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT	S1	S2	S3	S4	S5	S6	S7	S8	S9	S10	S11	S12
chr3	57757839	.	A	G	367.99	PASS	AC=1;AF=0.042;AN=24;BaseQRankSum=-2.812;ClippingRankSum=0;DP=412;ExcessHet=3.0103;FS=4.01;InbreedingCoeff=-0.0435;MLEAC=1;MLEAF=0.042;MQ=60;MQRankSum=0;QD=17.52;ReadPosRankSum=-1.469	GT:AD:DP:GQ:PL	0/1:7,14:21:99:403,0,187	0/0:41,1:42:99:0,114,1465	0/0:38,0:38:99:0,102,1530	0/0:33,0:33:90:0,90,1350	0/0:33,0:33:81:0,81,1215	0/0:30,0:30:78:0,78,1170	0/0:42,0:42:99:0,120,1800	0/0:43,0:43:99:0,117,1755	0/0:29,0:29:81:0,81,1215	0/0:30,0:30:87:0,87,1305	0/0:37,0:37:99:0,105,1575	0/0:34,0:34:87:0,87,1305
chr3	57896989	rs762608342	A	G	304.99	PASS	AC=1;AF=0.042;AN=24;BaseQRankSum=-3.955;ClippingRankSum=0;DB;DP=433;ExcessHet=3.0103;FS=0;InbreedingCoeff=-0.0435;MLEAC=1;MLEAF=0.042;MQ=60;MQRankSum=0;QD=10.52;ReadPosRankSum=0.321	GT:AD:DP:GQ:PL	0/0:36,0:36:99:0,105,1575	0/0:42,0:42:99:0,117,1755	0/0:32,0:32:81:0,81,1215	0/0:43,0:43:99:0,120,1800	0/0:37,0:37:96:0,96,1440	0/0:37,0:37:99:0,108,1620	0/0:45,0:45:99:0,120,1800	0/0:24,0:24:66:0,66,990	0/1:19,14:33:99:332,0,594	0/0:52,0:52:99:0,120,1800	0/0:34,1:35:89:0,89,1237	0/0:41,0:41:99:0,111,1665

test.ped

fid     iid     father_id       mother_id       sex     pheno
A       S1      NA              NA              1       2
A       S2      NA              NA              2       1
A       S3      NA              NA              1       1
A       S4      NA              NA              1       1
B       S5      NA              NA              1       1
B       S6      NA              NA              1       2
B       S7      NA              NA              2       1
C       S8      NA              NA              1       2
C       S9      NA              NA              2       1
C       S10     NA              NA              2       1
C       S11     NA              NA              2       2
C       S12     NA              NA              2       2

Command

/path/to/rvtests_v2.1.0/executable/rvtest --inVcf test.vcf.gz \
        --pheno test.ped --pheno-name pheno \
        --out results --geneFile refFlat.txt --burden cmc,cmcWald,zeggini,zegginiWald --vt price --kernel skat,kbac --gene SLMAP

RVTEST log

Retrieve remote version failed, use '--noweb' to skip.
[INFO]	Program version: 20190205
[INFO]	Analysis started at: Thu Apr 11 15:44:48 2019
[INFO]	Loaded [ 12 ] samples from genotype files
[INFO]	Loaded [ 12 ] sample phenotypes
[INFO]	Loaded 6 male, 6 female and 0 sex-unknown samples from test.ped
[INFO]	Loaded 5 cases, 7 controls, and 0 missing phenotypes
[WARN]	-- Enabling binary phenotype mode -- 
[INFO]	Analysis begins with [ 12 ] samples...
[INFO]	Price's VT test significance will be evaluated using 10000 permutations at alpha = 0.05
[INFO]	SKAT test significance will be evaluated using 10000 permutations at alpha = 0.05 weight = Beta[beta1 = 1.00, beta2 = 25.00]
[INFO]	KBAC test significance will be evaluated using 10000 permutations at alpha = 0.05
[INFO]	Loaded [ 1 ] genes.
[INFO]	Impute missing genotype to mean (by default)
[INFO]	Analysis started
[INFO]	Gene SLMAP has 0 variants, skipping
[INFO]	Analyzed [ 0 ] variants from [ 1 ] genes/regions
[INFO]	Analysis ends at: Thu Apr 11 15:44:48 2019
[INFO]	Analysis took 0 seconds
RVTESTS finished successfully

matmu avatar Apr 11 '19 13:04 matmu

Hello @zhanxw, are there any updates on this error? I have the same issue and wasn't sure whether it was ever resolved. Thank you.

erampersaud avatar May 29 '19 20:05 erampersaud

I am facing exactly the same issue. Does anyone have any advice?

ayub1985 avatar Feb 07 '20 03:02 ayub1985

@matmu (https://github.com/matmu), regarding https://github.com/zhanxw/rvtests/issues/80#issuecomment-482121514: did you manage to resolve this?

ayub1985 avatar Feb 07 '20 03:02 ayub1985

@ayub1985 I found out that the program cannot handle VCF files properly if the variants' chromosome names have a "chr" prefix (for example "chr3" instead of "3").
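For anyone who wants to try this workaround, here is a minimal sketch that rewrites a gzipped VCF without the "chr" prefix. The helper names `strip_chr_prefix` and `rewrite_vcf` and the output file name are made up for this example, and it only touches the CHROM column and `##contig` header lines; other places where chromosome names may appear (for instance inside INFO fields) are left alone.

```python
import gzip

CONTIG_PREFIX = "##contig=<ID=chr"


def strip_chr_prefix(line):
    """Drop a leading 'chr' from the chromosome name of one VCF line."""
    if line.startswith(CONTIG_PREFIX):
        # Keep contig declarations in the header consistent with the records.
        return line.replace(CONTIG_PREFIX, "##contig=<ID=", 1)
    if line.startswith("#"):
        return line  # other meta lines and the #CHROM header stay as they are
    if line.startswith("chr"):
        return line[3:]  # a data line starts with the CHROM column
    return line


def rewrite_vcf(in_gz_path, out_path):
    """Write an uncompressed copy of the VCF with 'chr' prefixes removed."""
    with gzip.open(in_gz_path, "rt") as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(strip_chr_prefix(line))


if __name__ == "__main__":
    # Input name from the example above; the output name is hypothetical.
    rewrite_vcf("test.vcf.gz", "test.nochr.vcf")
```

The output is plain text on purpose. As far as I understand, rvtests expects a bgzip-compressed, tabix-indexed VCF, so the result would still need `bgzip test.nochr.vcf` followed by `tabix -p vcf test.nochr.vcf.gz` before being passed to `--inVcf`.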

matmu avatar Mar 19 '20 17:03 matmu

Hi there,

I am having the same problem, and I have already deleted the "chr" part. Do you have any other suggestions?

Thank you in advance

anna-555 avatar Jan 28 '21 12:01 anna-555

Hi, I'm having the same problem.

Cross-posted on biostars: https://www.biostars.org/p/9504551/

edit: removing the chr prefix fixed the problem.

lindenb avatar Jan 05 '22 20:01 lindenb

I have removed "chr" from the refFlat file and from the VCF, but I am still getting 0 variants for all genes.

jfertaj avatar Jan 25 '22 23:01 jfertaj

You should remove the 'chr' from the VCF files, not from the refFlat file.
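To see which naming each file currently uses before editing anything, something like the following could help. It is just a sketch: `chrom_names` is a made-up helper, and the column indices assume a standard VCF (chromosome in column 1) and a refFlat file (chromosome in column 3).

```python
import gzip


def chrom_names(path, column, gzipped=False):
    """Collect the distinct values found in one whitespace-separated column."""
    opener = gzip.open if gzipped else open
    names = set()
    with opener(path, "rt") as handle:
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue  # skip VCF header lines and blank lines
            names.add(line.split()[column])
    return names


if __name__ == "__main__":
    # File names from the reproduction example earlier in this thread.
    print("VCF:    ", sorted(chrom_names("test.vcf.gz", 0, gzipped=True)))
    print("refFlat:", sorted(chrom_names("refFlat.txt", 2)))
```

Going by the advice above, the VCF side should show names without "chr" while the refFlat file is left as it was.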

Zekexie avatar Aug 16 '23 15:08 Zekexie
