varianttools
How to export FORMAT fields in vcf format
I have been running vtools on a VCF file containing 3 WES samples (after GATK).
Command:
- I used this import command:
vtools import ../../../align/recalibrated_variants_.vcf --var_info DP filter info --geno_info DP_geno --build hg19
- I used this export command:
alt dbSNP.name refGene.name2 refGene.name dbNSFP.SIFT_score
dbNSFP.Polyphen2_HDIV_score Polyphen2_HDIV_pred Polyphen2_HVAR_score
Polyphen2_HVAR_pred dbSNP.func kgDesc --order_by chr pos --header chr
pos ref alt rsname gene 'refgene name' 'SIFT score' 'Polyphen2 HDIV
score' 'Polyphen2 HDIV pred' 'Polyphen2 HVAR score' 'Polyphen2 HVAR
pred' 'dbSNP func code' 'pathway' '%(sample_names)s' 'DP_geno'
--output NS.csv
head NS.csv
chr,pos,ref,alt,rsname,gene,refgene name,SIFT score,Polyphen2 HDIV
score,Polyphen2 HDIV pred,Polyphen2 HVAR score,Polyphen2 HVAR
pred,dbSNP func code,pathway,SRR925784,SRR925788,SRR925803,DP_geno
1,723819,T,A,rs11804171,,,,,,,,unknown,,NA,0,2
The output has no field with the depth of each sample, even though I imported the DP_geno field from the VCF file with --geno_info. Can you please recommend how to get the FORMAT fields (GT:AD:DP:GQ:PGT:PID:PL) of every sample? I need the AD and DP values of every sample.
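For reference, the per-sample AD and DP values can also be read straight from the original VCF, independent of vtools. The sketch below is only an illustration under the standard VCF layout (FORMAT keys in the 9th column, one column per sample after it); the function name parse_sample_fields and the record are made up for the example and are not taken from the file in this report.

```python
def parse_sample_fields(vcf_line, wanted=("AD", "DP")):
    """Return one {key: value} dict per sample for the requested FORMAT keys."""
    cols = vcf_line.rstrip("\n").split("\t")
    format_keys = cols[8].split(":")  # e.g. GT:AD:DP:GQ:PL
    per_sample = []
    for sample in cols[9:]:
        values = sample.split(":")
        # VCF allows trailing fields to be dropped; zip stops at the shorter list.
        record = dict(zip(format_keys, values))
        # Keys that are absent for this sample are reported as "." (missing).
        per_sample.append({key: record.get(key, ".") for key in wanted})
    return per_sample


# Made-up record for illustration only.
line = (
    "1\t723819\trs11804171\tT\tA\t50\tPASS\tDP=30\tGT:AD:DP:GQ:PL"
    "\t0/0:10,0:10:30:0,30,300\t0/1:6,5:11:99:120,0,150\t./."
)
for fields in parse_sample_fields(line):
    print(fields)
```

Each printed dict corresponds to one sample column, in the order the samples appear in the VCF header.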
There is an example in the export documentation (http://varianttools.sourceforge.net/Vtools/Export#toc6) on how to export genotype info. The command used was
% vtools export variant --samples 'sample_name like "NA128%"'\
--geno_info DP_geno --format_string 'GT:DP' -o my.vcf
Please let me know if --geno_info works for you. Note that the export command is designed to output a small number of variants (thousands?), and exporting all variants from WES data tends to be extremely slow.
Hello,
Thank you for your answer,
When I run the command:
vtools export variant --samples 'sample_name like "SRR%"' --geno_info DP_geno --format_string 'GT:AD:DP' --fields variant.chr variant.pos variant.ref variant.alt dbSNP.name clinvar.COMMON refGene.name2 refGene.name dbNSFP.SIFT_score dbNSFP.Polyphen2_HDIV_score dbNSFP.Polyphen2_HDIV_pred dbNSFP.Polyphen2_HVAR_score dbNSFP.Polyphen2_HVAR_pred clinvar.CLNSIG clinvar.CLNSRC clinvar.CLNDBN keggPathway.kgDesc --order_by chr pos --header chr pos ref alt rsname 'Common SNP' gene 'gene name' 'SIFT score' 'Polyphen2 HDIV score' 'Polyphen2 HDIV pred' 'Polylyphen2 HVAR score' 'Polyphen2 HVAR pred' 'dbSNP clinical sig' 'dbSNP clinical source' 'dbSNP disease' 'pathway' "%(sample_names)s" -o my.vcf
I get an error:
vtools CMD --format vcf.fmt: error: unrecognized arguments: --fields variant.chr variant.pos variant.ref variant.alt dbSNP.name clinvar.COMMON refGene.name2 refGene.name dbNSFP.SIFT_score dbNSFP.Polyphen2_HDIV_score dbNSFP.Polyphen2_HDIV_pred dbNSFP.Polyphen2_HVAR_score dbNSFP.Polyphen2_HVAR_pred clinvar.CLNSIG clinvar.CLNSRC clinvar.CLNDBN keggPathway.kgDesc --order_by chr pos
But when I run it as two separate commands and then combine the outputs, it runs OK:
- vtools export variant --samples 'sample_name like "SRR%"' --geno_info DP_geno --format_string 'GT:AD:DP' -o my.vcf
- vtools export variant --samples 1 --format csv --fields variant.chr variant.pos variant.ref variant.alt dbSNP.name clinvar.COMMON refGene.name2 refGene.name dbNSFP.SIFT_score dbNSFP.Polyphen2_HDIV_score dbNSFP.Polyphen2_HDIV_pred dbNSFP.Polyphen2_HVAR_score dbNSFP.Polyphen2_HVAR_pred clinvar.CLNSIG clinvar.CLNSRC clinvar.CLNDBN keggPathway.kgDesc --order_by chr pos --header chr pos ref alt rsname 'Common SNP' gene 'gene name' 'SIFT score' 'Polyphen2 HDIV score' 'Polyphen2 HDIV pred' 'Polylyphen2 HVAR score' 'Polyphen2 HVAR pred' 'dbSNP clinical sig' 'dbSNP clinical source' 'dbSNP disease' 'pathway' "%(sample_names)s" -o my2.vcf
Is there a way to run a single vtools export variant command that produces the desired output file, with both the genotype information of each sample from the VCF file and the annotations?
Thank you, Pola
This looks like a bug. I will have a look.
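In the meantime, the two-command workaround described above can be combined outside vtools. The following is a rough sketch, not part of vtools: it joins a CSV annotation export to a VCF genotype export on (chr, pos, ref, alt). It assumes the CSV has a header row with columns named chr, pos, ref and alt (as set by --header in the command above) and that both files spell chromosomes and alleles identically, which may not hold for indels. The function name merge_outputs and the demo data are hypothetical.

```python
import csv
import io


def merge_outputs(vcf_text, csv_text):
    """Join annotation rows (CSV) to VCF records on (chr, pos, ref, alt)."""
    annotations = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        annotations[(row["chr"], row["pos"], row["ref"], row["alt"])] = row

    merged = []
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):  # skip meta and header lines
            continue
        cols = line.split("\t")
        key = (cols[0], cols[1], cols[3], cols[4])  # CHROM, POS, REF, ALT
        annot = annotations.get(key, {})  # empty if the variant has no CSV row
        merged.append({
            "chr": cols[0], "pos": cols[1], "ref": cols[3], "alt": cols[4],
            "gene": annot.get("gene", ""),
            "format": cols[8],
            "samples": cols[9:],
        })
    return merged


# Made-up inputs for illustration only.
csv_demo = "chr,pos,ref,alt,rsname,gene\n1,1000,T,A,rs0000,GENE1\n"
vcf_demo = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\n"
    "1\t1000\t.\tT\tA\t.\tPASS\t.\tGT:AD:DP\t0/1:6,5:11\n"
)
for record in merge_outputs(vcf_demo, csv_demo):
    print(record)
```

With real files, the two strings would be read from my.vcf and my2.vcf; only the gene column is carried over here, and the other annotation columns can be added to the merged dict in the same way.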