varianttools
Stop combining variants in `vtools export`
I have these two variants in my vtools variant database:
4 106156653 T C Scan1,Scan2 ....,.,.,.,....
4 106156653 T G Scan1,Scan2 ....,.,.,.,....
When I export them to VCF with the following command
vtools export variant --format $SCRIPTS/myvcf.fmt --header CHROM POS ID REF ALT QUAL FILTER INFO --var_info callers genotypes --output ./Variants_raw.vcf
these variants are combined into a multi-allelic entry like this:
4 106156653 . T C,G . PASS callers=[u'Scan1|Scan2', u'Scan1|Scan2'];genotypes=[u'....|.|.|.|....', u'....|.|.|.|....']
This is a problem: further processing gets corrupted by the multi-allelic variant (MAV), and the bracketed [] arrays are difficult to parse. I would prefer the output to contain just one line per variant, just as it would be done via vtools export.
Surely there is a nice little workaround for this, but I have not been able to find it yet.
Could you help me once again?
Changing
export_by=chr,%(pos)s,%(ref)s
to
export_by=chr,%(pos)s,%(ref)s,%(alt)s
in vcf.fmt
[format description]
description=Import vcf
variant=chr,%(pos)s,%(ref)s,%(alt)s
genotype=%(geno)s
variant_info=%(var_info)s
genotype_info=%(geno_info)s
# variants with identical chr,pos,ref will be collapsed.
export_by=chr,%(pos)s,%(ref)s
should solve the problem.
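To illustrate why the extra key field helps, here is a minimal Python sketch of the grouping idea. It is not vtools' actual export code: the `export_lines` helper and the tuple layout are hypothetical, and it only assumes that records sharing the same `export_by` key end up on one output line.

```python
from itertools import groupby

# The two variants from the question, as (chr, pos, ref, alt) tuples.
variants = [
    ("4", 106156653, "T", "C"),
    ("4", 106156653, "T", "G"),
]


def export_lines(records, key_fields):
    """Hypothetical helper: emit one line per group of records that
    share the same values in the fields listed in key_fields."""
    def key(record):
        return tuple(record[i] for i in key_fields)

    lines = []
    # groupby only merges adjacent items, so sort by the same key first.
    for _, group in groupby(sorted(records, key=key), key=key):
        group = list(group)
        chrom, pos, ref, _ = group[0]
        # All alternative alleles in the group are joined with commas.
        alts = ",".join(record[3] for record in group)
        lines.append(f"{chrom}\t{pos}\t.\t{ref}\t{alts}")
    return lines


# Key without alt (like export_by=chr,pos,ref): both variants collapse.
merged = export_lines(variants, key_fields=(0, 1, 2))
# Key with alt (like export_by=chr,pos,ref,alt): one line per variant.
split = export_lines(variants, key_fields=(0, 1, 2, 3))

for line in merged + split:
    print(line)
```

With the key limited to chr, pos and ref, the sketch produces a single line with `C,G` as the alternative alleles. Once alt is part of the key, each variant gets its own line.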