graphtyper
Variant INFO/END position is before POS
Graphtyper can create variants where the END position is lower than POS. This results in a warning from bcftools concat:
Concatenating ./chr10/010000001-011000000.vcf.gz
[W::vcf_parse] INFO/END=9321046 is smaller than POS at chr10:9321047
and an error from bcftools index:
[E::hts_idx_push] Invalid record on sequence #4: end 140362273 < begin 140362275
index: failed to create index for "1000G_manta.diploidSV_graphtyper_test.vcf.gz"
The two problematic records from the graphtyper output VCF:
chr4 140362275 chr4:140362275:OG TA G]chr11:131789668] 0 LowQUAL ...END=140362273;...
chr10 9321047 chr10:9321047:OG T G]chr8:135381059] 0 LowQUAL ...END=9321046;...
These variants were not in the svimmer output passed to graphtyper; they were newly created by graphtyper.
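To locate such records before bcftools index rejects the file, a short scan that compares INFO/END to POS can help. The following is a minimal Python sketch, not part of graphtyper or bcftools. The function name find_end_before_pos is hypothetical, the input is assumed to be uncompressed tab-separated VCF text, and the example records are abbreviated from the ones above with INFO reduced to the END key only.

```python
import re

# Matches END=<int> as a whole INFO key (at the start of INFO or after ';').
END_RE = re.compile(r"(?:^|;)END=(\d+)(?:;|$)")


def find_end_before_pos(vcf_lines):
    """Return (chrom, pos, end) for every record whose INFO/END is below POS."""
    bad = []
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a full VCF record
        chrom, pos, info = fields[0], int(fields[1]), fields[7]
        match = END_RE.search(info)
        if match and int(match.group(1)) < pos:
            bad.append((chrom, pos, int(match.group(1))))
    return bad


# Records abbreviated from the report; the rest of INFO is omitted (assumption).
example = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr4\t140362275\tchr4:140362275:OG\tTA\tG]chr11:131789668]\t0\tLowQUAL\tEND=140362273",
    "chr10\t9321047\tchr10:9321047:OG\tT\tG]chr8:135381059]\t0\tLowQUAL\tEND=9321046",
]

for chrom, pos, end in find_end_before_pos(example):
    print(f"{chrom}:{pos} END={end}")
```

For a bgzipped VCF, the same function could be fed lines from gzip.open(path, "rt"), since bgzip output is readable as gzip.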
Thanks for reporting. Could you provide me with a reference and the input VCF around the variant so I can reproduce the problem? The variants must be in the svimmer output; graphtyper doesn't create new structural variants.
Correction: sorry, you're right, they were in the svimmer output. Three examples below. It looks like graphtyper has trimmed the REF/ALT alleles and increased POS accordingly, but then set INFO/END equal to the original input POS.
svimmer output, graphtyper input:
#CHROM POS ID REF ALT QUAL FILTER INFO
chr4 140362273 . C CTG]chr11:131789668] 0 . SVTYPE=BND;...SVINSLEN=2;...
chr9 35999303 . T TATAA]chr6:148738900] 0 . SVTYPE=BND;...SVINSLEN=4;...
chr10 9321046 . T TG]chr8:135381059] 0 . SVTYPE=BND;...SVINSLEN=1;...
graphtyper output with END < POS:
#CHROM POS ID REF ALT QUAL FILTER INFO
chr4 140362275 chr4:140362275:OG TA G]chr11:131789668] 0 LowQUAL ...END=140362273;...
chr9 35999307 chr9:35999307:OG A A]chr6:148738900] 0 LowQD;LowQUAL ...END=35999303;...
chr10 9321047 chr10:9321047:OG T G]chr8:135381059] 0 LowQUAL ...END=9321046;...
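The pattern in the two tables can be checked mechanically. The sketch below is only an illustration in Python: the positions, END values, and SVINSLEN values are copied from the records above, and the helper name compare is hypothetical. It pairs each svimmer record with its graphtyper counterpart and reports how far POS moved and whether END stayed at the input POS. In these three records the POS shift also happens to equal SVINSLEN; that is an observation on this small sample and does not by itself establish the cause.

```python
# (chrom, input POS, SVINSLEN) from the svimmer records
svimmer = [
    ("chr4", 140362273, 2),
    ("chr9", 35999303, 4),
    ("chr10", 9321046, 1),
]

# (chrom, output POS, output INFO/END) from the graphtyper records
graphtyper = [
    ("chr4", 140362275, 140362273),
    ("chr9", 35999307, 35999303),
    ("chr10", 9321047, 9321046),
]


def compare(inputs, outputs):
    """Pair records by order and report how POS and END moved."""
    rows = []
    for (chrom, in_pos, svinslen), (out_chrom, out_pos, out_end) in zip(inputs, outputs):
        if chrom != out_chrom:
            raise ValueError(f"record mismatch: {chrom} vs {out_chrom}")
        shift = out_pos - in_pos
        rows.append({
            "chrom": chrom,
            "pos_shift": shift,
            "end_equals_input_pos": out_end == in_pos,
            "shift_equals_svinslen": shift == svinslen,
        })
    return rows


rows = compare(svimmer, graphtyper)
for row in rows:
    print(row)
```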
I am able to reproduce the problem, thanks for reporting. Hopefully I can push a fix for it soon.
Best, Hannes