Variant INFO/END position is before POS

Open seboyden opened this issue 3 years ago • 3 comments

Graphtyper can create variants where the END position is lower than POS. This results in a warning from bcftools concat:

Concatenating ./chr10/010000001-011000000.vcf.gz[W::vcf_parse]
INFO/END=9321046 is smaller than POS at chr10:9321047

and an error from bcftools index:

[E::hts_idx_push] Invalid record on sequence #4: end 140362273 < begin 140362275
index: failed to create index for "1000G_manta.diploidSV_graphtyper_test.vcf.gz"

The 2 problematic examples from the graphtyper output VCF:

chr4  140362275  chr4:140362275:OG  TA  G]chr11:131789668]  0  LowQUAL  ...END=140362273;...
chr10   9321047  chr10:9321047:OG   T   G]chr8:135381059]   0  LowQUAL  ...END=9321046;...

These variants were not in the svimmer output passed to graphtyper; they were newly created by graphtyper.

seboyden avatar Aug 17 '20 19:08 seboyden

Thanks for reporting. Could you provide a reference and the input VCF around the variant so I can reproduce the problem? The variants must be in the svimmer output - graphtyper doesn't create new structural variants.

hannespetur avatar Aug 18 '20 08:08 hannespetur

Correction: sorry, you're right, they were in the svimmer output. Three examples below. It looks like graphtyper has trimmed the REF/ALT alleles and increased POS accordingly, but then set INFO/END equal to the original input POS.

svimmer output, graphtyper input:

#CHROM  POS        ID  REF  ALT                    QUAL  FILTER  INFO
chr4    140362273  .   C    CTG]chr11:131789668]   0     .       SVTYPE=BND;...SVINSLEN=2;...
chr9    35999303   .   T    TATAA]chr6:148738900]  0     .       SVTYPE=BND;...SVINSLEN=4;...
chr10   9321046    .   T    TG]chr8:135381059]     0     .       SVTYPE=BND;...SVINSLEN=1;...

graphtyper output with END < POS:

#CHROM  POS        ID                 REF  ALT                 QUAL  FILTER         INFO
chr4    140362275  chr4:140362275:OG  TA   G]chr11:131789668]  0     LowQUAL        ...END=140362273;...
chr9    35999307   chr9:35999307:OG   A    A]chr6:148738900]   0     LowQD;LowQUAL  ...END=35999303;...
chr10   9321047    chr10:9321047:OG   T    G]chr8:135381059]   0     LowQUAL        ...END=9321046;...
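
To find every affected record before running bcftools concat or bcftools index, a small standalone check along these lines may help. This is only a sketch and not part of graphtyper or bcftools: the function names find_end_before_pos and scan_vcf are made up for illustration, and it assumes a tab-delimited VCF with INFO in the eighth column and END stored as a plain integer key in INFO.

```python
import gzip
import re

# Match END=<integer> only as a whole INFO key, so CIEND=... is not picked up.
END_RE = re.compile(r"(?:^|;)END=(\d+)(?:;|$)")


def find_end_before_pos(lines):
    """Yield (chrom, pos, end) for records whose INFO/END is smaller than POS."""
    for line in lines:
        # Skip blank lines, meta lines (##...) and the #CHROM header line.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        chrom, pos, info = fields[0], int(fields[1]), fields[7]
        match = END_RE.search(info)
        if match and int(match.group(1)) < pos:
            yield chrom, pos, int(match.group(1))


def scan_vcf(path):
    """Return all END < POS records from a plain or gzip/bgzip-compressed VCF."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return list(find_end_before_pos(handle))
```

Calling scan_vcf on the graphtyper output should return one (chrom, pos, end) tuple per record like the three above, for example ("chr10", 9321047, 9321046), and an empty list if no record has END before POS.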

seboyden avatar Aug 19 '20 20:08 seboyden

I am able to reproduce the problem, thanks for reporting. Hopefully I can push a fix for it soon.

Best, Hannes

hannespetur avatar Sep 01 '20 13:09 hannespetur