Two indel records at the same position?
The result was produced by
bcftools call
bcftools view -v indel
Is there a problem with the records below? Why are there two different records at the same position?
chr1 660 . ATTTATGG A 284.59 PASS INDEL;IDV=11;IMF=0.244444;VDB=0.0132287;SGB=-0.651104;RPBZ=1.46665;MQBZ=-0.106989;MQSBZ=-2.0846;BQBZ=-1.80389;SCBZ=-0.813489;MQ0F=0;MQ=58;DP=22559;DP4=7922,6233,2227,4778;AN=1000;AC=431
chr1 660 . A ATTTAGGG 284.59 PASS MQ0F=0;MQ=57;INDEL;IDV=2;IMF=0.0526316;SGB=-0.379885;RPBZ=-0.130852;MQBZ=-1.59636;MQSBZ=-1.54725;BQBZ=-1.80916;SCBZ=-0.419435;VDB=0.0585364;DP=19822;DP4=8737,9844,40,55;AN=900;AC=1
chr1 1579 . CA C 284.59 PASS INDEL;IDV=24;IMF=0.888889;VDB=0.271009;SGB=-0.692831;RPBZ=-2.77916;MQBZ=0;MQSBZ=0;BQBZ=-0.0744034;SCBZ=-4.0762;MQ0F=0;MQ=60;DP=10882;DP4=4308,3834,1294,1109;AN=1000;AC=247
chr1 1579 . CAA CAAA,CAAAAA,CAAAA,C 284.59 PASS MQ0F=0;MQ=60;INDEL;IDV=7;IMF=0.35;VDB=0.706942;SGB=-0.616816;RPBZ=-0.872656;MQBZ=0;MQSBZ=0;BQBZ=-0.308444;SCBZ=0;DP=10374;DP4=4582,4081,740,643;AN=938;AC=65,1,2,1
The VCF spec allows you to represent variants in multiple ways.
Here are two goals you could try to achieve in a VCF file:
- Represent the variants in as few lines as possible (what you seem to want)
- Normalize variants, i.e. "represent ... in as few nucleotides as possible" (what most people seem to want)
These two goals can conflict, as your example shows. You have to pick one.
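As a rough illustration of the second goal, here is a minimal Python sketch of the trimming part of normalization. The function name `trim_allele_pair` is made up for this example and is not taken from bcftools; real normalization with `bcftools norm` also left-aligns indels against a reference FASTA, which this sketch skips.

```python
def trim_allele_pair(pos, ref, alt):
    """Trim bases shared by REF and ALT, keeping at least one base in each.

    Hypothetical helper for illustration only: it shows the
    "as few nucleotides as possible" idea, without left-alignment.
    """
    # Drop matching bases from the right end first.
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Then drop matching bases from the left end, shifting POS accordingly.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


# The insertion written against the long REF trims back to the short form.
print(trim_allele_pair(660, "ATTTATGG", "ATTTAGGGTTTATGG"))
# The deletion is already minimal and is left unchanged.
print(trim_allele_pair(660, "ATTTATGG", "A"))
```

After trimming, the two alleles at position 660 no longer share a REF string, which is why they end up on separate lines.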
If you really wanted it written as a single record, like:
ATTTATGG A,ATTTAGGGTTTATGG
you could run it through bcftools norm --multiallelics=+
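To see where that merged ALT string comes from, here is a small Python sketch of the allele arithmetic. The function `merge_same_pos` is a hypothetical name used only for this example, not bcftools code, and it assumes all input records share the same CHROM and POS, so every REF is a prefix of the longest REF. The ALT order bcftools actually emits may differ.

```python
def merge_same_pos(records):
    """Merge (ref, [alts]) records at one position into a single record.

    Hypothetical helper for illustration: each ALT is padded with the
    reference bases its own REF lacks relative to the longest REF.
    """
    longest_ref = max((ref for ref, _ in records), key=len)
    merged_alts = []
    for ref, alts in records:
        if not longest_ref.startswith(ref):
            raise ValueError(f"REF {ref!r} is not a prefix of {longest_ref!r}")
        suffix = longest_ref[len(ref):]  # bases missing from the shorter REF
        for alt in alts:
            padded = alt + suffix
            if padded not in merged_alts:
                merged_alts.append(padded)
    return longest_ref, merged_alts


# The two chr1:660 records from the question.
print(merge_same_pos([("ATTTATGG", ["A"]), ("A", ["ATTTAGGG"])]))
# The two chr1:1579 records from the question.
print(merge_same_pos([("CA", ["C"]), ("CAA", ["CAAA", "CAAAAA", "CAAAA", "C"])]))
```

The insertion A to ATTTAGGG has to carry the extra TTTATGG bases once it shares the longer REF, which is the cost of fitting both alleles on one line.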
@Jerry-Wang-Dog just pinging in case you missed this.
This seems like a tradeoff, not an error. If you agree, are you OK with closing the issue?