bcftools
Some problems when using the bcftools norm command
I found that duplicated variants appear when normalizing some large indels with bcftools norm. It fails to combine the two variants shown below:
variant1: chr: bp: TTCACTCATTCATCACTAACCCATTGACTCACTCACTCAATCACTCATTCACTCACTCAATCACTCATTCACTCATTAACCCATTGACTCACTCCCTCAATCACTCATTCACTCACTAACCCATTGACTCACTCACTCATTCACTCACTAACCCATTGACTCACTCAATCACTCATTCATTCACTAACCCATTGACTCACTCACTCGATCACTCATTCACTCACTCATTCACTCATTCACTAACCCATTCAGTCACTCAA:T AC=5;DP=5 1/1:....
variant2: chr: bp+50: CTCAATCACTCATTCACTCATTAACCCATTGACTCACTCCCTCAATCACTCATTCACTCACTAACCCATTGACTCACTCACTCATTCACTCACTAACCCATTGACTCACTCAATCACTCATTCATTCACTAACCCATTGACTCACTCACTCGATCACTCATTCACTCACTCATTCACTCATTCACTAACCCATTCAGTCACTCAATCACTCATTCATCACTAACCCATTGACTCACTCACTCAATCACTCATTCACTCAC:C AC=10;DP=1 0/1:....
After normalization with bcftools norm -f GRCh37.fa, the ID of variant2 (chr:bp:ref:alt) changes to the ID of variant1, while the INFO and FORMAT fields stay separate. In other words, two records now share the same ID (chr:bp:ref:alt) but carry different INFO and FORMAT information, as below:
variant1: chr: bp: TTCACTCATTCATCACTAACCCATTGACTCACTCACTCAATCACTCATTCACTCACTCAATCACTCATTCACTCATTAACCCATTGACTCACTCCCTCAATCACTCATTCACTCACTAACCCATTGACTCACTCACTCATTCACTCACTAACCCATTGACTCACTCAATCACTCATTCATTCACTAACCCATTGACTCACTCACTCGATCACTCATTCACTCACTCATTCACTCATTCACTAACCCATTCAGTCACTCAA:T AC=5;DP=5 1/1:....
variant2: chr: bp: TTCACTCATTCATCACTAACCCATTGACTCACTCACTCAATCACTCATTCACTCACTCAATCACTCATTCACTCATTAACCCATTGACTCACTCCCTCAATCACTCATTCACTCACTAACCCATTGACTCACTCACTCATTCACTCACTAACCCATTGACTCACTCAATCACTCATTCATTCACTAACCCATTGACTCACTCACTCGATCACTCATTCACTCACTCATTCACTCATTCACTAACCCATTCAGTCACTCAA:T AC=10;DP=1 0/1:....
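For context, the behaviour described above is what left-alignment is expected to do when the same deletion inside a repeat is written at two different positions: both records shift to the leftmost position and end up with the same chr:bp:ref:alt. The sketch below illustrates this on a toy reference. It is a simplified version of the usual trim-and-extend normalization procedure, not bcftools' actual implementation, and the `TOY_REF` sequence and the `normalize` helper are made up for illustration.

```python
def normalize(ref_seq, pos, ref, alt):
    """Left-align and trim one biallelic variant (pos is 1-based).

    Simplified sketch of the common normalization procedure; this is
    NOT the bcftools source code.
    """
    changed = True
    while changed:
        changed = False
        # Drop the last base while both alleles end in the same base.
        if ref and alt and ref[-1] == alt[-1]:
            ref, alt = ref[:-1], alt[:-1]
            changed = True
        # If an allele became empty, extend both one base to the left.
        if not ref or not alt:
            if pos == 1:
                raise ValueError("cannot extend left of the first base")
            pos -= 1
            base = ref_seq[pos - 1]
            ref, alt = base + ref, base + alt
            changed = True
    # Trim shared leading bases, keeping at least one base per allele.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


# Hypothetical toy reference with a CA repeat (positions 3 to 10).
TOY_REF = "GGCACACACATT"

# The same 2-bp deletion written at two different positions.
left = normalize(TOY_REF, 2, "GCA", "G")
right = normalize(TOY_REF, 8, "ACA", "A")
print(left, right)
```

Both calls return the same position and alleles, so two input records that describe one deletion become two records with an identical key. Whether their INFO and FORMAT fields should then be merged is a separate question from the alignment step itself.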
Please provide a small test case (VCF) to reproduce the problem and make sure you are using the latest version of bcftools.
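One possible way to build such a test case is to scan the normalized output for records that share the same chr:bp:ref:alt and keep only those. The helper below is a minimal sketch that works on plain-text VCF lines; the function name `find_duplicate_records` and the example records are hypothetical, and compressed or BCF input would need to be converted to text first.

```python
from collections import defaultdict


def find_duplicate_records(vcf_lines):
    """Group VCF data lines by (CHROM, POS, REF, ALT) and return the
    groups that contain more than one record."""
    groups = defaultdict(list)
    for line in vcf_lines:
        line = line.rstrip("\n")
        # Skip header lines and blank lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        chrom, pos, _id, ref, alt = fields[:5]
        groups[(chrom, pos, ref, alt)].append(line)
    return {key: recs for key, recs in groups.items() if len(recs) > 1}


# Made-up records standing in for normalized output.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t100\t.\tGCA\tG\t.\t.\tAC=5;DP=5",
    "1\t100\t.\tGCA\tG\t.\t.\tAC=10;DP=1",
    "1\t200\t.\tA\tT\t.\t.\tAC=1;DP=3",
]
dups = find_duplicate_records(example)
for key, recs in dups.items():
    print(key, len(recs))
```

The positions reported here can then be looked up in the original, pre-normalization file, and those input records together with the header should make a compact VCF to attach to the report.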