bcftools
annotate ID for multi-allelic sites
I found that the ID of a multi-allelic site in the source file is transferred only to the first allele in the target file.
My source annotation file:
chr21 5030278 rs1258851236 C G,T . . RS=1258851236
And my target file:
chr21 5030278 . C G . . .
chr21 5030278 . C T . . .
The command I used is as follows:
bcftools annotate -c +ID -a [source file] [target file]
And I got:
chr21 5030278 rs1258851236 C G . . .
chr21 5030278 . C T . . .
Shouldn't the ID (rs1258851236) be annotated to both lines in the target file?
The version I used: bcftools_annotateVersion=1.19+htslib-1.19
This is a limitation of the program: when a VCF is used as the source of annotations, a line can be matched only once. You'd have to split the multiallelic records into biallelic ones (bcftools norm -m -) or create a tab-delimited annotation file instead. I believe that would work.
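To make the two suggested workarounds concrete, here is a rough sketch, not a tested recipe. The function names (annotate_via_split, make_id_table) and all file names are made up for illustration. The first function assumes the source and target are bgzip-compressed VCFs. The second is a plain awk stand-in for building the tab-delimited file: it only splits ALT on commas and does no allele normalization.

```shell
#!/bin/sh

# Workaround 1 (sketch): split multiallelic source records into biallelic
# ones, index the result, then annotate the target with the split file.
# Arguments are hypothetical paths: source.vcf.gz target.vcf.gz out.vcf.gz
annotate_via_split() {
  src=$1
  tgt=$2
  out=$3
  split="${src%.vcf.gz}.split.vcf.gz"
  bcftools norm -m - -Oz -o "$split" "$src" &&
    bcftools index -t "$split" &&
    bcftools annotate -c +ID -a "$split" -Oz -o "$out" "$tgt"
}

# Workaround 2 (sketch): turn uncompressed VCF text on stdin into a
# tab-delimited table with one row per ALT allele:
#   CHROM  POS  REF  ALT  ID
# Header lines and records without an ID (".") are skipped.
make_id_table() {
  awk -F'\t' '
    BEGIN { OFS = "\t" }
    /^#/ { next }
    $3 != "." {
      n = split($5, alt, ",")
      for (i = 1; i <= n; i++) print $1, $2, $4, alt[i], $3
    }'
}
```

For the source record above, make_id_table would emit one row for C>G and one for C>T, both carrying rs1258851236. As far as I understand the manual, such a table would then need to be bgzip-compressed, indexed with tabix -s1 -b2 -e2, and passed to bcftools annotate with -c CHROM,POS,REF,ALT,ID; please check that against your bcftools version before relying on it.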