cyvcf2

Multiallelic VCF with annotation in INFO field doesn't query all available information

Open KalinNonchev opened this issue 1 year ago • 3 comments

Hello,

I noticed that for an annotated multiallelic VCF, where the annotation for each alternative allele is stored in the INFO field, variant.INFO.get("annotation") returns only the first occurrence. Example:

1 123 A T,TA ... annotation:0.5.. annotation:0.1

variant.INFO.get("annotation") returns only 0.5 instead of [0.5, 0.1].

Best

KalinNonchev avatar Mar 12 '24 09:03 KalinNonchev

Hi, can you show the actual line from the VCF? The INFO field is a list of key=value pairs, so you can't have the same key twice. FORMAT fields are separated by ":", as you have shown above.

brentp avatar Mar 12 '24 09:03 brentp
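
A quick way to check whether an INFO string repeats a key is sketched below. It works on the raw INFO text and does not go through cyvcf2; `duplicated_info_keys` is a hypothetical helper name, and the two sample strings are made up for illustration. In a well-formed record, per-ALT values would normally sit under a single key as a comma-separated list (declared with Number=A in the header) rather than under a repeated key.

```python
from collections import Counter


def duplicated_info_keys(info):
    """Return the sorted INFO keys that occur more than once in a raw INFO string.

    Hypothetical helper for illustration; not part of cyvcf2.
    """
    # Each ';'-separated token is either key=value or a bare flag key.
    keys = [token.partition("=")[0] for token in info.split(";") if token]
    return sorted(key for key, count in Counter(keys).items() if count > 1)


# Made-up sample strings: one key with two comma-separated values,
# versus the same key written twice.
well_formed = duplicated_info_keys("DP=10;AF=0.5,0.1")
repeated = duplicated_info_keys("AF=0.5;AF=0.1")
print(well_formed, repeated)
```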

That would be an example with multiple gnomad312_AF annotations... It looks like the annotations are separated by "ALLELE_END;"

chr1\t10340\t.\tT\tG,C\t;ANNOVAR_DATE=2020-06-08;gnomad312_AF=0.0014;ALLELE_END;ANNOVAR_DATE=2020-06-08;gnomad312_AF=0.0509;ALLELE_END\t

KalinNonchev avatar Mar 12 '24 09:03 KalinNonchev

That is not VCF format. It might be a text format output by annovar?

brentp avatar Mar 12 '24 10:03 brentp
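
For anyone arriving with the same ANNOVAR-style output, here is a minimal sketch of recovering the per-allele values from the raw INFO string. It assumes that each ALT allele's annotations form one block terminated by `ALLELE_END`, as in the line posted above, and it does not use cyvcf2. `split_annovar_info` and `per_allele_values` are hypothetical helper names, not part of any library.

```python
def split_annovar_info(info):
    """Split an ANNOVAR-style INFO string into one dict per ALLELE_END block.

    Hypothetical helper; assumes "ALLELE_END" closes each ALT allele's block.
    """
    blocks = []
    current = {}
    for token in info.split(";"):
        if not token:
            continue  # skip empty tokens from a leading or doubled ';'
        if token == "ALLELE_END":
            blocks.append(current)
            current = {}
            continue
        key, sep, value = token.partition("=")
        current[key] = value if sep else True  # bare keys are treated as flags
    if current:
        blocks.append(current)  # keys not closed by a final ALLELE_END
    return blocks


def per_allele_values(info, key):
    """Collect the value of `key` from every block (None where it is absent)."""
    return [block.get(key) for block in split_annovar_info(info)]


# INFO column copied from the line posted in this thread.
info = (";ANNOVAR_DATE=2020-06-08;gnomad312_AF=0.0014;ALLELE_END;"
        "ANNOVAR_DATE=2020-06-08;gnomad312_AF=0.0509;ALLELE_END")
afs = per_allele_values(info, "gnomad312_AF")

# Rewritten as a single key with one comma-separated value per ALT allele,
# using "." for a missing value.
merged = "gnomad312_AF=" + ",".join("." if v is None else str(v) for v in afs)
print(afs, merged)
```

The values come back as strings in ALT order (G, then C for the record above). If the file still carries its original INFO keys ahead of the first ANNOVAR block, they would end up in the first dict, so the helper may need adjusting for real data.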