SCIPhI
Print PL field according to VCF specification
Hi guys,
could you please consider aligning your VCF output with the VCF spec, which requires that the PL field contain likelihoods for all possible genotypes given the set of alleles defined in the REF and ALT fields (p. 5). In other words, for a biallelic site, the PL field must contain three values that provide the likelihoods of REF/REF, REF/ALT, and ALT/ALT.
I guess this might be related to #10, but here it's about fixing a format violation, not about providing additional information (i.e., if you only have the likelihood for one of the genotypes and the others are assumed to be zero, you can print something like 0/1:2:9:.,60,.)
Thanks in advance!
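To illustrate the expected layout, here is a minimal Python sketch of the genotype ordering the VCF spec uses for diploid PL values, where genotype j/k (j <= k) sits at index k*(k+1)/2 + j. The helper names `diploid_genotype_order` and `format_pl` are hypothetical and not part of SCIPhI; printing unknown values as `.` follows the example above.

```python
def diploid_genotype_order(alleles):
    """Return diploid genotypes in the order VCF expects for PL/GL.

    Genotype j/k (j <= k) is placed at index k*(k+1)/2 + j.
    """
    order = []
    for k in range(len(alleles)):
        for j in range(k + 1):
            order.append((alleles[j], alleles[k]))
    return order


def format_pl(values, n_alleles):
    """Format a PL field; None is printed as the VCF missing value '.'.

    Hypothetical helper: raises if the number of values does not match
    the number of diploid genotypes for n_alleles alleles.
    """
    expected = n_alleles * (n_alleles + 1) // 2
    if len(values) != expected:
        raise ValueError(f"expected {expected} PL values, got {len(values)}")
    return ",".join("." if v is None else str(v) for v in values)


biallelic = diploid_genotype_order(["REF", "ALT"])
pl_field = format_pl([None, 60, None], n_alleles=2)
```

For a biallelic site this yields the three genotypes REF/REF, REF/ALT, ALT/ALT in that order, and a field with only the heterozygous value known is written as `.,60,.`.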
I don't think the PL field should be used at all. See https://github.com/cbg-ethz/SCIPhI/issues/22#issuecomment-594402727
Oh I see, so then it should probably be changed to the PP field, which is the phred-scaled posterior genotype probability, according to the VCF v4.3 spec.
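As a rough sketch of what writing PP could look like, the snippet below phred-scales posterior genotype probabilities as -10 * log10(p) and rounds to the nearest integer. The function names, the cap of 255 for a probability of zero, and the sample posteriors are assumptions for illustration only, not values or code taken from SCIPhI.

```python
import math


def phred_scale(prob, cap=255):
    """Phred-scale a probability: -10 * log10(prob), rounded to an integer.

    The cap for prob == 0 is an assumption; the scaled value is otherwise
    unbounded.
    """
    if not 0.0 <= prob <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    if prob == 0.0:
        return cap
    return min(cap, round(-10 * math.log10(prob)))


def format_pp(posteriors):
    """Format posteriors for REF/REF, REF/ALT, ALT/ALT as a PP field."""
    return ",".join(str(phred_scale(p)) for p in posteriors)


# Hypothetical posteriors for a confidently heterozygous call.
pp_field = format_pp([0.001, 0.998, 0.001])
```

Under this scaling the most probable genotype gets the smallest value, so the example above gives `30,0,30`.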
Cool! Yes, PP looks like the right tag.