hts-specs
Canonical way to store BGEN phased probability data in a VCF
Hello,
I am trying to store phased probability data ingested from a BGEN file in a VCF. The BGEN format stores these probabilities per haplotype per allele. From what I can tell, the obvious VCF candidate fields (GL, PL, etc.) are instead in "canonical order", which the BGEN format calls "colex order" and uses for unphased probability data. As these deal with unordered combinations of alleles, they cannot record the phased probability data without loss of the phase information. Could the PS/PQ fields somehow be used to retain this information? I am mostly interested in the diploid case.
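To make the loss concrete, here is a small Python sketch for the diploid case. It is only an illustration: the function names are hypothetical, and it assumes the two haplotypes are independent when collapsing per-haplotype allele probabilities into genotype probabilities. The genotype index `k*(k+1)/2 + j` for `j/k` with `j <= k` is the ordering the VCF specification uses for GL/PL.

```python
def vcf_genotype_index(j, k):
    """Index of the unordered diploid genotype j/k (j <= k) in VCF GL/PL order."""
    return k * (k + 1) // 2 + j


def phased_to_unphased(hap1, hap2):
    """Collapse per-haplotype allele probabilities into genotype probabilities.

    hap1[a] and hap2[a] are the probabilities that each haplotype carries
    allele a. ASSUMPTION (for illustration only): the two haplotypes are
    treated as independent.
    """
    n_alleles = len(hap1)
    genotype_probs = [0.0] * (n_alleles * (n_alleles + 1) // 2)
    for a in range(n_alleles):
        for b in range(n_alleles):
            j, k = min(a, b), max(a, b)
            genotype_probs[vcf_genotype_index(j, k)] += hap1[a] * hap2[b]
    return genotype_probs


# Two different phased records: the same numbers, assigned to opposite haplotypes.
hap1 = [0.9, 0.1]
hap2 = [0.2, 0.8]

forward = phased_to_unphased(hap1, hap2)
swapped = phased_to_unphased(hap2, hap1)
```

Both `forward` and `swapped` come out as the same three genotype probabilities (for 0/0, 0/1 and 1/1), so a field in this order cannot tell which haplotype was the one most likely to carry the reference allele.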