tsinfer
Allow quality metrics in the VCF file to affect the per-sample mismatch function
@benjeffery had the great idea that we could use quality scores in the VCF (or FASTQ, or BAM) file to change the mismatch probabilities during match_samples. We could even use them when generating ancestors too.
For match_samples, this will require some re-plumbing to allow per-sample mismatch functions (RateMaps, or whatever we are calling them).
This maps pretty naturally onto the per-site array paradigm, so we can probably build the appropriate array from the input data (genotypes and base qualities) and use that as input.
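As a rough illustration of what building that array might look like, here is a minimal sketch that turns Phred-scaled qualities into a (sites x samples) array of mismatch probabilities using the standard relation P = 10^(-Q/10). The function name `mismatch_from_phred`, the `-1` missing-value sentinel, and the `fallback` and `ceiling` values are all assumptions made for this example; none of them are existing tsinfer or sgkit API, and how such an array would actually be passed to match_samples is still the open plumbing question.

```python
import numpy as np


def mismatch_from_phred(qualities, missing=-1, fallback=0.01, ceiling=0.5):
    """Hypothetical helper: Phred-scaled qualities -> mismatch probabilities.

    `qualities` has shape (num_sites, num_samples). Entries equal to
    `missing` (an assumed sentinel) get the `fallback` probability.
    """
    q = np.asarray(qualities, dtype=np.float64)
    is_missing = q == missing
    # Standard Phred relation: error probability = 10 ** (-Q / 10).
    # Missing entries are temporarily treated as Q=0 and overwritten below.
    prob = 10.0 ** (-np.where(is_missing, 0.0, q) / 10.0)
    prob[is_missing] = fallback
    # Cap very low qualities (assumed ceiling, chosen only for illustration).
    return np.minimum(prob, ceiling)


# Toy input: 3 sites x 2 samples of per-genotype quality scores.
gq = np.array([[10, 20], [30, -1], [0, 40]])
mismatch = mismatch_from_phred(gq)
```

Each column of `mismatch` would then be the per-site mismatch vector for one sample, which is the shape of thing a per-sample mismatch function would need.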
We want to use sgkit to give us access to the quality scores, so I'm pushing this to 0.3.