sourmash
describe sourmash estimates and their correlations with read mapping rates
documentation updates needed:
we now have a lot of good evidence that f_match correlates with the fraction of the genome that will be covered by mapped reads, per the sourmash gather/minimum metagenome covers preprint. It's a little harder to talk about what that means on the metagenome side, though.
and, per https://github.com/sourmash-bio/sourmash/issues/1833, we should mention that the sourmash gather abundances (median_abund, average_abund) are correlated with the fraction of reads that will map to that genome; and that the sourmash gather summary information for weighted fraction correlates with the % of total reads that will map (see https://github.com/sourmash-bio/sourmash/issues/1818).
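
For the docs, a small worked check might make these relationships concrete. The following is only a sketch under stated assumptions: the CSV text is a toy stand-in for `sourmash gather -o` output restricted to a few columns, all numbers are made up for illustration, and the `MAPPING` dictionary with its `covered_frac` and `read_frac` keys is a hypothetical placeholder for whatever a read mapper reports. It also assumes that the weighted fraction in the gather summary corresponds to the sum of the `f_unique_weighted` column.

```python
import csv
import io
import math

# Toy stand-in for a gather CSV; values are invented for illustration only.
GATHER_CSV = """name,f_match,f_unique_weighted,median_abund,average_abund
genomeA,0.90,0.50,10,11.0
genomeB,0.60,0.20,4,4.5
genomeC,0.30,0.05,1,1.2
"""

# Hypothetical per-genome mapping results (not produced by sourmash):
#   covered_frac = fraction of the genome covered by mapped reads
#   read_frac    = fraction of all reads that map to the genome
MAPPING = {
    "genomeA": {"covered_frac": 0.92, "read_frac": 0.52},
    "genomeB": {"covered_frac": 0.58, "read_frac": 0.21},
    "genomeC": {"covered_frac": 0.33, "read_frac": 0.06},
}


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


rows = list(csv.DictReader(io.StringIO(GATHER_CSV)))

# f_match vs. fraction of each genome covered by mapped reads
f_match = [float(r["f_match"]) for r in rows]
covered = [MAPPING[r["name"]]["covered_frac"] for r in rows]
r_cov = pearson(f_match, covered)

# median_abund vs. fraction of reads mapping to each genome
median_abund = [float(r["median_abund"]) for r in rows]
read_frac = [MAPPING[r["name"]]["read_frac"] for r in rows]
r_abund = pearson(median_abund, read_frac)

# Summed weighted fraction vs. total fraction of reads that map
total_weighted = sum(float(r["f_unique_weighted"]) for r in rows)
total_mapped = sum(read_frac)

print(f"f_match vs covered fraction: r = {r_cov:.3f}")
print(f"median_abund vs read fraction: r = {r_abund:.3f}")
print(f"weighted fraction = {total_weighted:.2f}, reads mapped = {total_mapped:.2f}")
```

With real data, the CSV text would be replaced by the gather output file and `MAPPING` by values computed from the actual read mapping.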
ref https://github.com/dib-lab/genome-grist/issues/167
ref https://github.com/sourmash-bio/sourmash/issues/2170