Mash
Ignore over-occurring kmers?
Would an option to ignore over-occurring kmers make mash more robust against large repeat families and multi-copy plasmids?
Mash estimates the coverage in -r mode, and it uses -m as a minimum kmer frequency, but maybe 2*est_cov would be a good maximum frequency?
E.g. a new option -M 2
would ignore kmers with freq > 2*est_cov.
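
To make the idea concrete, here is a rough Python sketch of the filter. It is not Mash's code: `count_kmers`, `filter_kmers`, and `max_factor` (standing in for the proposed -M) are names made up for illustration, `est_cov` is simply passed in rather than estimated the way Mash does it, and canonical (reverse-complement) kmers are ignored for brevity.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring across all reads (no canonicalisation)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def filter_kmers(counts, est_cov, min_freq=1, max_factor=2.0):
    """Keep kmers whose count lies in [min_freq, max_factor * est_cov].

    min_freq plays the role of the existing -m; max_factor is the
    hypothetical -M from this proposal. est_cov is assumed to be known.
    """
    max_freq = max_factor * est_cov
    return {kmer for kmer, n in counts.items() if min_freq <= n <= max_freq}


# Toy data: a unique region at ~3x coverage and a repeat seen 10 times.
reads = ["ACGT"] * 3 + ["TTTT"] * 10
counts = count_kmers(reads, k=4)
kept = filter_kmers(counts, est_cov=3, min_freq=1, max_factor=2.0)
```

With these toy reads the repeat kmer (count 10) exceeds 2*est_cov = 6 and is dropped, while the single-copy kmer (count 3) is kept for sketching.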
I've just realised Finch does something like this already: https://github.com/onecodex/finch-rs