KMC
filter reads by kmers clustered together
Add an option to keep only reads containing a stretch of M adjacent kmers, at least N of which are in the database. This permits more flexible filtering than just using longer kmers.
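To make the proposed criterion concrete, here is a rough Python sketch of the M/N check using a sliding window over per-kmer hit flags. It is only an illustration: `has_kmer_cluster`, its parameters, and the use of a plain Python set as the "database" are assumptions, not KMC's API, and canonical kmers (reverse complements) are ignored.

```python
def has_kmer_cluster(read, kmer_db, k, m, n):
    """Return True if some run of m adjacent kmers has at least n hits in kmer_db.

    kmer_db is assumed to be a plain set of kmer strings (hypothetical stand-in
    for a real kmer database).
    """
    # One hit flag per kmer start position in the read.
    hits = [read[i:i + k] in kmer_db for i in range(len(read) - k + 1)]
    if len(hits) < m:
        return False  # read is too short to contain m adjacent kmers

    # Count hits in the first window, then slide it one kmer at a time.
    count = sum(hits[:m])
    if count >= n:
        return True
    for i in range(m, len(hits)):
        count += hits[i] - hits[i - m]  # add the entering kmer, drop the leaving one
        if count >= n:
            return True
    return False
```

With `m == n` this reduces to requiring M consecutive matching kmers; lowering `n` tolerates a few mismatching kmers inside the stretch, which is where the extra flexibility comes from.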
The filtering could also be optimized to quickly rule out reads that don't match (which is usually most reads): e.g. check every other kmer first, and if none are in the database, that may be enough to rule out the read. @marekkokot
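A possible shape for that pre-check, sketched in Python under the same simplifying assumptions (a plain set as the database, no canonical kmers; `can_rule_out` is a hypothetical name). If no kmer at an even position is a hit, any window of M adjacent kmers can contain at most ceil(M/2) hits, all at odd positions. So the shortcut is only conclusive when N > ceil(M/2); otherwise the full check is still needed.

```python
def can_rule_out(read, kmer_db, k, m, n):
    """Cheap pre-check: return True only if the read provably cannot pass.

    A False result means "inconclusive", not "the read passes".
    """
    num_kmers = len(read) - k + 1
    if num_kmers < m:
        return True  # no window of m adjacent kmers exists

    # With zero hits at even positions, a window of m kmers holds at most
    # ceil(m / 2) hits. If n does not exceed that, sampling proves nothing.
    if n <= (m + 1) // 2:
        return False

    # Look up every other kmer only; any hit makes the pre-check inconclusive.
    return not any(read[i:i + k] in kmer_db for i in range(0, num_kmers, 2))
```

This roughly halves the lookups for reads that are rejected, at the cost of some repeated lookups for the reads that go on to the full check.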