KMC
Fast and frugal disk based k-mer counter
When filtering reads, with -ci0 for the reads (i.e. keep all reads, even ones with no kmers from the db), no hard-masking at all seems to happen. @marekkokot
Hi, I have a set of individuals I want to compute Jaccard distances for. I've produced a KMC database for each individual. One way I thought of to compute these...
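For reference, the Jaccard distance between two k-mer sets can be sketched as below. This is only an illustration of the metric, assuming each individual's k-mers have already been loaded into an in-memory Python set (for example from a text dump); the function name `jaccard_distance` is hypothetical and nothing here uses the KMC API.

```python
def jaccard_distance(kmers_a, kmers_b):
    """Return 1 - |A ∩ B| / |A ∪ B| for two sets of k-mers."""
    union_size = len(kmers_a | kmers_b)
    if union_size == 0:
        # Assumption: two empty sets are treated as identical.
        return 0.0
    intersection_size = len(kmers_a & kmers_b)
    return 1.0 - intersection_size / union_size


# Toy example with 3-mers; real inputs would come from each individual's database.
individual_1 = {"ACG", "CGT", "GTA"}
individual_2 = {"CGT", "GTA", "TAC"}
print(jaccard_distance(individual_1, individual_2))
```

For databases too large to hold in memory, the same two quantities (intersection size and union size) are all that is needed, so any tool that can count them would do.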
Hi, is there a fast way to make kmc dump only the k-mers with certain prefixes, without looking through the whole kmc database? Thanks, chunlin
Right now, a base is hard masked to N if at least 1 of the kmers it's in is "invalid" (has count less than -ci). Can you make that a...
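The rule described above (a base becomes N if at least one k-mer covering it has a count below the minimum) can be modelled with a small sketch. This is a toy model, not KMC's implementation: the function name `hard_mask`, the plain dict of counts, and the lack of canonical k-mer handling are all assumptions made for illustration.

```python
def hard_mask(read, k, counts, min_count):
    """Mask every base covered by at least one k-mer whose count is below min_count."""
    masked = list(read)
    for start in range(len(read) - k + 1):
        kmer = read[start:start + k]
        # Assumption: k-mers missing from the dict have count 0.
        if counts.get(kmer, 0) < min_count:
            for pos in range(start, start + k):
                masked[pos] = "N"
    return "".join(masked)


# Only "GTA" is below the threshold, so the three bases it covers are masked.
counts = {"ACG": 2, "CGT": 2, "GTA": 0, "TAC": 2}
print(hard_mask("ACGTAC", 3, counts, 1))  # ACNNNC
```

Under this rule a single low-count k-mer masks k bases, including bases that are also covered by valid k-mers.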
The help message for hard masking says it masks "invalid" kmers. From other parts of kmc, I thought "invalid" means "count under -ci _or_ over -cx". But looking at the...
When constructing a kmer db from reads, sometimes I'm only interested in kmers that appear in another kmer db (e.g. kmers that have appeared in known genomes of taxon of...
Hi, I have noticed that the kmc tool (latest release) generates wrong k-mers for long (>= 32) reads and short reads (~5) as well. For example: AACCACAGATATCTTTAACCAGGATACCATAGAC the following should generate...
@marekkokot Is there a test suite you use to verify correctness of kmc and kmc_tools? If there is, could it be checked into github?
The cutoff_max for a simple binary operation defaults to the higher cutoff_max of the operands. But for the union operation, the default way to combine the counters is SUM. This...
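The arithmetic behind the two defaults mentioned above can be sketched as follows. This only illustrates the numbers involved: the function name `sum_exceeds_default_cutoff` is hypothetical, and what kmc_tools actually does with a k-mer whose summed count is above the default cutoff_max is not stated here.

```python
def sum_exceeds_default_cutoff(count_a, count_b, cutoff_max_a, cutoff_max_b):
    """Check whether a SUM-combined count is above the default output cutoff_max."""
    # Default described above: the higher cutoff_max of the two operands.
    default_cutoff_max = max(cutoff_max_a, cutoff_max_b)
    # Default counter combination for union described above: SUM.
    combined_count = count_a + count_b
    return combined_count > default_cutoff_max


# Each count is within its own database's cutoff_max (90 <= 100, 180 <= 200),
# yet the sum 270 is above the default output cutoff of 200.
print(sum_exceeds_default_cutoff(90, 180, 100, 200))  # True
```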
check #90 for details.