kevlar Compose control counttables for `kevlar novel` step

Open standage opened this issue 6 years ago • 2 comments

After counting k-mers for each control sample, we should investigate composing the counttables into a single nodetable before running `kevlar novel`. This should have a couple of synergistic benefits.

- We query a single table instead of 2 (or more), reducing the time spent on k-mer abundance queries.
- A nodetable is 1/8 the size of a counttable with the same number of buckets.

The cost is, of course, another pass over the "data". But it should be possible to build a nodetable directly from the underlying counttables themselves without iterating over the reads again, so the "data" for this extra pass should be quite small and manageable.
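
A minimal sketch of the idea, not kevlar's or khmer's actual implementation. It assumes every control counttable was built with identical table sizes and the same hash function, so a bucket index means the same thing in every table. `ToyCounttable`, `ToyNodetable`, `bucket_indices`, and `compose` are hypothetical names invented for illustration, and the hash is a stand-in. Under those assumptions, a nodetable bit can be set wherever any control has a nonzero counter, without touching the reads.

```python
import hashlib


def bucket_indices(kmer, table_sizes):
    """Hypothetical hash: one 64-bit digest reduced modulo each table size."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    value = int.from_bytes(digest, "big")
    return [value % size for size in table_sizes]


class ToyCounttable:
    """Count-Min-style sketch: one 8-bit saturating counter per bucket."""

    def __init__(self, table_sizes):
        self.table_sizes = list(table_sizes)
        self.tables = [bytearray(size) for size in self.table_sizes]

    def add(self, kmer):
        for table, i in zip(self.tables, bucket_indices(kmer, self.table_sizes)):
            if table[i] < 255:
                table[i] += 1

    def nbytes(self):
        return sum(len(table) for table in self.tables)


class ToyNodetable:
    """Bloom-filter-style sketch: one bit per bucket."""

    def __init__(self, table_sizes):
        self.table_sizes = list(table_sizes)
        self.tables = [bytearray((size + 7) // 8) for size in self.table_sizes]

    def set_bucket(self, t, i):
        self.tables[t][i >> 3] |= 1 << (i & 7)

    def add(self, kmer):
        for t, i in enumerate(bucket_indices(kmer, self.table_sizes)):
            self.set_bucket(t, i)

    def get(self, kmer):
        return all(
            self.tables[t][i >> 3] & (1 << (i & 7))
            for t, i in enumerate(bucket_indices(kmer, self.table_sizes))
        )

    def nbytes(self):
        return sum(len(table) for table in self.tables)


def compose(counttables):
    """Build one nodetable from several counttables, bucket by bucket."""
    sizes = counttables[0].table_sizes
    if any(ct.table_sizes != sizes for ct in counttables):
        raise ValueError("all counttables must share the same table sizes")
    node = ToyNodetable(sizes)
    for ct in counttables:
        for t, table in enumerate(ct.tables):
            for i, count in enumerate(table):
                if count:  # any nonzero abundance becomes a presence bit
                    node.set_bucket(t, i)
    return node
```

The pass in `compose` is linear in the number of buckets rather than the number of reads. One caveat of this toy version: the composed table answers only presence/absence across all controls, so it fits a filter of the form "k-mer seen in any control" and cannot recover per-sample abundances.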

Jun 18 '18 20:06 standage
