TODO: save the search result in a serialized binary file for fast downstream parsing

Open shenwei356 opened this issue 10 months ago • 3 comments

The current tab-delimited search result format is redundant and inefficient to parse in kmcp profile, so a compact binary format could be used to save the temporary result.

  1. kmcp search: an optional flag -b/--binary-output would be added to choose the output format.
  2. A new command kmcp view should be added to convert the binary to plain text format.
  3. kmcp merge needs to be compatible with both plain and binary formats.
  4. kmcp profile needs to be compatible with both plain and binary formats.
#query qLen qKmers FPR hits target chunkIdx chunks tLen kSize mKmers qCov tCov jacc queryIdx
read_1 150 130 7.4626e-15 1 GCF_000007805.1 2 10 6397126 21 130 1.0000 0.0002 0.0002 0
read_2 150 130 7.4626e-15 1 GCF_000007805.1 8 10 6397126 21 130 1.0000 0.0002 0.0002 1
read_3 150 130 7.4626e-15 1 GCF_000003835.1 8 10 12115052 21 130 1.0000 0.0001 0.0001 2
read_4 150 130 7.4626e-15 1 GCF_000003835.1 3 10 12115052 21 130 1.0000 0.0001 0.0001 3
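
To make the idea concrete, here is a minimal Python sketch of how one row of the table above could be packed into a compact binary record and read back, which is roughly the round trip that kmcp search -b and kmcp view would perform. The byte layout, the field widths, and the names Match, parse_line, encode and decode are assumptions made for illustration only; they are not kmcp's actual format or API.

```python
import struct
from typing import NamedTuple, Tuple


class Match(NamedTuple):
    """One row of the plain-text search result (column order as in the header)."""
    query: str
    q_len: int
    q_kmers: int
    fpr: float
    hits: int
    target: str
    chunk_idx: int
    chunks: int
    t_len: int
    k_size: int
    m_kmers: int
    q_cov: float
    t_cov: float
    jacc: float
    query_idx: int


# Hypothetical little-endian layout for the numeric columns (not kmcp's real one):
# qLen u32, qKmers u32, FPR f64, hits u32, chunkIdx u16, chunks u16, tLen u64,
# kSize u8, mKmers u32, qCov f64, tCov f64, jacc f64, queryIdx u64
_FIXED = struct.Struct("<IIdIHHQBIdddQ")
_STRLEN = struct.Struct("<H")  # length prefix for the two string columns


def parse_line(line: str) -> Match:
    """Parse one text row; split() accepts tabs as well as the spaces shown above."""
    f = line.split()
    return Match(f[0], int(f[1]), int(f[2]), float(f[3]), int(f[4]), f[5],
                 int(f[6]), int(f[7]), int(f[8]), int(f[9]), int(f[10]),
                 float(f[11]), float(f[12]), float(f[13]), int(f[14]))


def encode(m: Match) -> bytes:
    """Serialize a record: length-prefixed query and target, then the fixed block."""
    q, t = m.query.encode("utf-8"), m.target.encode("utf-8")
    fixed = _FIXED.pack(m.q_len, m.q_kmers, m.fpr, m.hits, m.chunk_idx, m.chunks,
                        m.t_len, m.k_size, m.m_kmers, m.q_cov, m.t_cov, m.jacc,
                        m.query_idx)
    return _STRLEN.pack(len(q)) + q + _STRLEN.pack(len(t)) + t + fixed


def decode(buf: bytes, offset: int = 0) -> Tuple[Match, int]:
    """Read one record starting at offset; return it with the next record's offset."""
    (qn,) = _STRLEN.unpack_from(buf, offset)
    offset += _STRLEN.size
    query = buf[offset:offset + qn].decode("utf-8")
    offset += qn
    (tn,) = _STRLEN.unpack_from(buf, offset)
    offset += _STRLEN.size
    target = buf[offset:offset + tn].decode("utf-8")
    offset += tn
    (q_len, q_kmers, fpr, hits, chunk_idx, chunks, t_len, k_size, m_kmers,
     q_cov, t_cov, jacc, query_idx) = _FIXED.unpack_from(buf, offset)
    offset += _FIXED.size
    return Match(query, q_len, q_kmers, fpr, hits, target, chunk_idx, chunks,
                 t_len, k_size, m_kmers, q_cov, t_cov, jacc, query_idx), offset


row = "read_1 150 130 7.4626e-15 1 GCF_000007805.1 2 10 6397126 21 130 1.0000 0.0002 0.0002 0"
record = encode(parse_line(row))
decoded, next_offset = decode(record)
```

Because decode returns the offset of the next record, a reader can walk a file of concatenated records without splitting text or converting numbers from strings. A real format would probably also store per-query columns (qLen, qKmers, FPR, hits) only once per query rather than once per match, which is where most of the redundancy in the text format sits.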

shenwei356, Sep 04 '23 01:09