khmer
Command: k-mer abundance histogram
We have downloaded khmer version 2.1.1 and now want to benchmark computing the full k-mer abundance histogram.
Is the following command correct for obtaining the full k-mer abundance histogram on a dataset such as human HS3?
./abundance-dist-single.py -k 25 -T 12 input.fastq output_histo
Hi @taranglute.
Your command looks mostly correct, although:
- It's not typical to run commands from the `scripts/` directory, so the `./` prefix may not make sense. Did you follow the latest installation instructions? If so, you should be able to execute `abundance-dist-single.py` from any directory.
- The command uses a constant amount of memory, and by default that amount is very small. For human whole-genome shotgun data, dozens of gigabytes of memory must be available if you want accurate k-mer abundances. For example, to allocate 32 gigabytes for computing the k-mer counts, add `-M 32G` to the command.
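Putting both suggestions together, the adjusted invocation might look like the sketch below. It assumes khmer is installed so that `abundance-dist-single.py` is on your `PATH`, and it reuses the placeholder file names from the question. The shell variable names are only for illustration. The snippet prints the command as a dry run instead of executing it.

```shell
# Settings taken from this thread; adjust them to your machine and data.
KSIZE=25       # k-mer size (-k)
THREADS=12     # number of threads (-T)
MEMORY=32G     # memory to allocate for k-mer counting (-M)

# No ./ prefix: assumes the script is installed and on PATH.
CMD="abundance-dist-single.py -k $KSIZE -T $THREADS -M $MEMORY input.fastq output_histo"

# Dry run: show the command that would be executed.
echo "$CMD"
```

Once the printed command looks right, run it directly in your shell.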