mashtree
Unexpected mashtree parameters for bootstrap replicates
Describe the bug
It looks like some mashtree parameters, such as --kmerlength and --sketch-size, are not propagated to mash sketch when running the bootstrap replicates.
To Reproduce
Steps to reproduce the behavior:
I first suspected an issue when I got an error while running mashtree_bootstrap.pl. The error itself was my fault and has been resolved, but I noticed messages like this in the error log:
mashtree: mashSketch(TID1): ERROR running mash sketch -S 1453011824 -k 21 -s 10000 -o /var/tmp/MASHTREE_BOOTSTRAP.vlWjIT/3/files.fa files.fa 2>&1!
This message was unexpected because I had specified --kmerlength 22 --sketch-size 1000000. I confirmed from the log that these parameters were passed correctly to the initial mashtree run:
mashtree --outmatrix /var/tmp/MASHTREE_BOOTSTRAP.9F2rzJ/observeddistances.tsv.tmp --tempdir /var/tmp/MASHTREE_BOOTSTRAP.9F2rzJ/observed --numcpus 48 --genomesize 100000 --kmerlength 22 --mindepth 1 --sketch-size 1000000 <my input fasta files > /var/tmp/MASHTREE_BOOTSTRAP.9F2rzJ/observed.dnd.tmp
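One quick way to compare the two log lines is to pull the flag values out of each logged command. The following is only a minimal sketch: the helper name flag_value is hypothetical, and the sample text is the mash sketch command copied from the error message above.

```shell
# Hypothetical helper: print the value that follows a given flag
# in a logged command line.
#   $1 = flag to look for (e.g. -k or --kmerlength)
#   $2 = the command line text copied from the log
flag_value() {
  printf '%s\n' "$2" | tr ' ' '\n' |
    awk -v flag="$1" 'found { print; exit } $0 == flag { found = 1 }'
}

# The mash sketch command from the error message in this report.
sketch_line='mash sketch -S 1453011824 -k 21 -s 10000 -o /var/tmp/MASHTREE_BOOTSTRAP.vlWjIT/3/files.fa files.fa'

flag_value -k "$sketch_line"   # prints 21
flag_value -s "$sketch_line"   # prints 10000
```

Running the same helper with --kmerlength and --sketch-size on the initial mashtree command makes the mismatch (22 vs 21, 1000000 vs 10000) easy to spot.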
Expected behavior
I would expect the mashtree runs that use different seeds for bootstrapping to use the same parameters as the initial run.
Additional context
I may be misunderstanding the code or log files, but I think the issue could be due to this snippet of code:
https://github.com/lskatz/mashtree/blob/master/bin/mashtree_bootstrap.pl#L160-L172
The parameter $mashtreeOptions doesn't appear to be used?
Thank you very much for making this great software!
I experienced this too, but as described in Issue #63, this behaviour is fixed by putting --kmerlength and the other sketch-related parameters after a double dash (--). Something like:
mashtree_bootstrap.pl --reps 1000 --numcpus 48 input_files/* -- --min-depth 0 --kmerlength 5 > out.tre
It then ran exactly as expected for me!
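For anyone unfamiliar with the double-dash convention, here is a rough shell sketch of the general idea: everything after the first -- is set aside to be handed on to the inner command. This only illustrates the convention, not how mashtree_bootstrap.pl actually parses its arguments, and the function name args_after_double_dash is hypothetical.

```shell
# Hypothetical illustration of the "--" convention; this is NOT
# mashtree_bootstrap.pl's real argument handling.
# Prints every argument that follows the first "--", one per line.
args_after_double_dash() {
  seen=0
  for arg in "$@"; do
    if [ "$seen" -eq 1 ]; then
      # Already past the separator: this argument is passed through.
      printf '%s\n' "$arg"
    elif [ "$arg" = "--" ]; then
      # First separator found: everything after it is passed through.
      seen=1
    fi
  done
}

# Same shape as the command above, with a placeholder input file name.
args_after_double_dash --reps 1000 --numcpus 48 input.fasta -- --min-depth 0 --kmerlength 5
```

With the arguments shown, only --min-depth, 0, --kmerlength and 5 are printed. That matches the behaviour described here: options placed before the double dash do not reach the sketch step.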