Add support for multi-FASTA files given (as mixtures) in screen

Open ndaniel opened this issue 4 years ago • 0 comments

As it stands, running `mash screen` on 100K mixtures is not feasible, because mash has to be launched 100K times, once per mixture. `mash dist` supports multi-FASTA files but has no winner-take-all strategy.

Therefore it would be great if `mash screen` supported multi-FASTA files as input, so that

mash screen queries.msh 1.fa
mash screen queries.msh 2.fa
mash screen queries.msh 3.fa
mash screen queries.msh 4.fa

could be given as

mash screen --multi-fasta-mixtures queries.msh total.fa

where

cat 1.fa 2.fa 3.fa 4.fa > total.fa

and the files 1.fa, 2.fa, 3.fa, and 4.fa each contain one sequence.
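
For comparison, here is a minimal sketch of the workaround available today, assuming `mash` is on the `PATH`. It splits `total.fa` back into one file per record and runs `mash screen -w` on each. The function names and the `mixture_N.fa` file naming are hypothetical, chosen only for illustration. It still launches mash once per record, which is exactly the overhead the requested option would remove.

```python
import subprocess
import tempfile
from pathlib import Path


def split_multi_fasta(multi_fasta, out_dir):
    """Write each record of a multi-FASTA file to its own file; return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    handle = None
    with open(multi_fasta) as src:
        for line in src:
            if line.startswith(">"):
                # A header line starts a new record, so start a new file.
                if handle is not None:
                    handle.close()
                path = out_dir / f"mixture_{len(paths) + 1}.fa"  # hypothetical naming
                paths.append(path)
                handle = open(path, "w")
            if handle is not None:  # ignore anything before the first header
                handle.write(line)
    if handle is not None:
        handle.close()
    return paths


def screen_command(sketch, mixture, threads=1):
    """Build one mash screen call (-w: winner-take-all, -p: threads)."""
    return ["mash", "screen", "-w", "-p", str(threads), str(sketch), str(mixture)]


def screen_each_record(sketch, multi_fasta, threads=1):
    """Run mash screen once per record; return {record file name: raw stdout}."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        for path in split_multi_fasta(multi_fasta, tmp):
            proc = subprocess.run(
                screen_command(sketch, path, threads),
                capture_output=True, text=True, check=True,
            )
            results[path.name] = proc.stdout
    return results
```

Calling `screen_each_record("queries.msh", "total.fa")` would then return one block of `mash screen` output per sequence in `total.fa`, keyed by the temporary file name.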

In effect, this feature would make `mash screen` behave like "`mash dist` with a winner-take-all strategy". It could therefore also be implemented the other way around, by adding a winner-take-all strategy to `mash dist` (i.e. `mash dist --winner-take-all`).

ndaniel · Nov 23 '19 09:11