Too many FASTA files with spacers

Open aziele opened this issue 1 year ago • 2 comments

Hi,

I'm encountering an issue while attempting to run the command:

spacepharer easy-predict spacers/*.fna targetSetDB predictions.tsv temp/

I have approximately 35,000 FASTA files containing spacers, each corresponding to a different bacterial genome. However, when executing the command, I receive the error message:

sh: 1: spacepharer: Argument list too long

The shell expands spacers/*.fna into roughly 35,000 paths, which exceeds the system's limit on argument list length. Is there a workaround for this limitation? Any advice would be greatly appreciated.

aziele avatar May 06 '24 06:05 aziele

I pushed a change that should allow you to pass either a directory or a list of spacer file paths to easy-predict. The list file must have the file extension .tsv.

spacepharer easy-predict spacers targetSetDB predictions.tsv temp/ --file-include ".fna$"
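For the file-list variant, the list can be built without expanding a glob on the command line, since find walks the directory itself. The following is only a sketch: it assumes the list holds one path per line (the thread does not spell out the format), and the helper name list_spacer_files and the output name spacer_files.tsv are made up for illustration.

```shell
# Hypothetical helper: write every *.fna path under a directory to a list file.
# Assumption: the list file is expected to contain one path per line.
list_spacer_files() {
    # find traverses the directory itself, so the shell never has to pass
    # tens of thousands of paths as arguments to a single command.
    find "$1" -type f -name '*.fna' | sort > "$2"
}

if [ -d spacers ]; then
    list_spacer_files spacers spacer_files.tsv
    # Sanity check: number of spacer files found.
    wc -l < spacer_files.tsv
fi

# The list would then presumably take the place of the directory argument:
# spacepharer easy-predict spacer_files.tsv targetSetDB predictions.tsv temp/
```

The spacepharer call is left commented out because it depends on the pre-release build mentioned in this reply.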

You can download precompiled binaries here: https://mmseqs.com/spacepharer

If you don't want to update to a pre-release version, you should be able to do the following:

tar -cvf spacers.tar spacers/
spacepharer tar2db spacers.tar spacers_fna_db
spacepharer createdb spacers_fna_db spacers_db
spacepharer createsetdb spacers_db spacers_set_db tmp --extractorf-spacer 1
spacepharer predictmatch spacers_set_db targetSetDB targetSetDB_rev predictions.tsv tmp
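Before running tar2db, it may be worth checking that the archive really contains all of the spacer files. A minimal sketch, assuming the files end in .fna as in the original command; the helper name count_fna_in_tar is hypothetical.

```shell
# Hypothetical helper: count archive members whose names end in .fna.
count_fna_in_tar() {
    # tar -t lists member names without extracting anything;
    # grep -c prints how many of them match the pattern.
    tar -tf "$1" | grep -c '\.fna$'
}

if [ -f spacers.tar ]; then
    # Should match the number of genomes (about 35,000 in this issue).
    count_fna_in_tar spacers.tar
fi
```

This workaround avoids the argument limit because tar receives only the directory name, not the individual file paths.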

milot-mirdita avatar May 08 '24 14:05 milot-mirdita

Thank you for the update. Everything's working smoothly now.

aziele avatar May 09 '24 08:05 aziele