MMseqs2
Some jobs fail when several run at the same time
Expected Behavior
Running mmseqs search as an array of jobs should work.
Current Behavior
As a test, I began with an array of only 5 jobs. 2 of them failed, each with a different error message. When I run them alone, they work. This behaviour is similar to issue #239.
Steps to Reproduce (for bugs)
sarray -J mmseq --mail-type=ARRAY_TASKS,FAIL commandMMseqs --%=5
where commandMMseqs contains:
sbatch command_mmseq2_model.sbatch GCA_018105865.1 GCA_901001135.2
sbatch command_mmseq2_model.sbatch GCA_009193005.1 GCA_901001135.2
sbatch command_mmseq2_model.sbatch GCA_905160935.1 GCA_901001135.2
sbatch command_mmseq2_model.sbatch GCA_019095985.1 GCA_901001135.2
sbatch command_mmseq2_model.sbatch GCA_001703475.1 GCA_901001135.2
command_mmseq2_model.sbatch contains:
#!/bin/bash
#
#SBATCH -N 1 # number of nodes
#SBATCH -c 20 # number of cores on this node
#SBATCH --mem 50G # memory for all the cores
#SBATCH -J mmseq
module load system/Miniconda3-4.7.10
module load bioinfo/mmseqs2-v13.45111
mmseqs search copies/${1}.TEs.fasta.dbm copies/${2}.TEs.fasta.dbm mapCopies/mmseq2_${1}_vs_${2}_evalue-sDefault-maxSeq50.out tmp -s 5.7 --search-type 3 --threads 20 --max-seqs 50
mmseqs filterdb mapCopies/mmseq2_${1}_vs_${2}_evalue-sDefault-maxSeq50.out mapCopies/mmseq2_${1}_vs_${2}_evalue-sDefault-maxSeq50.out.bestHit --extract-lines 1
mmseqs convertalis copies/${1}.TEs.fasta.dbm copies/${2}.TEs.fasta.dbm mapCopies/mmseq2_${1}_vs_${2}_evalue-sDefault-maxSeq50.out.bestHit mapCopies/mmseq2_${1}_vs_${2}_evalue-sDefault-maxSeq50.out.bestHit.tab
rm mapCopies/mmseq2_${1}_vs_${2}_evalue-sDefault-maxSeq50.out.*[0-9]* &
awk '{if ($3>=0.75 && $4>=300 && $12>=200) print $0}' mapCopies/mmseq2_${1}_vs_${2}_evalue-sDefault-maxSeq50.out.bestHit.tab > mapCopies/mmseq2_${1}_vs_${2}_evalue-sDefault-maxSeq50.out.bestHit.tab.filtered
rm mapCopies/mmseq2_${1}_vs_${2}_evalue-sDefault-maxSeq50.out.bestHit.tab
MMseqs Output (for bugs)
One job fails with Could not delete /work/jpeccoud/HeloiseMuller/tmp/latest!
Another job fails with Could not create symlink of tmp/14012808946536109652!
Context
I suppose some jobs try to overwrite each other in tmp, as in issue #239? Since you were able to fix it for mmseqs rbh, I thought it should be fixable for mmseqs search too.
Try giving every job a unique tmp folder (e.g., tmp_${SLURM_JOB_ID}_${SLURM_ARRAY_TASK_ID}).
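As a sketch, the tmp argument inside command_mmseq2_model.sbatch could be built from the Slurm environment variables; the fallback values after `:-` are only for illustration (Slurm sets the real ones at runtime), and the commented mmseqs line just mirrors the search call above:

```shell
#!/bin/bash
# Give each array task its own tmp directory so concurrent jobs don't
# clobber each other's tmp/latest symlink.
SLURM_JOB_ID=${SLURM_JOB_ID:-12345}            # set by Slurm at runtime
SLURM_ARRAY_TASK_ID=${SLURM_ARRAY_TASK_ID:-1}  # set by Slurm at runtime
TMPDIR_JOB="tmp_${SLURM_JOB_ID}_${SLURM_ARRAY_TASK_ID}"
mkdir -p "$TMPDIR_JOB"
echo "$TMPDIR_JOB"
# mmseqs search copies/${1}.TEs.fasta.dbm copies/${2}.TEs.fasta.dbm \
#     mapCopies/... "$TMPDIR_JOB" -s 5.7 --search-type 3 --threads 20 --max-seqs 50
```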
MMseqs2 also doesn't have a good way to set a total memory limit, but you can approximate one with --split-memory-limit. This should be about 80% of the memory you want MMseqs2 to use (in your case about 40GB, so --split-memory-limit 40G). This is relevant if other jobs are running on the same node too, as MMseqs2 will generally try to use all available memory.
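A minimal sketch of the 80% rule, deriving the limit from the allocation instead of hard-coding it (the MEM_GB variable is an assumption here, matching the #SBATCH --mem 50G line in the script above):

```shell
#!/bin/bash
# Compute ~80% of the Slurm memory allocation for --split-memory-limit.
MEM_GB=50                          # matches "#SBATCH --mem 50G"
LIMIT_GB=$(( MEM_GB * 80 / 100 ))  # 80% rule of thumb
echo "--split-memory-limit ${LIMIT_GB}G"
# Then pass it to the search, e.g.:
# mmseqs search ... --split-memory-limit ${LIMIT_GB}G
```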
It works like that, thank you for your fast reply!