make_lastz_chains

The number of tasks submitted by SLURM exceeded the limit

Open aaannaw opened this issue 7 months ago • 16 comments

Hello, Professor. I was running the pipeline to align my genome assembly against the mm10 genome via SLURM:

./make_chains.py target query mm10.fasta Bsu.softmask.fasta --pd mm-Bsu -f --chaining_memory 30 --cluster_queue pNormal --executor slurm --nextflow_executable /data/01/user157/software/bin/nextflow

A few minutes after launching the command, I hit this error:

[fe/5bafab] NOTE: Error submitting process 'execute_jobs (206)' for execution -- Execution is retried (3)
[ff/d8223b] NOTE: Error submitting process 'execute_jobs (212)' for execution -- Execution is retried (3)
[4a/34ad45] NOTE: Error submitting process 'execute_jobs (209)' for execution -- Execution is retried (3)
ERROR ~ Error executing process > 'execute_jobs (91)'

Caused by:
  Failed to submit process to grid scheduler for execution

Command executed:
  sbatch .command.run

Command exit status:
  1

Command output:
  sbatch: error: QOSMaxSubmitJobPerUserLimit
  sbatch: error: Batch job submission failed: Job violates accounting/QOS policy (job submit limit, user's size and/or time limits)

Work dir:
  /data/01/p1/user157/software/make_lastz_chains/mm-Bsu/temp_lastz_run/work/23/a09dba9e82d536f1f39b26de92d7d0

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`

 -- Check '.nextflow.log' file for details

The error occurs because our server caps the number of submitted jobs per user at 100, and the default chunk size generates 1955 jobs, far over that limit. I therefore tried to increase the chunk size:

./make_chains.py target query mm10.fasta Bsu.softmask.fasta --pd mm-Bsu -f --chaining_memory 30 --cluster_queue pNormal --executor slurm --nextflow_executable /data/01/user157/software/bin/nextflow --seq1_chunk 500000000 --seq2_chunk 500000000

However, this still generated 270 jobs, which surprised me. Digging in, I found that when the assembly contains many scaffolds, at most 100 scaffolds are placed into a single chunk, even when their combined length is far below the chunk size (a sketch of this behavior is given below). I don't understand why this happens. In any case, I think there should be a way, without increasing the chunk size further (as I understand it, a larger chunk size increases the runtime), to run multiple command lines per submitted task, so that all 1955 commands could be completed with fewer than 100 submitted jobs (see the config sketch below for one possible angle). Looking forward to your suggestions!

Best wishes,
Na Wan
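For illustration only (this is not the project's actual code): a minimal Python sketch of greedy, size-based chunking with a hypothetical per-chunk cap of 100 sequences. Under that assumption, once scaffolds are small and numerous, the job count is driven by the sequence cap rather than by --seq1_chunk/--seq2_chunk, which would reproduce the behavior described above.

```python
# Hypothetical sketch, NOT make_lastz_chains source code: greedy chunking
# with both a size limit and an assumed per-chunk cap of 100 sequences.

def chunk_scaffolds(scaffold_lengths, chunk_size, seq_limit=100):
    """Pack (name, length) pairs into chunks, closing a chunk when adding
    the next scaffold would exceed chunk_size, or when the chunk already
    holds seq_limit sequences (the assumed cap)."""
    chunks, current, current_len = [], [], 0
    for name, length in scaffold_lengths:
        if current and (current_len + length > chunk_size or len(current) >= seq_limit):
            chunks.append(current)
            current, current_len = [], 0
        current.append(name)
        current_len += length
    if current:
        chunks.append(current)
    return chunks

# 10,000 scaffolds of 50 kb each with a 500 Mb chunk size still yield
# 100 chunks: the sequence cap closes every chunk long before the size does.
scaffolds = [(f"scaffold_{i}", 50_000) for i in range(10_000)]
print(len(chunk_scaffolds(scaffolds, chunk_size=500_000_000)))  # -> 100
```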
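Separately, Nextflow itself can throttle how many jobs it keeps on the scheduler at once through its standard executor scope (queueSize, submitRateLimit). A minimal sketch of a user-level config, assuming the run launched by make_lastz_chains picks up ~/.nextflow/config unmodified (worth verifying in your setup):

```groovy
// e.g. in ~/.nextflow/config; these are standard Nextflow executor settings
executor {
    queueSize       = 90          // keep fewer than 100 jobs queued at once
    submitRateLimit = '10/1min'   // optional: at most 10 sbatch calls per minute
}
```

With queueSize set below the QOS cap, Nextflow would hold back submissions until queued jobs finish, so all 1955 tasks could drain through without ever exceeding the 100-job limit.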

aaannaw · Jul 11 '24 09:07