pyslurm
Submission of a batch job fails when the "work_dir" argument contains a "_"
Details
- Slurm Version: 20.02.5
- Python Version: 3.10.8
- Cython Version: 0.29.34
- PySlurm Branch: 20-02-5
- Linux Distribution: Linux version 3.10.0-1160.el7.x86_64 CentOS Linux release 7.9.2009
UPDATE
This bug does not seem to be caused only by the values in the slurm_job dict. I got the same error after deleting "work_dir" from the dict. Maybe it has something to do with submitting the job from Jupyter Lab? I do not know.
Issue
When I submit a batch job using the job().submit_batch_job function and specify a "work_dir" key whose value contains underscores (_), the job gets submitted but immediately fails. Checking the submitted job with the job().find_id function, I found that the "work_dir" attribute had been encoded as garbled text such as "wly�U". When I resubmitted the job with the underscores removed from the work_dir, the issue did not reoccur. I suspect this might be due to "_" being replaced by "-" when the Slurm interface is called.
An example which reproduces this bug:
    Job1 = {'wrap': 'echo a;sleep 15; echo b', 'job_name': 'test', 'partition': 'all', 'ntasks': 1, 'cpus_per_task': 1, 'work_dir': '/home/boo/slurm_jobs'}
    job().submit_batch_job(Job1)
And an example which works well:
    Job2 = {'wrap': 'echo a;sleep 15; echo b', 'job_name': 'test', 'partition': 'all', 'ntasks': 1, 'cpus_per_task': 1, 'work_dir': '/home/boo/slurmjobs'}
    job().submit_batch_job(Job2)
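For anyone trying to reproduce this, the check done by hand above (compare the submitted "work_dir" with what find_id reports) could be sketched roughly as below. This is only a sketch under assumptions: the helper name work_dir_mismatches is hypothetical, and it assumes that find_id returns a list of dicts carrying "work_dir" and "job_id" keys, which may not hold for every pyslurm version. The records here are stand-ins so the snippet runs without a Slurm cluster.

```python
def work_dir_mismatches(submitted, reported_records):
    """Return (job_id, reported_work_dir) pairs that differ from the submitted work_dir.

    `submitted` is the dict passed to submit_batch_job; `reported_records`
    is assumed to be a list of dicts like the ones find_id returns.
    """
    expected = submitted.get("work_dir")
    mismatches = []
    for record in reported_records:
        reported = record.get("work_dir")
        if reported != expected:
            # "job_id" is an assumed key name, used only for reporting.
            mismatches.append((record.get("job_id"), reported))
    return mismatches


job_desc = {"wrap": "echo a;sleep 15; echo b", "work_dir": "/home/boo/slurm_jobs"}

# Stand-in records; on a real cluster these would come from something like
#   job_id = pyslurm.job().submit_batch_job(job_desc)
#   records = pyslurm.job().find_id(job_id)
healthy = [{"job_id": 1, "work_dir": "/home/boo/slurm_jobs"}]
garbled = [{"job_id": 2, "work_dir": "wly\ufffdU"}]

print(work_dir_mismatches(job_desc, healthy))
print(work_dir_mismatches(job_desc, garbled))
```

An empty list means the reported directory matches what was submitted; any entry points at a job whose "work_dir" was altered somewhere between the dict and Slurm.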
Hi,
you are probably seeing an issue similar to the one mentioned in #260.
In newer versions of pyslurm (starting with 21.08), the Job-Submission API was substantially reworked (see the docs here), and the pyslurm.job class has been declared deprecated.
Since that new API is not available for 20.02 yet, I can try to backport it. But it may take some time, because of changes that may have been introduced in newer Slurm versions over the years.