slurm-drmaa
drmaa.errors.Conflicting in job submission
Dear team,
We have the following system environment:
$ cat /etc/redhat-release
CentOS Linux release 7.9.2009 (Core)
$ sbatch --version
slurm 23.02.2
$ which sbatch
/opt/slurm/cluster/ibex/install-v2/RedHat-7/bin/sbatch
We compiled the latest version of slurm-drmaa as follows:
$ git clone --recursive https://github.com/natefoo/slurm-drmaa.git
$ cd slurm-drmaa/
$ autoconf
$ ./autogen.sh --with-slurm-inc=/opt/slurm/cluster/ibex/install-v2/RedHat-7/include --with-slurm-lib=/opt/slurm/cluster/ibex/install-v2/RedHat-7/lib --prefix=/ibex/sw/csi/slurm-drmaa/1.2.0-dev/el7.9_gnu6.4.0/install
$ ./configure --prefix=/ibex/sw/csi/slurm-drmaa/1.2.0-dev/el7.9_gnu6.4.0/install --with-slurm-inc=/opt/slurm/cluster/ibex/install-v2/RedHat-7/include --with-slurm-lib=/opt/slurm/cluster/ibex/install-v2/RedHat-7/lib
$ make
$ make install
$ pip install drmaa
We defined the following environment variables:
export DRMAA_LIBRARY_PATH=/ibex/sw/csi/slurm-drmaa/1.2.0-dev/el7.9_gnu6.4.0/install/lib/libdrmaa.so.1
export PATH=/ibex/sw/csi/slurm-drmaa/1.2.0-dev/el7.9_gnu6.4.0/install/bin:$PATH
export LD_LIBRARY_PATH=/ibex/sw/csi/slurm-drmaa/1.2.0-dev/el7.9_gnu6.4.0/install/lib:$LD_LIBRARY_PATH
When I try to submit a job through the DRMAA Python bindings, the submission fails. Here is the error output:
Creating job template
Traceback (most recent call last):
  File "runme.py", line 23, in <module>
    main()
  File "runme.py", line 16, in main
    jobid = s.runJob(jt)
  File "/ibex/sw/csi/slurm-drmaa/1.2.0-dev/el7.9_gnu6.4.0/miniconda2/lib/python2.7/site-packages/drmaa/session.py", line 314, in runJob
    c(drmaa_run_job, jid, sizeof(jid), jobTemplate)
  File "/ibex/sw/csi/slurm-drmaa/1.2.0-dev/el7.9_gnu6.4.0/miniconda2/lib/python2.7/site-packages/drmaa/helpers.py", line 302, in c
    return f(*(args + (error_buffer, sizeof(error_buffer))))
  File "/ibex/sw/csi/slurm-drmaa/1.2.0-dev/el7.9_gnu6.4.0/miniconda2/lib/python2.7/site-packages/drmaa/errors.py", line 151, in error_check
    raise _ERRORS[code - 1](error_string)
drmaa.errors.ConflictingAttributeValuesException: code 15: drmaa_join_files is set and output file is not given
Here is the Python code:
$ cat runme.py
#!/usr/bin/env python

import drmaa
import os

def main():
    """
    Submit a job.
    Note, need file called sleeper.sh in current directory.
    """
    with drmaa.Session() as s:
        print('Creating job template')
        jt = s.createJobTemplate()
        jt.remoteCommand = os.path.join(os.getcwd(), 'sleeper.sh')
        jt.args = ['42', 'Simon says:']
        jt.joinFiles=True

        jobid = s.runJob(jt)
        print('Your job has been submitted with ID %s' % jobid)

        print('Cleaning up')
        s.deleteJobTemplate(jt)

if __name__=='__main__':
    main()
Could you please advise on a solution? Are we missing something? Any suggestions are much appreciated!
Thanks and Regards, Naga
Dear all, we have found a workaround:
export DRMAA_LIBRARY_PATH=lib.slurm-22.05.6/libdrmaa.so.1
export LD_LIBRARY_PATH=lib.slurm-22.05.6:$LD_LIBRARY_PATH
With these settings we were able to launch jobs using slurm-drmaa.
Thanks to @wickhagj for fixing the bug in slurm-drmaa (https://github.com/natefoo/slurm-drmaa/commit/1f5db98cd788677e7a94b93cb56bc00266539fe2).
The modifications are included in the master repo.
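When switching between library builds like this, it can help to confirm which libdrmaa the Python bindings will be pointed at before opening a session. The following is a minimal sketch, assuming only that drmaa-python honours the DRMAA_LIBRARY_PATH environment variable; the helper name drmaa_library_from_env is hypothetical and not part of any library.

```python
import os


def drmaa_library_from_env(environ=os.environ):
    """Return the resolved libdrmaa path named by DRMAA_LIBRARY_PATH.

    Hypothetical helper for illustration; it only inspects the environment
    and the filesystem and does not load the library.
    """
    path = environ.get("DRMAA_LIBRARY_PATH")
    if not path:
        raise RuntimeError("DRMAA_LIBRARY_PATH is not set")
    if not os.path.isfile(path):
        raise RuntimeError("DRMAA_LIBRARY_PATH is not a file: %s" % path)
    # Resolve symlinks so that libdrmaa.so.1 shows the real build behind it.
    return os.path.realpath(path)
```

Calling drmaa_library_from_env() in the shell where the exports were made and printing the result shows whether the intended build is the one that will be used.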
If you define a job output file (jt.outputPath = "/foo"), does this avoid the error?
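To make that suggestion concrete, here is a minimal sketch of setting an output path together with joinFiles on a job template. It assumes the DRMAA 1.0 convention that file paths are written as "[hostname]:path" (hence the leading colon); whether a given slurm-drmaa build also accepts a path without the colon is not confirmed here. The helper with_joined_output and the stand-in class FakeTemplate are hypothetical and used only so the snippet runs without a DRMAA library; with a real session, pass the object returned by s.createJobTemplate().

```python
import os


def with_joined_output(jt, log_path):
    """Set joinFiles together with an explicit stdout path on a job template.

    Hypothetical helper: jt is any object exposing the drmaa-python
    JobTemplate attributes outputPath and joinFiles.
    """
    # Assumption: DRMAA paths take the form "[hostname]:path"; an empty
    # hostname (leading colon only) is used here.
    jt.outputPath = ":" + os.path.abspath(log_path)
    jt.joinFiles = True  # stderr is merged into the stdout file set above
    return jt


class FakeTemplate(object):
    """Stand-in for a drmaa JobTemplate, for illustration only."""
    pass


jt = with_joined_output(FakeTemplate(), "/tmp/sleeper.out")
print(jt.outputPath, jt.joinFiles)
```

In runme.py this would replace the bare jt.joinFiles=True line, so that the template carries an output file before s.runJob(jt) is called.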