ert
Verify that the selected queue type can be used
Describe the bug
When the cluster queue system is misconfigured, ert prints an unhelpful traceback for every failed submission. It would be nice if ert could check that the selected driver can be used before trying to submit jobs.
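One possible shape for such a check, as a rough sketch only: run a single cheap command against the queue system before any realisation is submitted, and turn a failure into one short message instead of a traceback per job. The function name `probe_command` is hypothetical and not part of ert; which command is a suitable probe for each driver is left open here.

```python
import subprocess
from typing import List, Optional


def probe_command(cmd: List[str], timeout: float = 10.0) -> Optional[str]:
    """Run `cmd` once. Return None on success, else a one-line error message.

    Hypothetical helper for illustration; not an existing ert function.
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout,
            check=False,
        )
    except subprocess.TimeoutExpired:
        return f"No answer within {timeout} s from: {' '.join(cmd)}"
    except OSError as err:
        # Covers a missing or non-executable binary.
        return f"Could not start {cmd[0]}: {err}"

    if result.returncode != 0:
        stderr = result.stderr.decode(errors="replace").strip()
        return (
            f"'{' '.join(cmd)}' exited with code {result.returncode}: "
            f"{stderr or '<empty>'}"
        )
    return None
```

A caller could run this once when the experiment starts and refuse to submit anything if it returns a message, so the user sees the problem a single time rather than once per realisation.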
To reproduce
Steps to reproduce the behaviour:
- Connect to an Equinor Azure node
- ert gui my_config.ert
- Run experiment (IES/Smoother/ESMDA/Test)
- …
Expected behaviour
A better error message.
Screenshots
The following is printed in the terminal once for every qsub call (so at least once per realisation):
Command "/opt/pbs/bin/qsub -rn -Nstress.ert-1 -q short -o /dev/null -e /dev/null -l select=1:ncpus=1" failed with exit code 160, output: "<empty>", and error: Unknown Host.
qsub: cannot connect to server Please (errno=15008)"
Exception in scheduler task job-1_task: Command "/opt/pbs/bin/qsub -rn -Nstress.ert-1 -q short -o /dev/null -e /dev/null -l select=1:ncpus=1" failed with exit code 160, output: "<empty>", and error: "Unknown Host.
qsub: cannot connect to server Please (errno=15008)"
Traceback: Traceback (most recent call last):
  File "/usr/lib64/python3.8/asyncio/events.py", line 81, in _run
    self._context.run(self._callback, *self._args)
  File "/prog/komodo/2024.06.rc1-py38-rhel8/root/lib64/python3.8/site-packages/_ert/async_utils.py", line 53, in _done_callback
    raise exc
  File "/prog/komodo/2024.06.rc1-py38-rhel8/root/lib64/python3.8/site-packages/ert/scheduler/job.py", line 131, in run
    await self._submit_and_run_once(sem)
  File "/prog/komodo/2024.06.rc1-py38-rhel8/root/lib64/python3.8/site-packages/ert/scheduler/job.py", line 99, in _submit_and_run_once
    await self.driver.submit(
  File "/prog/komodo/2024.06.rc1-py38-rhel8/root/lib64/python3.8/site-packages/ert/scheduler/openpbs_driver.py", line 214, in submit
    raise RuntimeError(process_message)
RuntimeError: Command "/opt/pbs/bin/qsub -rn -Nstress.ert-1 -q short -o /dev/null -e /dev/null -l select=1:ncpus=1" failed with exit code 160, output: "<empty>", and error: "Unknown Host.
qsub: cannot connect to server Please (errno=15008)"
Environment
- ERT/Komodo release: Any
- Remote/HPC execution involved: yes
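Additional context
The reason is that the default PBS server in /etc/pbs.conf is a placeholder and cannot be used; the user is expected to set the server themselves. This is presumably also why qsub reports that it cannot connect to server "Please".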
$ cat /etc/pbs.conf
PBS_EXEC=/opt/pbs
PBS_SERVER=Please set SERVER_NAME in your environment
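
For this particular case, a static check of the configuration might already catch the problem. The sketch below parses the KEY=VALUE lines shown above and flags a PBS_SERVER value that is empty or contains whitespace, which a host name cannot. The function names are hypothetical, and the whitespace rule is an assumption made for illustration, not a full validation of what PBS accepts.

```python
from typing import Dict, Optional


def parse_pbs_conf(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    settings: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def pbs_server_problem(settings: Dict[str, str]) -> Optional[str]:
    """Return a message if PBS_SERVER is obviously unusable, else None.

    Assumption: a usable server name is non-empty and has no whitespace.
    """
    server = settings.get("PBS_SERVER", "")
    if not server:
        return "PBS_SERVER is not set"
    if any(ch.isspace() for ch in server):
        return f"PBS_SERVER does not look like a host name: {server!r}"
    return None
```

With the file contents above, `pbs_server_problem(parse_pbs_conf(...))` returns a message quoting the placeholder text, which could be shown to the user before any job is submitted.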