smartdispatch
An easy-to-use job launcher for supercomputers with a PBS-compatible job manager.
Now we have `qdel `, which enables us to terminate a specified job, and `qdel all` to kill all jobs. But when we have several job batches running, each of...
Add a `-d`, `--duplicate` command line argument that adds a `#PBS -t 0-4` directive, giving the option to launch duplicates of the same command. See: http://docs.adaptivecomputing.com/torque/3-0-5/2.1jobsubmission.php#jobarrays It is also...
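
A minimal sketch of how such an option could map to a Torque job-array directive. It assumes the flag takes the number of duplicates as an integer, which the issue text does not state; `build_parser` and `pbs_array_directive` are hypothetical names, not part of smartdispatch.

```python
import argparse


def build_parser():
    """Hypothetical parser exposing only the proposed option."""
    parser = argparse.ArgumentParser()
    # Assumption: the option takes the number of duplicates to launch.
    parser.add_argument("-d", "--duplicate", type=int, default=None, metavar="N",
                        help="launch N duplicates of the same command as a job array")
    return parser


def pbs_array_directive(n_duplicates):
    """Return the '#PBS -t' line for a zero-based job array of n_duplicates jobs."""
    if n_duplicates < 1:
        raise ValueError("need at least one duplicate")
    return "#PBS -t 0-{}".format(n_duplicates - 1)


if __name__ == "__main__":
    args = build_parser().parse_args(["-d", "5"])
    print(pbs_array_directive(args.duplicate))
```

With `-d 5` the sketch produces the `#PBS -t 0-4` line quoted above; how the directive gets written into the generated PBS file is left open here.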
Find a cleaner way to test the `open_with_lock` function. Right now we rely on causing race conditions using `time.sleep`. The procedure is as follows: 1. a file is locked in...
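
One possible direction, sketched under assumptions: instead of sleeping and hoping for a race, a test can probe the lock with a non-blocking `flock` call and assert on the result. The `open_with_lock` below is only a stand-in written for this example (the real function's signature and locking mechanism may differ), and `is_locked` is a hypothetical helper. The probe relies on `fcntl.flock`, so it applies to Unix systems and to local file systems where `flock` locks belong to the open file description.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def open_with_lock(path, mode="r+"):
    """Stand-in for the real function: hold an exclusive flock while the file is open."""
    with open(path, mode) as f:
        fcntl.flock(f, fcntl.LOCK_EX)
        try:
            yield f
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)


def is_locked(path):
    """Return True if an exclusive lock on path cannot be taken right now."""
    with open(path, "r") as probe:
        try:
            # LOCK_NB makes the call fail immediately instead of blocking.
            fcntl.flock(probe, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            return True
        fcntl.flock(probe, fcntl.LOCK_UN)
        return False


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open_with_lock(path):
            print("locked while open:", is_locked(path))
        print("locked after close:", is_locked(path))
    finally:
        os.remove(path)
```

Because the probe never blocks, the test outcome does not depend on timing.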
Add a command line option to specify with which RAPID a job should be run. Double check on the Compute Canada website that this is the proper way to name...
Most of this really needs to be thought through in parallel with the JobManager #91. - [ ] Only give the base path to the command manager and let it handle...
A viewer would be really nice to view, delete, and check the logs of running and stopped jobs.
This will allow us to add features such as #10, #93 and more. The status of a batch can be (Running, Done, Stopped). Here is an idea of what the option tree...
Create a JobManager in the style of the CommandManager that would probably include the functionalities of job_generator/job_generator_factory and some stuff from scripts/smart_dispatch.py. This might be a bigger task than it seems,...
As alluded to in #86, add the possibility to manage queues. - queue - info (QNAME | All) - add QNAME CORESPERNODE GPUSPERNODE RAMPERNODE MAXWALLTIME DEFAULTMODULES NODESINQ MINPPN - delete...
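
The `info` and `add` branches of this option tree could be expressed with nested `argparse` subparsers, roughly as sketched below. The argument types and the `All` default are assumptions made for illustration, `build_parser` is a hypothetical name, and the `delete` branch is omitted because its arguments are not given above.

```python
import argparse


def build_parser():
    """Hypothetical parser covering 'queue info' and 'queue add'."""
    parser = argparse.ArgumentParser()
    commands = parser.add_subparsers(dest="command")

    queue = commands.add_parser("queue", help="manage known queues")
    actions = queue.add_subparsers(dest="action")

    # queue info (QNAME | All): assume 'All' when no queue name is given.
    info = actions.add_parser("info")
    info.add_argument("qname", nargs="?", default="All")

    # queue add QNAME CORESPERNODE GPUSPERNODE RAMPERNODE MAXWALLTIME
    #           DEFAULTMODULES NODESINQ MINPPN
    add = actions.add_parser("add")
    add.add_argument("qname")
    add.add_argument("corespernode", type=int)   # type is an assumption
    add.add_argument("gpuspernode", type=int)    # type is an assumption
    add.add_argument("rampernode", type=int)     # type and unit are assumptions
    add.add_argument("maxwalltime")              # kept as a string, e.g. HH:MM:SS
    add.add_argument("defaultmodules")           # kept as a single string
    add.add_argument("nodesinq", type=int)       # type is an assumption
    add.add_argument("minppn", type=int)         # type is an assumption
    return parser


if __name__ == "__main__":
    # Example values are made up for the demonstration.
    args = build_parser().parse_args(
        ["queue", "add", "example_q", "12", "2", "64", "24:00:00", "cuda", "10", "1"])
    print(args.qname, args.corespernode, args.maxwalltime)
```

Keeping each action in its own subparser lets further branches be added later without touching the existing ones.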
If the user code does not have built-in resume, the current behaviour will be a problem. The default behaviour of the worker should be to run one command and then...