Support Slurm-over-ssh as job manager
I am interested in using Caliban to manage jobs on HPC systems. These typically have a job manager such as Slurm, and are accessed via ssh instead of a web API. Instead of Docker, they provide alternatives such as Singularity.
This draft pull request is a proof of concept to demonstrate the mechanics of using ssh, Slurm, and Singularity. The code is not yet well structured, and several constants are hard-coded for our in-house HPC system "Symmetry". At the moment, I am looking for a discussion of which functions dealing with Docker images (docker) and/or clusters (platform/cluster) should be generalized, and which should be rewritten in platform/slurm.
With this branch I can already submit code to our HPC system. Other cluster functionality is still missing.
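
For readers unfamiliar with the mechanics, here is a minimal sketch of what "Slurm over ssh with Singularity" can look like. It is not the code in this branch: the function names, the default job name, and the host name in the usage note are hypothetical and chosen only for illustration. It assumes that `ssh`, `sbatch`, and `singularity` are available on the respective machines, and it relies on `sbatch` reading the batch script from standard input when no script file is given.

```python
import shlex
import subprocess
from typing import List


def build_batch_script(image: str, command: List[str],
                       job_name: str = "caliban-job") -> str:
    """Return a Slurm batch script that runs `command` inside a Singularity image.

    `job_name` is a hypothetical default, not a Caliban setting.
    """
    # Quote each argument so that spaces or shell metacharacters survive.
    inner = " ".join(shlex.quote(arg) for arg in command)
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"singularity exec {shlex.quote(image)} {inner}",
        "",
    ]
    return "\n".join(lines)


def build_ssh_command(host: str) -> List[str]:
    """Return the local command that submits a script read from stdin."""
    # `--parsable` makes sbatch print just the job id (plus the cluster
    # name on multi-cluster setups) instead of a human-readable message.
    return ["ssh", host, "sbatch", "--parsable"]


def submit(host: str, script: str) -> str:
    """Pipe `script` to sbatch on `host` and return sbatch's output."""
    result = subprocess.run(
        build_ssh_command(host),
        input=script,
        text=True,
        capture_output=True,
        check=True,  # raise CalledProcessError if ssh or sbatch fails
    )
    return result.stdout.strip()
```

A call such as `submit("login.example.org", build_batch_script("image.sif", ["python", "train.py"]))` would then submit the job, where the host and image names are placeholders. Keeping script generation separate from the ssh call is one possible way to keep the Slurm-specific part testable without cluster access.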
Codecov Report
Merging #43 into master will decrease coverage by 0.89%. The diff coverage is 32.84%.
@@ Coverage Diff @@
## master #43 +/- ##
==========================================
- Coverage 55.56% 54.67% -0.90%
==========================================
Files 31 32 +1
Lines 3180 3316 +136
==========================================
+ Hits 1767 1813 +46
- Misses 1413 1503 +90
| Impacted Files | Coverage Δ | |
|---|---|---|
| caliban/platform/slurm/cli.py | 31.81% <31.81%> (ø) | |
| caliban/main.py | 27.17% <50.00%> (+1.03%) | :arrow_up: |
| caliban/docker/build.py | 32.71% <100.00%> (ø) | |
| caliban/util/auth.py | 76.19% <0.00%> (+9.52%) | :arrow_up: |
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 4b09430...ce28c78.