ClusterManagers.jl
Comprehensive tests
One of the issues (and one that became even more apparent during the 1.0 transition) is that it is really hard to test this package. Without CI, development is slow, since we are likely to break use cases that we can't test.
Ideally we could use docker environments to instantiate a "minimal" cluster environment in which we can then run tests.
As an example see:
- https://github.com/giovtorres/slurm-docker-cluster
You may be interested in borrowing from the CI set-up (based on docker-compose) we have in dask-jobqueue (deploy dask on HPC clusters). For now, we have CI for SGE, PBS and SLURM.
Oh, that is rather interesting! I see you are running it on Travis.
I managed to set up Travis testing on SlurmTools.jl using the docker images built by PySlurm: https://github.com/simonbyrne/SlurmTools.jl/blob/master/.travis.yml
I was also able to set up testing infrastructure on Travis for SlurmClusterManager using docker-compose to create a small cluster. It seems to work pretty well.
It would be nice to unite these efforts here. We all need cluster managers, and yet the current state of affairs is too fragmented to give a reliable experience with testing, etc.
What is the status of this? Is there still only Slurm testing? It seems PBS and SGE have been broken for a while: https://github.com/JuliaParallel/ClusterManagers.jl/issues/179
WIP PR for this: https://github.com/JuliaParallel/ClusterManagers.jl/pull/193