ClusterManagers.jl

Add the "auto-detect the current allocation" feature

Open DilumAluthge opened this issue 11 months ago • 3 comments

Most of the time, I think that users will only be working with a single cluster scheduler, and thus they can just use the relevant ClusterManager directly.

However, in some situations, I think that it might be useful to be able to write a single script that is agnostic to the specific cluster scheduler. In those cases, it would be nice to auto-detect which cluster scheduler is active, and then automatically use the correct ClusterManager.

This PR adds an experimental non-public addprocs_autodetect_current_scheduler() function that implements this.

Note: The addprocs_autodetect_current_scheduler() function should be run from inside an active allocation. For example, with Slurm you would first obtain an allocation (e.g. via sbatch or salloc), and then run this function inside that allocation.
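To illustrate the general idea (this is only a sketch, not the code in src/auto_detect.jl), one plausible way to detect the active scheduler is to look for the job-ID environment variables that schedulers set inside an allocation. The helper name `detect_scheduler` and the returned symbols are hypothetical and chosen for illustration; the sketch assumes that `SLURM_JOB_ID`, `LSB_JOBID`, and `PBS_JOBID` are set inside Slurm, LSF, and PBS jobs respectively.

```julia
# Hypothetical helper for illustration only; NOT the implementation in this PR.
# Guesses the active scheduler from environment variables that each scheduler
# sets inside a running job. `env` is a parameter so the logic can be tested
# without a real allocation; it defaults to the process environment.
function detect_scheduler(env::AbstractDict = ENV)
    # Slurm sets SLURM_JOB_ID inside sbatch/salloc allocations.
    haskey(env, "SLURM_JOB_ID") && return :slurm
    # LSF sets LSB_JOBID inside a job.
    haskey(env, "LSB_JOBID") && return :lsf
    # PBS/Torque sets PBS_JOBID inside a job.
    haskey(env, "PBS_JOBID") && return :pbs
    # No known marker found: most likely we are not inside an allocation.
    return nothing
end

scheduler = detect_scheduler()
if scheduler === nothing
    @warn "No active scheduler allocation detected"
else
    @info "Detected scheduler" scheduler
end
```

A real implementation would then map the detected scheduler to the matching ClusterManager and call addprocs with it. Returning `nothing` outside an allocation keeps the "must run inside an active allocation" requirement explicit rather than silently falling back to local workers.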

DilumAluthge, Jan 21 '25 00:01

If we have auto-detect in this package, but certain cluster backends live in separate packages (e.g. LSF), what should users do?

I imagine we might want to further split backends into their own packages but then the auto-detect would be even less useful.

Is the long-term goal to make this package an umbrella package? If so, maybe this pkg should (optionally) depend on LSF?

Moelf, Feb 03 '25 16:02

All good questions. At this point this PR is definitely very speculative, so everything is still up in the air.

At this point I'm not even sure if auto-detect is a good idea.

But if we do pursue auto-detect, then I think that yes, ClusterManagers.jl would need to take direct dependencies on LSFClusterManager.jl, SlurmClusterManager.jl, and any other external packages.

DilumAluthge, Feb 03 '25 17:02

Codecov Report

Attention: Patch coverage is 0% with 70 lines in your changes missing coverage. Please review.

Project coverage is 32.10%. Comparing base (a701c48) to head (2777104).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/auto_detect.jl | 0.00% | 70 Missing :warning: |
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #234      +/-   ##
==========================================
- Coverage   37.85%   32.10%   -5.75%     
==========================================
  Files           7        8       +1     
  Lines         391      461      +70     
==========================================
  Hits          148      148              
- Misses        243      313      +70     


codecov[bot], Feb 10 '25 00:02