flux-sched
provide reason a job is not scheduled
A question sysadmins and developers get often is "why is job X not running?"
It seems like Fluxion could provide insights that make this question easier to answer, perhaps even in the output of `flux jobs`.
Some reasons that we have to manually determine now include:
- waiting for higher priority jobs to be scheduled
- constraints provided for resources that are currently unavailable
- highest priority job, but waiting for resources to become available
A simple solution would be for Fluxion to return a `reason` or similar field in the scheduler annotations when it can provide one. This could be made available in `flux jobs`.
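As a rough illustration of the annotation idea, here is a minimal Python sketch of how a client could turn such a field into a readable message. Everything in it is an assumption made for illustration: the `reason` key under `sched`, the three reason codes (which mirror the cases listed above), and the helper name `pending_reason`. Fluxion does not emit this field today.

```python
import json

# Hypothetical reason codes, one per case listed in this issue.
REASONS = {
    "priority": "waiting for higher priority jobs to be scheduled",
    "constraints": "constraints request resources that are currently unavailable",
    "resources": "highest priority job, waiting for resources to become available",
}


def pending_reason(annotations):
    """Return a readable reason from a job's annotations, or None if absent."""
    # Annotations may be missing entirely, or may lack a "sched" section.
    sched = (annotations or {}).get("sched") or {}
    code = sched.get("reason")  # hypothetical key, not set by Fluxion today
    if code is None:
        return None
    # Unknown codes are passed through unchanged rather than dropped.
    return REASONS.get(code, str(code))


# Example annotations object as it might appear for a pending job.
example = json.loads('{"sched": {"reason": "priority"}}')
print(pending_reason(example))
```

Returning `None` when the field is absent matches the "when it can provide one" wording: the scheduler is not required to explain every pending job.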
Another, perhaps longer-term, solution would be to provide an RPC that exposes a snapshot of the current schedule, if one could be made available.