flux-sched

provide reason a job is not scheduled

Open grondo opened this issue 8 months ago • 0 comments

A question sysadmins and developers get often is "why is job X not running?"

It seems like Fluxion could provide insights to make this question easier to answer, perhaps even in the output of flux jobs. Some reasons that we currently have to determine manually include:

  • waiting for higher priority jobs to be scheduled
  • constraints provided for resources that are currently unavailable
  • highest priority job, but waiting for resources to become available

A simple solution would be for Fluxion to return a reason or similar field in the scheduler annotations when it can provide one. This could be made available in flux jobs.
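
To make the idea concrete, here is a rough sketch of how a reason could be derived and placed in an annotation. Everything in it is hypothetical: the `PendingJob` model, the reason strings, the `constraint_satisfiable` flag, and the `sched.reason` key are assumptions for illustration only, not existing Fluxion or flux-core API.

```python
from dataclasses import dataclass

# Hypothetical reason strings mirroring the three cases listed above.
REASON_PRIORITY = "priority"      # higher priority jobs are ahead in the queue
REASON_CONSTRAINT = "constraint"  # constraints match no currently available resource
REASON_RESOURCES = "resources"    # highest priority job, waiting for resources


@dataclass
class PendingJob:
    """Toy model of a pending job; not a real Fluxion data structure."""
    jobid: int
    priority: int
    constraint_satisfiable: bool  # assumed to be known to the scheduler


def pending_reason(job, queue):
    """Return a hypothetical reason string for why `job` is not running."""
    if not job.constraint_satisfiable:
        return REASON_CONSTRAINT
    # Any other pending job with a strictly higher priority goes first.
    if any(o.priority > job.priority for o in queue if o.jobid != job.jobid):
        return REASON_PRIORITY
    return REASON_RESOURCES


def annotate(job, queue):
    """Build an annotation dict carrying an assumed `sched.reason` key."""
    return {"sched": {"reason": pending_reason(job, queue)}}


queue = [
    PendingJob(jobid=1, priority=20, constraint_satisfiable=True),
    PendingJob(jobid=2, priority=10, constraint_satisfiable=True),
    PendingJob(jobid=3, priority=30, constraint_satisfiable=False),
]
annotations = {job.jobid: annotate(job, queue) for job in queue}
```

In this toy model the constraint check takes precedence over priority, so a job whose constraints cannot currently be met reports that first. A real implementation would have to decide how to rank reasons when several apply.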

Another, perhaps longer-term, solution would be to provide an RPC that returns a snapshot of the current schedule, if one could be made available.

grondo, Jun 06 '24 13:06