job-manager: implement RPC to partially replace sched.resource-status

Open garlick opened this issue 1 year ago • 1 comments

Problem: In flux-framework/flux-sched#1137 it was suggested that the job manager could take over duties from the sched.resource-status RPC to avoid delays when the scheduler is busy, since all the alloc/free requests pass through the job manager anyway.

Currently the response consists of three R objects: all, down, and allocated. I think all and down can be obtained from resource.status, so the job manager would only need to provide allocated.

Here is an example sched.resource-status response payload from a small test system:

{
  "all": {
    "version": 1,
    "execution": {
      "R_lite": [
        {
          "rank": "0-7",
          "children": {
            "core": "0-3"
          }
        }
      ],
      "nodelist": [
        "picl[0-7]"
      ],
      "properties": {
        "4g": "2,5",
        "8g": "0",
        "admin": "0",
        "batch": "3-7",
        "debug": "1-2",
        "testproperty": "7"
      },
      "starttime": 0,
      "expiration": 0
    }
  },
  "down": null,
  "allocated": {
    "version": 1,
    "execution": {
      "R_lite": [
        {
          "rank": "0",
          "children": {
            "core": "0-3"
          }
        }
      ],
      "nodelist": [
        "picl0"
      ],
      "properties": {
        "8g": "0",
        "admin": "0"
      },
      "starttime": 0,
      "expiration": 0
    }
  }
}
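To illustrate how a consumer might use the three R objects together, here is a minimal Python sketch that counts total, allocated, and free cores from a payload shaped like the one above. It is only an illustration: `parse_idset`, `core_map`, and `summarize` are hypothetical helpers, not part of flux-core; the idset parser handles only the simple comma/range form seen in this example; and treating every core on a down rank as unavailable is an assumption.

```python
def parse_idset(s):
    """Expand a simple idset string such as "0-3,7" into a set of ints."""
    ids = set()
    for part in s.split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        ids.update(range(int(lo), int(hi or lo) + 1))
    return ids


def core_map(R):
    """Map each rank to its set of core ids; a null R yields an empty map."""
    cores = {}
    if R is None:
        return cores
    for entry in R["execution"]["R_lite"]:
        ids = parse_idset(entry["children"].get("core", ""))
        for rank in parse_idset(entry["rank"]):
            cores.setdefault(rank, set()).update(ids)
    return cores


def summarize(payload):
    """Count cores, assuming cores on down ranks are never free."""
    all_ = core_map(payload["all"])
    down = core_map(payload["down"])
    alloc = core_map(payload["allocated"])
    free = {
        rank: ids - alloc.get(rank, set())
        for rank, ids in all_.items()
        if rank not in down
    }
    return {
        "total_cores": sum(len(ids) for ids in all_.values()),
        "allocated_cores": sum(len(ids) for ids in alloc.values()),
        "free_cores": sum(len(ids) for ids in free.values()),
    }


# Trimmed copy of the example payload (only the fields used here).
payload = {
    "all": {"execution": {"R_lite": [
        {"rank": "0-7", "children": {"core": "0-3"}}]}},
    "down": None,
    "allocated": {"execution": {"R_lite": [
        {"rank": "0", "children": {"core": "0-3"}}]}},
}

summary = summarize(payload)
print(summary)
# {'total_cores': 32, 'allocated_cores': 4, 'free_cores': 28}
```

Because the three objects are combined only on the consumer side, the same logic should apply whether allocated comes from the scheduler or from the job manager.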

garlick, Mar 07 '24 20:03

For completeness, another idea discussed was to add a separate service that would collect resource information from the resource module and the job-manager. That would give consumers a single endpoint to query all resource status information, instead of having to combine multiple RPC responses as we do now.

However, moving just the sched.resource-status RPC out of the scheduler would be a nice step forward.

grondo, Mar 07 '24 21:03

Closed by #5796?

grondo, Mar 27 '24 16:03