
prototype of custom resource scheduling

Open tgross opened this issue 3 months ago • 0 comments

Platforms other than Linux, especially non-Unix platforms, may have a different view of resources that don't model well as resources.cpu or resources.memory. And users with unusual deployment environments have asked about the possibility of scheduling resources specific to those environments.

This PR is a prototype of the kind of interface we might want to be able to support. Nodes get a client.custom_resource block where they can define a resource the node will make available in its fingerprint. Resources are one of five types, designed to make it possible to model all our existing resources as custom resources:

  • ratio: The resource is used as a value relative to other tasks with the same resource. An example of this would be Linux cpu.weight as implemented by a systemd unit file.
  • capped-ratio: Like a ratio, except that the total quantity used by all tasks has a maximum value. An example of this would be Linux cpu.weight as currently implemented in Nomad, where the "cap" is derived from the total MHz of CPUs on the host.
  • countable: The resource has a fixed but fungible amount. An example of this would be memory allocation (ignoring NUMA), where there's a certain amount of memory available on the host and it's "used up" by allocations, but we don't care about identity of the individual blocks of memory.
  • dynamic: The resource has a fixed set of items with exclusive access, but the job doesn't care which ones it gets. An example of this would be Nomad's current dynamic port assignment or resources.cores.
  • static: The resource has a fixed set of items with exclusive access, and the job wants a specific one of those items. An example of this would be Nomad's current static port assignment.
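
To make the differences between the five types concrete, here is a minimal Go sketch of how admission might differ per type. It is not the prototype's code: the names Kind, Resource, Request, Claim, and ErrExhausted are all hypothetical, and the sketch assumes that capped-ratio and countable share the same admission check and differ only in how a driver would later enforce the share.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// Kind enumerates the five resource types. Hypothetical names, not the PR's.
type Kind int

const (
	Ratio Kind = iota
	CappedRatio
	Countable
	Dynamic
	Static
)

// ErrExhausted reports that the node cannot satisfy a request.
var ErrExhausted = errors.New("resource exhausted")

// Resource is a hypothetical node-side record of one custom resource.
type Resource struct {
	Kind Kind
	Cap  int                 // CappedRatio, Countable: total available on the node
	Used int                 // CappedRatio, Countable: amount already claimed
	Free map[string]struct{} // Dynamic, Static: items not yet claimed
}

// Request is a hypothetical task-side ask.
type Request struct {
	Quantity int      // weight, amount, or (for Dynamic) number of items
	Items    []string // Static only: the specific items wanted
}

// Claim admits req against r and returns any items granted.
func (r *Resource) Claim(req Request) ([]string, error) {
	if req.Quantity < 0 {
		return nil, errors.New("negative quantity")
	}
	switch r.Kind {
	case Ratio:
		// Purely relative weight, so admission never fails.
		return nil, nil
	case CappedRatio, Countable:
		// Assumption: both types are admitted against the same cap.
		if req.Quantity > r.Cap-r.Used {
			return nil, ErrExhausted
		}
		r.Used += req.Quantity
		return nil, nil
	case Dynamic:
		if len(r.Free) < req.Quantity {
			return nil, ErrExhausted
		}
		names := make([]string, 0, len(r.Free))
		for name := range r.Free {
			names = append(names, name)
		}
		sort.Strings(names) // deterministic choice, for the example only
		granted := names[:req.Quantity]
		for _, name := range granted {
			delete(r.Free, name)
		}
		return granted, nil
	case Static:
		// Check every requested item before claiming any of them.
		seen := make(map[string]struct{}, len(req.Items))
		for _, name := range req.Items {
			_, free := r.Free[name]
			_, dup := seen[name]
			if !free || dup {
				return nil, ErrExhausted
			}
			seen[name] = struct{}{}
		}
		for _, name := range req.Items {
			delete(r.Free, name)
		}
		return req.Items, nil
	}
	return nil, fmt.Errorf("unknown resource kind %d", r.Kind)
}

func main() {
	mem := &Resource{Kind: Countable, Cap: 1024}
	_, err := mem.Claim(Request{Quantity: 768})
	fmt.Println("first countable claim:", err)
	_, err = mem.Claim(Request{Quantity: 512})
	fmt.Println("second countable claim:", err)

	ports := &Resource{Kind: Dynamic, Free: map[string]struct{}{"8080": {}, "8081": {}}}
	got, _ := ports.Claim(Request{Quantity: 1})
	fmt.Println("dynamic claim granted:", got)
}
```

The split into two fields (a quantity for ratio, capped-ratio, and countable; a set of named items for dynamic and static) mirrors the descriptions above: the first three types are fungible, while the last two grant exclusive access to identifiable items.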

We can use this prototype to anchor discussions about how we might implement a set of custom resource scheduling features, but it's nowhere near production-ready. I've included enough plumbing to implement these five resource types, fingerprint them on clients via config files, and schedule them for allocations. What's not included, all of which we'd want to solve in any productionized version of this work:

  • Support for exposing the custom resource allocation to a task driver. We'd need to thread custom resources into the protobufs we send via go-plugin and then the user's task driver would need to consume that data.
  • Support for preemption
  • Support for quotas
  • Support for dynamically adding custom resources to a node

Ref: https://github.com/hashicorp/nomad/issues/1081

tgross • Nov 19 '25 19:11