Add outgoing network bandwidth resources option

Open RSWilli opened this issue 7 months ago • 2 comments

Proposal

Under the resources section of the job specification, it would be useful to have an option for declaring an upstream (and perhaps also a downstream) bandwidth requirement.

Use-cases

I am running jobs that continually stream an audio signal to different upstream servers. Each job consumes some CPU and RAM resources, which are already balanced out and are pretty minimal. The bandwidth of the machine is known, and the bandwidth each job requires is pretty much constant or has an upper bound that is known in advance.

It would be nice if Nomad kept track of the provisioned bandwidth of a node and only scheduled jobs on nodes that still have enough unreserved bandwidth. A job should be allowed to (temporarily) use more than its declared bandwidth, and Nomad shouldn't kill it when that happens (similar to CPU shares).
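To make the idea concrete, here is a minimal sketch of the accounting I have in mind. It is not Nomad code: the `Node` type, its fields, and the `Place` function are hypothetical names invented for illustration, and the total bandwidth is assumed to be a number the operator configures per node. A job's ask is only compared against what has already been booked, never against measured traffic, so a job that bursts above its ask is not penalized.

```go
package main

import "fmt"

// Node is a hypothetical view of a client node (not a Nomad type).
type Node struct {
	Name          string
	TotalMbits    int // outgoing bandwidth of the node, assumed to be configured by the operator
	ReservedMbits int // sum of the bandwidth booked by jobs already placed on the node
}

// Fits reports whether askMbits can still be booked on the node.
// Only bookings are compared; actual usage is never measured.
func (n Node) Fits(askMbits int) bool {
	return n.ReservedMbits+askMbits <= n.TotalMbits
}

// Place books askMbits on the first node with enough unreserved bandwidth
// and returns that node's name. It returns false if no node fits.
func Place(nodes []Node, askMbits int) (string, bool) {
	for i := range nodes {
		if nodes[i].Fits(askMbits) {
			nodes[i].ReservedMbits += askMbits
			return nodes[i].Name, true
		}
	}
	return "", false
}

func main() {
	nodes := []Node{
		{Name: "node-a", TotalMbits: 100, ReservedMbits: 95},
		{Name: "node-b", TotalMbits: 100, ReservedMbits: 40},
	}
	// node-a has only 5 Mbit/s unreserved, so the 10 Mbit/s job goes to node-b.
	name, ok := Place(nodes, 10)
	fmt.Println(name, ok)
}
```

A real scheduler would weigh this together with CPU and memory; the sketch only shows the booking check itself.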

If Nomad accounted for network bandwidth, it could prevent these streams from failing because other jobs are already using up the bandwidth.

Attempted Solutions

This is similar, but not identical, to the option that was deprecated in Nomad 0.12.0 (https://developer.hashicorp.com/nomad/docs/upgrade/upgrade-specific#nomad-0-12-0). I don't know whether incoming bandwidth would be useful to add, but I can see the use case for outgoing traffic.

RSWilli Nov 29 '23 16:11

Hi @RSWilli, and thanks for raising this issue, which seems like a very interesting idea. I am a little unfamiliar with how to determine the total available bandwidth on a machine. Do you have any pointers that would allow me to quickly assess this feature a little more? Specifically, Nomad would need a way to fingerprint this value and make it available to the servers as part of the node object, so that it can be used when performing scheduling calculations.

> nomad shouldn't kill it then (similar to CPU shares)

Nomad does not act as the killer in this scenario; it is the kernel that performs the terminations. It is not feasible to have Nomad clients continually monitor application resource utilisation, so any application termination would depend on kernel features. From Nomad's perspective, the bandwidth resource would be used in a booking fashion for scheduling purposes.

jrasell Dec 01 '23 07:12

Sorry for the (very) late reply.

> I am a little unfamiliar with how to understand the total available bandwidth on a machine

In my head it was as easy as "just configure the total bandwidth when starting the Nomad server", but I understand that this becomes more difficult when a server has multiple interfaces with different bandwidths.
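A rough sketch of why multiple interfaces complicate this, assuming each interface gets its own configured limit: a booking would have to fit on a single interface, so adding up the free bandwidth across interfaces would overstate what a node can take. The `NIC` type and the `Book` function are hypothetical names for illustration only, not anything Nomad provides.

```go
package main

import "fmt"

// NIC is a hypothetical per-interface record (not a Nomad type).
type NIC struct {
	Name          string
	TotalMbits    int // assumed to be configured by the operator for this interface
	ReservedMbits int // bandwidth already booked on this interface
}

// Book reserves askMbits on the interface with the most unreserved
// bandwidth that can hold the whole ask. A booking is never split across
// interfaces. It returns false if no single interface fits.
func Book(nics []NIC, askMbits int) (string, bool) {
	best := -1
	bestFree := 0
	for i, nic := range nics {
		free := nic.TotalMbits - nic.ReservedMbits
		if free < askMbits {
			continue
		}
		if best == -1 || free > bestFree {
			best, bestFree = i, free
		}
	}
	if best == -1 {
		return "", false
	}
	nics[best].ReservedMbits += askMbits
	return nics[best].Name, true
}

func main() {
	nics := []NIC{
		{Name: "eth0", TotalMbits: 1000, ReservedMbits: 900},
		{Name: "eth1", TotalMbits: 100, ReservedMbits: 0},
	}
	// Each interface has 100 Mbit/s unreserved, 200 Mbit/s in total,
	// yet a 150 Mbit/s ask fits on neither of them.
	_, ok := Book(nics, 150)
	fmt.Println("150 Mbit/s fits:", ok)

	// An 80 Mbit/s ask fits on a single interface.
	name, ok := Book(nics, 80)
	fmt.Println("80 Mbit/s fits:", ok, "on", name)
}
```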

> kernel which performs the terminations

The kernel features required to enforce these limits are even further out of my comfort zone...

RSWilli Apr 11 '24 11:04