
Why is machine loading rounded?

Open expipiplus1 opened this issue 8 years ago • 4 comments

https://github.com/NixOS/hydra/blob/master/src/hydra-queue-runner/dispatcher.cc#L131-L132

In the comparison for machine selection:

float ta = std::round(a.currentJobs / a.machine->speedFactor);
float tb = std::round(b.currentJobs / b.machine->speedFactor);

The rounding doesn't seem necessary, and it leads to incorrect results when the machines have a high speedFactor, because quite different job counts then round to the same value.

expipiplus1 avatar Jun 05 '17 13:06 expipiplus1

@domenkozar might have figured it out:

14:48 <domenkozar> jophish: yes, one side effect of the roundf is that same machine will be reused until speedfactor is reached
14:48 <domenkozar> so it might have a consequence of less S3 downloads

expipiplus1 avatar Jun 05 '17 14:06 expipiplus1

Yes, IIRC that was the reason.

(Looks like the C++ rewrite of build-remote got rid of the rounding BTW, which may be by accident.)

edolstra avatar Jun 06 '17 09:06 edolstra

Would it be more appropriate to divide by max jobs?

I've patched our local hydra to be a bit fairer in case anyone's interested :)

https://github.com/expipiplus1/hydra/commit/73e835b2aeea563994df8c4853c361752105f109

expipiplus1 avatar Jun 06 '17 09:06 expipiplus1

I have simply removed the std::round on my local Hydra and it has been working great for me for a couple of months now. The workload is distributed much better now.

That said, my build machines don't do many S3 downloads as I compile everything from source, so I don't have to worry about that.

someplaceguy avatar Oct 11 '23 17:10 someplaceguy

It is back to being floats!

Ericson2314 avatar May 23 '24 14:05 Ericson2314

> Would it be more appropriate to divide with max jobs?
>
> I've patched our local hydra to be a bit fairer in case anyone's interested :)
>
> expipiplus1@73e835b

This is interesting and perhaps should be pursued anyways.

Ericson2314 avatar May 23 '24 14:05 Ericson2314

> It is back to being floats!

I don't understand. Wasn't this issue about the rounding, which causes the load to be distributed unevenly across the remote machines?

Based on the current code:

https://github.com/NixOS/hydra/blob/b3e0d9a8b78d55e5fea394839524f5a24d694230/src/hydra-queue-runner/dispatcher.cc#L234-L235

... isn't rounding still being performed, or what am I missing?

someplaceguy avatar May 24 '24 11:05 someplaceguy