Antti Kervinen
Yes. I did not see any effect. There were still too many threads and CPU affinity errors. (Even if there were a workaround that drops the...
> Small nit, I'd suggest to use `MemoryPolicy` and `memoryPolicy` instead of `mempolicy`. It would be more readable IMHO. Thanks @kad, fixed. Definitely better.
@OlivierDehaene, @Narsil, @sywangyi, would you have time to check out this PR and related issue, please?
> > The CPU affinity implementation (since v2.3.0 until current HEAD, [4b8cda6](https://github.com/huggingface/text-generation-inference/commit/4b8cda684b45b799de01a65e3fe3422a34a621d3)) ignores already existing CPU pinning for the process. > > The indicated commit is HEAD from 4 days ago,...
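To illustrate what "respecting already existing CPU pinning" could mean in practice, here is a minimal Python sketch. It is not the text-generation-inference implementation; the helper names `usable_cpus` and `pin_within_existing_affinity` are hypothetical, and the only real APIs used are the Linux-only `os.sched_getaffinity` and `os.sched_setaffinity`. The idea is to intersect the CPUs a component wants with the CPUs the process is already allowed to use, rather than overwriting the inherited mask.

```python
import os


def usable_cpus(requested, allowed=None):
    """Return the requested CPUs that the process is already allowed to run on.

    `allowed` defaults to the current affinity mask of the calling process
    (Linux-only). Passing it explicitly makes the function easy to test.
    """
    if allowed is None:
        # pid 0 means "the calling process".
        allowed = os.sched_getaffinity(0)
    return sorted(set(requested) & set(allowed))


def pin_within_existing_affinity(requested):
    """Hypothetical helper: narrow the affinity, never widen it."""
    cpus = usable_cpus(requested)
    if not cpus:
        # Nothing requested is permitted: keep the inherited pinning untouched
        # instead of silently escaping it.
        raise ValueError("requested CPUs are outside the existing affinity mask")
    os.sched_setaffinity(0, cpus)
    return cpus
```

For example, a process started under `taskset` or a container cpuset that allows CPUs 2, 3 and 8 would end up pinned to CPUs 2 and 3 if it asked for CPUs 0 through 3.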
From a token throughput point of view, this issue _prevents_ gaining 2x throughput in a two-socket system, or 4x throughput in a four-socket system, both of which would be achieved...
Two nits, otherwise looks good.
Looking at the governor solution at a high level (first implementation and this patch), I'm starting to think that we are adding unnecessary complexity. And I'm slightly worried about it, as...
Marked ready-for-review after successful validation that this works in environments strictly behind proxies and without any proxies.
If a Vagrant VM is already running after previous tests, these changes speed up the test start-up time, measured from launching "run_tests.sh" until finishing "TASK [copy helm charts]": - without patches:...
@cyphar, when I was testing https://github.com/opencontainers/runtime-tools/pull/786, I noticed that leaving out "omitempty" from LinuxMemoryPolicy.Nodes causes runtime-tools validation `checkMandatoryUnit()` to report an error if this field contains an empty string. And...