config-linux: add support for rsvd hugetlb cgroup
The previous non-rsvd max/limit_in_bytes limit does not account for reserved huge page memory, so a process can reserve all of the huge page memory without being able to allocate it (because of the hugetlb cgroup page fault accounting restrictions).
In practice, a process can successfully mmap more huge page memory than the cgroup settings allow, but it gets a SIGBUS and crashes when it touches that memory. This hurts applications that mmap at startup: the mmap succeeds, but the program crashes once it starts using the memory. Postgres, for example, does this by default.
This patch updates and clarifies LinuxResources.HugepageLimits and
LinuxHugepageLimit by defaulting the configuration to the rsvd hugetlb
cgroup (when supported) and falling back to page fault accounting if
not supported.
Fixes https://github.com/opencontainers/runtime-spec/issues/1050
Signed-off-by: Kailun Qin [email protected]
@kailun-qin I'm confused, this patch only seems to include code comments and doc changes?
This change (together with the runtime implementation) should fix the real issue seen with some software, described in https://github.com/opencontainers/runtime-spec/issues/1050.
@tianon PTAL
Thanks @kailun-qin!
@kailun-qin @odinuge Do you have a PR for runc? (https://github.com/opencontainers/runc/pull/2360 seems closed)