libvma
Use MADV_HUGEPAGE as ALLOC_TYPE_HUGEPAGES fallback
Subject
Use MADV_HUGEPAGE as ALLOC_TYPE_HUGEPAGES fallback
Issue type
- [ ] Bug report
- [X] Feature request
Configuration:
- Product version: VMA_VERSION: 9.7.2-1
- OS: Alma 8.6
- OFED:
- Hardware:
Actual behavior:
Using huge pages in k8s has a lot of challenges:
- you need to pick an arbitrary number of huge pages (800?) even if you don't know how many huge page users will run on your system
- k8s doesn't allow over-allocating huge pages: if I have 3 pods and I give each of them 800 huge pages, my system needs 3*800 huge pages allocated before my pods start
- k8s uses the cgroups hugetlb controller to limit huge page usage; trying to use more huge pages than allocated gives you a SIGBUS (https://www.kernel.org/doc/html//v5.16/admin-guide/cgroup-v1/hugetlb.html)
- finally, enabling huge pages for VMA applications means other applications might try to use huge pages and get killed with SIGBUS (the k8s default limit_in_bytes is 0). An example application is pgsql (configurable, but still).
When using libvma (librivermax in my case), it would be nice for ALLOC_TYPE_HUGEPAGES to fall back to mmap + MADV_HUGEPAGE. Even if it has no strict guarantees, with 2 MB huge pages on servers with a lot of memory it's extremely likely you will get huge pages.
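To make the request concrete, here is a rough sketch of what such a fallback could look like. It is not libvma code: `alloc_thp`, `alloc_hugepage_or_thp` and `HUGE_PAGE_SIZE` are hypothetical names, and the 2 MB page size is an assumption. It also assumes transparent huge pages are set to `always` or `madvise` in `/sys/kernel/mm/transparent_hugepage/enabled`; otherwise the hint is simply ignored.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Assumption: 2 MB huge pages (the common x86_64 default). */
#define HUGE_PAGE_SIZE ((size_t)2 * 1024 * 1024)

/* Hypothetical helper: anonymous mapping aligned to HUGE_PAGE_SIZE with a
 * best-effort MADV_HUGEPAGE hint. Returns NULL on failure. The rounded-up
 * length actually mapped is stored in *mapped_len (needed for munmap). */
void *alloc_thp(size_t len, size_t *mapped_len)
{
    /* Round the length up to a multiple of the huge page size. */
    len = (len + HUGE_PAGE_SIZE - 1) & ~(HUGE_PAGE_SIZE - 1);

    /* Over-allocate by one huge page so an aligned start always fits. */
    size_t padded = len + HUGE_PAGE_SIZE;
    char *raw = mmap(NULL, padded, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED)
        return NULL;

    uintptr_t start = ((uintptr_t)raw + HUGE_PAGE_SIZE - 1)
                      & ~(uintptr_t)(HUGE_PAGE_SIZE - 1);
    char *aligned = (char *)start;

    /* Give back the unaligned head and the unused tail. */
    size_t head = (size_t)(aligned - raw);
    if (head > 0)
        munmap(raw, head);
    size_t tail = padded - head - len;
    if (tail > 0)
        munmap(aligned + len, tail);

    /* Only a hint: the kernel may still back the range with 4 KB pages,
     * so the return value is deliberately ignored. */
    (void)madvise(aligned, len, MADV_HUGEPAGE);

    *mapped_len = len;
    return aligned;
}

/* Hypothetical wrapper: try explicit hugetlb pages first, then fall back
 * to the THP path above. *used_hugetlb reports which path was taken. */
void *alloc_hugepage_or_thp(size_t len, size_t *mapped_len, int *used_hugetlb)
{
    size_t rounded = (len + HUGE_PAGE_SIZE - 1) & ~(HUGE_PAGE_SIZE - 1);
    void *p = mmap(NULL, rounded, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p != MAP_FAILED) {
        *mapped_len = rounded;
        *used_hugetlb = 1;
        return p;
    }
    *used_hugetlb = 0;
    return alloc_thp(len, mapped_len);
}
```

A caller would release the buffer with `munmap(ptr, mapped_len)` in both cases. The alignment step matters because the kernel can only use a huge page for a 2 MB-aligned, 2 MB-sized range inside the mapping.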
Expected behavior:
libvma uses MADV_HUGEPAGE where appropriate to improve performance when explicit huge pages are complicated to set up.