feat: add config option to enable transparent huge pages
Changes
Add a new machine-config field, enable_thp, that controls whether transparent huge pages (THP) are enabled for the microVM.
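For illustration, a machine-config request body that uses the new field might be built as in the sketch below. The helper name `machine_config_body` is hypothetical, and the boolean type and `False` default for `enable_thp` are assumptions rather than something this description specifies; `vcpu_count` and `mem_size_mib` are the existing machine-config fields.

```python
import json


def machine_config_body(vcpu_count, mem_size_mib, enable_thp=False):
    """Build a JSON body for PUT /machine-config (hypothetical helper)."""
    return json.dumps({
        "vcpu_count": vcpu_count,
        "mem_size_mib": mem_size_mib,
        # Field proposed by this PR; boolean type and default are assumed.
        "enable_thp": enable_thp,
    })


print(machine_config_body(2, 1024, enable_thp=True))
```

Leaving `enable_thp` out of the call keeps the current behaviour in this sketch, which is the assumed default.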
Reason
Currently, even for a VM that is not restored from a snapshot and not using any vhost-user devices, Firecracker does not attempt to enable transparent huge pages for it. This makes it necessary to reserve hugetlb pages on hypervisor host machines, making things hard on hosts that run mixed workloads.
License Acceptance
By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.
PR Checklist
- [ ] I have read and understand CONTRIBUTING.md.
- [ ] I have run `tools/devtool checkbuild --all` to verify that the PR passes build checks on all supported architectures.
- [ ] I have run `tools/devtool checkstyle` to verify that the PR passes the automated style checks.
- [ ] I have described what is done in these changes, why they are needed, and how they are solving the problem in a clear and encompassing way.
- [ ] I have updated any relevant documentation (both in code and in the docs) in the PR.
- [ ] I have mentioned all user-facing changes in `CHANGELOG.md`.
- [ ] If a specific issue led to this PR, this PR closes the issue.
- [ ] When making API changes, I have followed the Runbook for Firecracker API changes.
- [ ] I have tested all new and changed functionalities in unit tests and/or integration tests.
- [ ] I have linked an issue to every new `TODO`.
- [ ] This functionality cannot be added in `rust-vmm`.
Hi @losfair,
Thanks a lot for your contribution, and sorry for the late reply.
We are definitely interested in supporting THP. We attempted to include it when we added support for huge pages, but at the time we realized that it was incompatible with UFFD. Since UFFD is also the feature most commonly used with Firecracker, THP support would de facto be a dead feature. Would it be possible to do some more research on how we can add THP support while working with UFFD, maybe by using UFFDIO_MOVE or by making it work with UFFDIO_COPY?
- UFFDIO_COPY explicitly rejects THP pages. See https://github.com/torvalds/linux/blob/23cb64fb76257309e396ea4cec8396d4a1dbae68/mm/userfaultfd.c#L793-L802
- UFFDIO_MOVE works within the same process, but moving pages between processes is not allowed. See https://github.com/torvalds/linux/blob/23cb64fb76257309e396ea4cec8396d4a1dbae68/fs/userfaultfd.c#L1906-L1908
However, it's possible to make UFFD work with THP enabled for guest memory:
- Snapshot restore uses UFFDIO_COPY (base page granularity)
- After restore, THPs can form via khugepaged or explicit madvise(MADV_COLLAPSE)
- Tradeoff: THP benefits are delayed until pages are collapsed (not immediate after restore)
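The sequence above can be sketched roughly as follows. This is not Firecracker code: a private anonymous mapping stands in for guest memory, a page-by-page write stands in for UFFDIO_COPY filling base pages, and `populate_then_collapse` and `try_madvise` are hypothetical names. MADV_COLLAPSE needs Linux 6.1 or newer, so the sketch treats a refusal from the kernel as a normal outcome.

```python
import mmap

# Linux uapi values, used as fallbacks if the mmap module does not export them.
MADV_HUGEPAGE = getattr(mmap, "MADV_HUGEPAGE", 14)
MADV_COLLAPSE = getattr(mmap, "MADV_COLLAPSE", 25)  # Linux 6.1+

# 4 MiB always contains at least one 2 MiB-aligned range, whatever the base address.
REGION_SIZE = 4 * 1024 * 1024


def try_madvise(region, advice):
    """Apply the advice to the whole mapping; return False if the kernel refuses."""
    try:
        region.madvise(advice)
        return True
    except OSError:
        return False


def populate_then_collapse(size=REGION_SIZE):
    # Private anonymous mapping standing in for guest memory (assumption).
    region = mmap.mmap(-1, size,
                       flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
                       prot=mmap.PROT_READ | mmap.PROT_WRITE)
    # "Restore": fill one base page at a time, as a stand-in for UFFDIO_COPY.
    page = mmap.PAGESIZE
    for offset in range(0, size, page):
        region[offset:offset + page] = bytes([offset // page % 256]) * page
    # Mark the range as THP-eligible so khugepaged may collapse it later.
    thp_eligible = try_madvise(region, MADV_HUGEPAGE)
    # Optionally request a synchronous collapse instead of waiting for khugepaged.
    collapsed = try_madvise(region, MADV_COLLAPSE)
    return region, thp_eligible, collapsed
```

Whether the collapse succeeds depends on the host kernel and its THP configuration, so callers should treat both returned flags as best-effort hints; the page contents are unchanged either way.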
@xmarcalx would this gradual THP formation be acceptable, or do you need THPs immediately after restore?
Hi @piscisaureus, thank you for the suggestion, and apologies for the late reply.
We need to discuss our position on this internally. One immediate concern I have: if we register the UFFD to receive UFFD_EVENT_REMOVE events, and the desired handling is to zero pages on removal (i.e. UFFDIO_ZEROPAGE), I believe this will fail, as it uses the same mfill_atomic function that UFFDIO_COPY uses.
Regardless, we will have a chat to determine where we stand with your proposed implementation. We'll likely get back to you in early January, as a lot of people will be away for the holidays later this month. Cheers.