bzero pages
The kernel must bzero every page that it gives to userspace, to prevent leaking data from one process to another.
Unfortunately this implies a lot of overhead. We should find a way to be smart about it, and do it as rarely as possible.
First we need to decide between two options:
- All allocated pages are bzero'd, whether they are handed to userspace or kept for kernel internal use.
- Only userspace pages are bzero'd before use. This means that the kernel can re-use old dirty userspace pages for its own use, which could facilitate kernel exploits. However since the kernel is supposed to be so small, and all services reside in userspace, this might be acceptable.
Secondly, we need to find a strategy to keep track of dirty pages, and bzero them only once.
I imagine we can make use of the dirty flag in the page tables, but I don't really know when the best moment to do the cleaning would be. On allocation? On free?
Also: do we consider never-used pages as clean? Are they guaranteed to be all zeros on boot?
A lot of questions 😕
We should at least clean up on deallocation
Can probably be closed with https://github.com/sunriseos/SunriseOS/blob/master/shell/img/meme7.gif
Given the number of "reboot and use some state left in RAM" attacks on various systems, I think that you cannot assume that never-used pages are clean.
There's a trade-off between whether you want allocation to be fast, or freeing to be fast. If you zero on allocation:
- You don't have to clean never-used pages until they are actually used.
- On the other hand, this may mean that stuff sticks around for a long time in memory, even if inaccessible.
- Free (shouldn't) have to do anything extra.
If you zero on free:
- Never-used pages should be cleared at boot.
- Attacker controlled values go away as soon as the page is freed.
- As a side-effect, I'd think this accomplishes 'zero memory on shutdown' kinda automatically
- Allocation (shouldn't) have to do anything extra.
- All kernel pages are zeroed too, because there are no free dirty pages.
Upon thinking about this, I think I actually prefer zeroing on free due to the implications it has. Either a page is used and has valid(?) data, or is zero.
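To make the "either used or zero" invariant concrete, here is a minimal userspace sketch of a zero-on-free allocator. It is not the SunriseOS frame_allocator: the `FrameAllocator` type, its methods, and the heap-backed frames are all hypothetical stand-ins, and it sidesteps the question of how a real kernel gets the frame mapped before writing to it.

```rust
/// Size of a frame in bytes (4 KiB, the usual x86 page size).
const FRAME_SIZE: usize = 4096;

/// Hypothetical toy allocator. Frames are plain heap buffers so the
/// sketch runs in userspace; a real kernel would manage physical frames.
struct FrameAllocator {
    frames: Vec<[u8; FRAME_SIZE]>,
    free_list: Vec<usize>,
}

impl FrameAllocator {
    /// "Boot": frames start with leftover garbage (0xAA here) and are
    /// cleared once, since never-used RAM cannot be assumed to be zero.
    fn new(count: usize) -> Self {
        let mut frames = vec![[0xAAu8; FRAME_SIZE]; count];
        for frame in frames.iter_mut() {
            frame.fill(0);
        }
        FrameAllocator {
            frames,
            free_list: (0..count).rev().collect(),
        }
    }

    /// Allocation does no extra work: every free frame is already zero.
    fn alloc(&mut self) -> Option<usize> {
        self.free_list.pop()
    }

    /// Freeing pays the zeroing cost, so stale data disappears as soon
    /// as the frame is released. Double frees are not detected here.
    fn free(&mut self, frame: usize) {
        self.frames[frame].fill(0);
        self.free_list.push(frame);
    }

    /// Read-only view of a frame's contents.
    fn frame(&self, frame: usize) -> &[u8; FRAME_SIZE] {
        &self.frames[frame]
    }

    /// Mutable view of a frame's contents.
    fn frame_mut(&mut self, frame: usize) -> &mut [u8; FRAME_SIZE] {
        &mut self.frames[frame]
    }
}
```

With this shape, a frame taken from `alloc()` can go to userspace or to the kernel without any further check, which is the property argued for above. The cost moves entirely to boot and to `free()`.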
One unmentioned problem is that to bzero a page, it needs to be mapped. However the allocation of pages and their mapping is highly decoupled (frame_allocator vs. paging). This means we either have to:
- have the frame_allocator re-map a frame every time it needs to bzero it (either at allocation or at free). This is a lot of overhead, and clearly not ideal.
- map all of RAM somewhere in KernelLand, so that frame_allocator can bzero freed frames by accessing them directly. This is the preferred solution for 64-bit address spaces, but it is not applicable to 32-bit address spaces, since a 4GB address space isn't big enough to hold all of RAM on top of everything else.
- find a way to re-couple the frame_allocator and paging on this matter. If we end up going for the 'zero on free' solution, we should use the 'dirty' bit in the page tables to skip pages that haven't been modified.
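The dirty-bit idea from the last option could look roughly like the sketch below. It assumes x86 paging, where bit 6 of a page-table entry is the dirty flag set by the CPU on the first write through that mapping. The function names and the `&mut [u8; FRAME_SIZE]` stand-in for a mapped frame are hypothetical, not existing SunriseOS APIs.

```rust
/// Size of a frame in bytes (4 KiB).
const FRAME_SIZE: usize = 4096;

/// Dirty flag of an x86 page-table entry (bit 6).
const PTE_DIRTY: u64 = 1 << 6;

/// A frame only needs zeroing if it was written through this mapping.
/// Assumption: the frame was all zeros when it was mapped, which the
/// 'zero on free' policy would guarantee.
fn needs_zeroing(pte: u64) -> bool {
    pte & PTE_DIRTY != 0
}

/// Hypothetical hook called by paging when a page is unmapped and its
/// frame is about to be returned to the allocator.
/// Returns true if the frame had to be zeroed.
fn zero_on_unmap(pte: u64, frame: &mut [u8; FRAME_SIZE]) -> bool {
    if needs_zeroing(pte) {
        frame.fill(0);
        true
    } else {
        // Never written through this mapping: still zero, skip the work.
        false
    }
}
```

One caveat with this approach: the dirty bit is tracked per mapping, not per frame. If the same frame was also writable through another mapping (shared memory, or a kernel alias), a clear dirty bit in one entry does not prove the frame is still zero, so every mapping would have to be checked before skipping.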