Mmapper granularity: chunk or page?
Implementations of `Mmapper` keep a map from chunks to `MapState` values, so internally they work at the granularity of chunks.
But some methods of `Mmapper` take the number of pages as arguments. Concretely:

- `quarantine_address_range(start, pages, strategy, anno)`
- `ensure_mapped(start, pages, strategy, anno)`
- `protect(start, pages)`
I don't think `Mmapper` can maintain `MapState` at the granularity of pages, because that would be too fine-grained. To handle address spaces of up to 48 bits, we would need to switch to a different data structure, such as a segment tree, which is slower to query than a chunk-grained array.
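For a sense of scale, here is a rough sizing sketch. The 4 KiB page size and 4 MiB chunk size are assumptions for illustration only; the point is the ratio between the two granularities.

```rust
// Back-of-the-envelope sizing for a flat state map over a 48-bit address space.
// 4 KiB pages (LOG_BYTES_IN_PAGE = 12) and 4 MiB chunks (LOG_BYTES_IN_CHUNK = 22)
// are assumed here for illustration.
const LOG_ADDRESS_SPACE: usize = 48;
const LOG_BYTES_IN_PAGE: usize = 12;
const LOG_BYTES_IN_CHUNK: usize = 22;

fn main() {
    // Page-grained: 2^(48 - 12) = 2^36 ≈ 6.9e10 entries -- far too many for a flat array.
    let page_grained_entries = 1usize << (LOG_ADDRESS_SPACE - LOG_BYTES_IN_PAGE);
    // Chunk-grained: 2^(48 - 22) = 2^26 ≈ 6.7e7 entries -- manageable.
    let chunk_grained_entries = 1usize << (LOG_ADDRESS_SPACE - LOG_BYTES_IN_CHUNK);
    println!("page-grained: {page_grained_entries}, chunk-grained: {chunk_grained_entries}");
}
```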
If we want `Mmapper` to keep working at chunk granularity, we should change the public interface in order not to give its user a false impression that it may work at page granularity. We should remove all mentions of "page" and use one of the following strategies:
1. `quarantine_address_ranges(start: Address, bytes: usize, ...)`, while requiring both `start` and `bytes` to be chunk-aligned.
2. `quarantine_address_ranges(start_chunk_index: usize, num_chunks: usize, ...)`, where both `start_chunk_index` and `num_chunks` are counted in chunks from address 0. That is, `chunk_index = address >> LOG_BYTES_IN_CHUNK`.
3. `quarantine_address_ranges(chunk_range: ChunkRange, ...)`, where `chunk_range` is a dedicated data structure that refers to whole chunks.
I personally prefer method 3. Regardless of the internal representation of `ChunkRange` (whether it uses aligned addresses or chunk indices), there is no way the user can supply an unaligned range.
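A rough sketch of what such a type could look like. The names and the chunk size constant are illustrative assumptions, and plain `usize` stands in for MMTk's `Address` type:

```rust
use std::ops::Range;

/// Illustrative chunk size: 4 MiB chunks (an assumption for this sketch).
const LOG_BYTES_IN_CHUNK: usize = 22;

/// A range of whole chunks, stored as chunk indices counted from address 0.
/// Because only chunk indices are stored, an unaligned range cannot even be expressed.
#[derive(Clone, Copy, Debug)]
pub struct ChunkRange {
    start_chunk: usize,
    num_chunks: usize,
}

impl ChunkRange {
    /// Construct from a byte range, rejecting unaligned input up front.
    pub fn from_aligned_range(start: usize, bytes: usize) -> ChunkRange {
        let mask = (1usize << LOG_BYTES_IN_CHUNK) - 1;
        assert_eq!(start & mask, 0, "start is not chunk-aligned");
        assert_eq!(bytes & mask, 0, "size is not a whole number of chunks");
        ChunkRange {
            start_chunk: start >> LOG_BYTES_IN_CHUNK,
            num_chunks: bytes >> LOG_BYTES_IN_CHUNK,
        }
    }

    /// Iterate over the chunk indices covered by this range.
    pub fn chunk_indices(&self) -> Range<usize> {
        self.start_chunk..self.start_chunk + self.num_chunks
    }
}
```

A signature like `quarantine_address_ranges(chunk_range: ChunkRange, ...)` then makes the chunk granularity explicit in the type system.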
This might not be an issue -- a mmapper guarantees the required pages to be mmapped, but it internally mmaps at the granularity of chunks.

> we should change the public interface in order not to give its user a false impression that it may work at page granularity

The users should not assume how the memory is mmapped -- they just know the pages they require will be mapped.
> This might not be an issue -- a mmapper guarantees the required pages to be mmapped, but it internally mmaps at the granularity of chunks.
>
> > we should change the public interface in order not to give its user a false impression that it may work at page granularity
>
> The users should not assume how the memory is mmapped -- they just know the pages they require will be mapped.
This is not about "guarantee", and it is not about "assuming how memory is mmapped". It is about the granularity of control. Suppose the user wants one chunk to be mapped, and one page in that chunk to be protected. The user would make the following two calls:
```rust
mmapper.ensure_mapped(start, PAGES_IN_CHUNK); // One chunk.
mmapper.protect(start, 1);                    // One page.
```
As a result, the whole chunk will be protected. We may argue that `mmapper.protect(start, 1)` "guarantees" the page is protected, but it also has the side effect of making every other page in the chunk inaccessible. An interface like this is very misleading because it conveys the wrong message to the user.
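A minimal sketch of why this happens when state is tracked per chunk (this is not the actual `Mmapper` code; page and chunk sizes are assumed):

```rust
const LOG_BYTES_IN_PAGE: usize = 12; // assumed 4 KiB pages
const LOG_BYTES_IN_CHUNK: usize = 22; // assumed 4 MiB chunks

/// Protect `pages` pages starting at `start` (an address, here a plain usize).
fn protect(start: usize, pages: usize) {
    let end = start + (pages << LOG_BYTES_IN_PAGE);
    // The state map has one entry per chunk, so the request is rounded out
    // to whole chunks and mprotect is issued chunk by chunk.
    let first_chunk = start >> LOG_BYTES_IN_CHUNK;
    let last_chunk = (end - 1) >> LOG_BYTES_IN_CHUNK;
    for chunk in first_chunk..=last_chunk {
        let chunk_start = chunk << LOG_BYTES_IN_CHUNK;
        // Mark the chunk's MapState as Protected and mprotect the whole
        // chunk starting at chunk_start -- elided in this sketch.
        let _ = chunk_start;
    }
}
```

With `protect(start, 1)` and a chunk-aligned `start`, `first_chunk == last_chunk`, so the single iteration protects the entire chunk even though only one page was requested.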
> Suppose the user wants one chunk to be mapped, and one page in that chunk to be protected.
The user is `PageResource`, in which case it cares about pages, not chunks. Mmapping chunks is internal to the mmapper.
> As a result, the whole chunk will be protected. We may argue that `mmapper.protect(start, 1)` "guarantees" the page is protected
MMTk in JikesRVM specifically states that the granularity of `protect` is chunk, rather than page. Maybe we should add the same note as in https://github.com/JikesRVM/JikesRVM/blob/5072f19761115d987b6ee162f49a03522d36c697/MMTk/src/org/mmtk/utility/heap/layout/ByteMapMmapper.java#L183.
The map state transition in JikesRVM is

```
UNMAPPED ------> MAPPED <------> PROTECTED
```
The transition from UNMAPPED to MAPPED happens monotonically for most spaces as `PageResource` allocates more pages. Since MMTk assigns discontiguous memory to spaces at chunk granularity, this works fine: if a space owns one page, it owns the whole chunk. The same is true for contiguous spaces.
JikesRVM can protect the memory released by `CopySpace`. Since `CopySpace` only ever releases the whole space, the memory it releases must consist of whole chunks. So that works, too.
The map state transition in Rust MMTk is

```
Unmapped -------> Quarantined -------> Mapped
Unmapped -------> Mapped
```
Metadata goes through the Quarantined state, whereas data goes directly to Mapped. But it doesn't matter: as long as the state moves monotonically towards Mapped, it should be fine to work at chunk granularity.
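For reference, a sketch of the per-chunk state machine described above. The variant names follow the diagrams in this thread and are not necessarily the exact mmtk-core definitions:

```rust
/// Per-chunk map state as discussed in this thread (illustrative, not the
/// literal mmtk-core enum). `Protected` is included for the JikesRVM-style
/// `protect` discussed in this thread.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum MapState {
    Unmapped,
    Quarantined, // address range reserved but not yet made accessible (used for metadata)
    Mapped,
    Protected,
}

/// The transitions that keep chunk-grained tracking sound: everything moves
/// monotonically towards `Mapped`, except the reversible Mapped <-> Protected pair.
fn is_allowed_transition(from: MapState, to: MapState) -> bool {
    use MapState::*;
    matches!(
        (from, to),
        (Unmapped, Quarantined)
            | (Unmapped, Mapped)
            | (Quarantined, Mapped)
            | (Mapped, Protected)
            | (Protected, Mapped)
    )
}
```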
> MMTk in JikesRVM specifically states that the granularity of `protect` is chunk, rather than page. Maybe we should add the same note as in https://github.com/JikesRVM/JikesRVM/blob/5072f19761115d987b6ee162f49a03522d36c697/MMTk/src/org/mmtk/utility/heap/layout/ByteMapMmapper.java#L183.
If we want to use `protect` in the Rust MMTk, we should definitely add such a note. But I still prefer explicitly telling the users of `Mmapper` that all methods work at chunk granularity, or even enforcing it with the `ChunkRange` data type. I think JikesRVM is "lucky" because all of its use cases happen to use the chunk granularity in the right way. If MMTk keeps evolving, implicit assumptions will be harmful.
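If we do go the "note" route, it could be as simple as a doc comment on `protect`. The wording below is hypothetical and the type is a stand-in, mirroring the intent of the JikesRVM comment linked above:

```rust
/// Stand-in type, only to show where the note would live.
pub struct Mmapper;

impl Mmapper {
    /// Memory-protect the given pages so that accessing them faults.
    ///
    /// Note: protection is applied at chunk granularity. Every chunk that
    /// overlaps the requested page range becomes inaccessible, including
    /// pages in those chunks that were not part of the request.
    pub fn protect(&self, _start: usize, _pages: usize) {
        // Body elided in this sketch.
    }
}
```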