xous-core Documenting transmutation of MemoryMessages into data structures

This issue is an attempt to document the hairy process of sharing memory between processes to aid with security audits.

The core issue at stake is the passing of memory between processes. Sending memory from process A to process B is fundamentally (I believe) an unsafe exercise. It's similar to serializing data into network packets or sectors for disk formats -- you're stripping some metadata from a data structure on the sending side, serializing it into what the hardware knows about (mere bytes), and popping it out on the other side as bytes and then trying to wrap a "safe" bow around the whole package again before moving on.

This process needs a good hard look, though, to make sure there are no security exploits.

This is the trajectory that data takes when being shared between two processes:

A structure, with all the wonderful runtime checked bounds and metadata of Rust, exists in a program.
This structure is turned into a xous_ipc::Buffer using the into_buf() call. This does two important things: first, it allocates a new, page-aligned, page-sized chunk of RAM; second, it serializes the data structure into that RAM using rkyv.
The Buffer is then shared using a lend, send, or lend_mut call. This call packs the description of the Buffer into a MemoryMessage, which is the OS-native descriptor of some memory to be shared. As of Xous 0.9, by definition, the valid region is actually the entire page, and not just the section of the data that was copied into the page. More on this later.
This is then passed into a syscall which is dispatched in the kernel using a handler like this one for the lend_mut, which formats the incoming data for the SystemServices object to handle (mostly copying fields without checking anything).
The pointy end of the stick is the lend_memory implementation inside SystemServices. This will check that the data to be lent is page-aligned and occupies a full page. It then proceeds to modify the page mappings so that the pages no longer exist in the sender's process space and are teleported into the receiver's process space.
This returns back into the syscall handler, which then determines the process and thread to switch to, and will ultimately [queue the message](https://github.com/betrusted-io/xous-core/blob/f62eca9b8819ca6a0de2da8045dd81a6a8eb7e42/kernel/src/syscall.rs#L256) and schedule a quantum of time to the process so that it may handle the incoming message.
The receiving process is now woken up, and its handler determines that a MemoryMessage has arrived, and it does an unsafe cast of the MemoryMessage into a Buffer using from_memory_message, which effectively wraps the mapped pages in a big [u8] slice that provides some upper-bounds checking (the message is not yet turned into a usable data structure, but the message itself is at least contained to within the slice that encodes the extent of the pages of the mapped memory).
The Buffer is transformed into an actual data structure using to_original, which again relies on the rkyv mechanism to deserialize the [u8] slice into a bounds-checked and metadata annotated Rust structure. Note that the to_original call will copy the data out of the borrowed page and into local memory space; as_flat will simply return a pointer into the original mapped data.
Data meant to be returned to the caller is placed back into the Buffer using the replace call, which re-serializes a structure using the rkyv mechanism. Note that there is an unsafe action inside this as well.
Finally, the syscall that remaps borrowed memory (or unmaps sent memory) is taken care of by a Drop implementation on a MemoryMessage. Thus, when the message goes out of scope, the data is automatically mapped back out of the receiver's memory space (or freed), without any need for an explicit return call. This is in contrast to scalar messages (e.g. those passed in CPU registers), where an explicit return call is required.
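
The ownership hand-off in the steps above can be modeled in plain Rust. This is a toy sketch and not Xous code: Slot, ToyMessage and lend are hypothetical names, and an Rc<RefCell<...>> stands in for the kernel's page mappings. It only illustrates that the page leaves the lender while it is lent and comes back when the message is dropped.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Toy stand-in for the lender's address space: holds the page while it is
/// mapped there, and None while the page is lent out. (Hypothetical type.)
pub type Slot = Rc<RefCell<Option<Vec<u8>>>>;

/// Toy stand-in for a received memory message; not the real Xous type.
pub struct ToyMessage {
    page: Option<Vec<u8>>,
    lender: Slot,
}

/// Move the page out of the lender's slot, mimicking the remap done on a lend.
/// Returns None if the lender has no page mapped.
pub fn lend(lender: &Slot) -> Option<ToyMessage> {
    let page = lender.borrow_mut().take()?;
    Some(ToyMessage { page: Some(page), lender: Rc::clone(lender) })
}

impl ToyMessage {
    /// The receiver's view of the lent page as a bounds-checked byte slice.
    pub fn bytes_mut(&mut self) -> &mut [u8] {
        self.page.as_deref_mut().expect("page is present until drop")
    }
}

impl Drop for ToyMessage {
    fn drop(&mut self) {
        // Going out of scope hands the page back; no explicit return call.
        *self.lender.borrow_mut() = self.page.take();
    }
}
```

While a ToyMessage is alive, the lender's slot is empty; once it is dropped, the slot holds the page again, including any bytes the receiver wrote.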

Note that if a data structure is larger than one page, then the next full page is allocated for the memory being shared. So if a data structure was 4097 bytes in size, 8192 bytes would be allocated and shared.
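
The rounding described above is ordinary round-up-to-a-multiple arithmetic. A minimal sketch, assuming a 4096-byte page (the function name is hypothetical and not part of the Xous API):

```rust
/// Page size assumed for this sketch (4 KiB).
pub const PAGE_SIZE: usize = 4096;

/// Round `len` up to the next multiple of PAGE_SIZE.
pub fn round_up_to_page(len: usize) -> usize {
    // PAGE_SIZE is a power of two, so masking off the low bits rounds down
    // after we have added PAGE_SIZE - 1.
    (len + PAGE_SIZE - 1) & !(PAGE_SIZE - 1)
}
```

With this, round_up_to_page(4097) is 8192, while round_up_to_page(4096) stays at 4096.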

Here are some observations about the process:

There is no essential enforcement that the type of data sent is the same as the type of data received. The type of memory being sent is defined by the memory that's transformed into a Buffer, and the type of memory inferred from a MemoryMessage is simply the type passed to the to_original or as_flat call on the Buffer. No error is thrown when those types are mismatched.
The above property is sometimes used as a "feature" for returning data between processes. For example, a caller could send a String to a callee, and the callee can return an integer in the same buffer. It is simply up to the two to agree upon that. The mapping process will yield nonsense results and not fail so long as the size of the structures being mapped does not exceed the bounds of the space allocated to the Buffer.
It should be the case that a smaller structure being passed to a callee can be replaced with a larger structure on the return, so long as the size does not exceed that of the nearest page boundary.
Newly allocated Buffers are zeroed out by the kernel. The entire page is zeroed out, not just the requested region, because of the check inside lend_memory that requires shared memory to be multiples of a full page even if the target data is smaller than that.
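
The lack of type enforcement can be illustrated without any Xous code. In the sketch below, ToyBuffer is a hypothetical stand-in for a page-sized buffer, and a hand-rolled little-endian encoding stands in for rkyv; the point is only that the bytes carry no record of the type that produced them.

```rust
const PAGE_SIZE: usize = 4096;

/// Hypothetical stand-in for a page-sized shared buffer (not xous_ipc::Buffer).
pub struct ToyBuffer {
    page: [u8; PAGE_SIZE],
}

impl ToyBuffer {
    /// "Sender" side: serialize a pair of u32 values at the start of the page.
    pub fn from_pair(a: u32, b: u32) -> Self {
        let mut page = [0u8; PAGE_SIZE];
        page[0..4].copy_from_slice(&a.to_le_bytes());
        page[4..8].copy_from_slice(&b.to_le_bytes());
        ToyBuffer { page }
    }

    /// "Receiver" side, agreeing with the sender about the type.
    pub fn as_pair(&self) -> (u32, u32) {
        let mut a = [0u8; 4];
        let mut b = [0u8; 4];
        a.copy_from_slice(&self.page[0..4]);
        b.copy_from_slice(&self.page[4..8]);
        (u32::from_le_bytes(a), u32::from_le_bytes(b))
    }

    /// "Receiver" side, disagreeing about the type: the same bytes decode
    /// as a u64 with no error, because nothing in the page records the type.
    pub fn as_u64(&self) -> u64 {
        let mut raw = [0u8; 8];
        raw.copy_from_slice(&self.page[0..8]);
        u64::from_le_bytes(raw)
    }
}
```

Here ToyBuffer::from_pair(1, 2).as_u64() quietly yields 0x0000_0002_0000_0001 rather than failing.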

There are several scenarios of concern with respect to the security properties of this mechanism.

(1) An attacker may try to peek at data in the target's memory space by manipulating the length fields of a MemoryMessage.
(2) An attacker may try to disclose data to another process without detection by hiding it in the unused space within a mapped page.

I believe that scenario (1) is effectively defeated by the "upper bounds" check placed by the unsafe transformation of a MemoryMessage into a u8-slice. What the attacker wants to do is to trick the target with a length field going into the unsafe transformation that goes beyond the length of memory mapped into the target's memory space, and then somehow trigger a copy of data within that slice into the region that will be mapped back for the exfiltration.

However, the lend_memory implementation in SystemServices checks that (1) the data being mapped into the target is always a full page in size, (2) it is always page aligned, and (3) most importantly, it sets the target's length to match the quantum of memory that has been mapped. Combined with the zeroing of memory during the map, this means that an attacker who tries to extend the length of the mapped data by tampering with the MemoryRegion data structure before handing it to the kernel will simply end up mapping more of its own memory space into the target's memory space. The target should have no opportunity to overstep those bounds, because the length has been set by the kernel, so the unsafe creation of the slice from the address and length is effectively checked, to a loose bound, by the lend_memory service.
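
The three checks can be sketched as a single validation function. This is an illustration only and not the kernel's lend_memory code: check_lend and LendError are hypothetical names, and a 4096-byte page is assumed.

```rust
pub const PAGE_SIZE: usize = 4096;

#[derive(Debug, PartialEq)]
pub enum LendError {
    /// The start address is not on a page boundary.
    Misaligned,
    /// The length is zero or not a whole number of pages.
    BadLength,
}

/// Validate a lend request and return the length the receiver will see.
/// The receiver's length comes from what was actually mapped, never from
/// a separate field the sender could have tampered with.
pub fn check_lend(addr: usize, len: usize) -> Result<usize, LendError> {
    if addr % PAGE_SIZE != 0 {
        return Err(LendError::Misaligned);
    }
    if len == 0 || len % PAGE_SIZE != 0 {
        return Err(LendError::BadLength);
    }
    Ok(len)
}
```

A sender that inflates the length can at best get more of its own whole pages mapped into the receiver; a length that does not describe whole pages is rejected outright.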

Scenario (2) is a valid concern. The data being returned to the calling process is not zeroed out. Because the "valid" region is marked as a region that is typically much larger than that of the memory at play, an attacker who controls one process can write data into a page beyond the expected region and pass it off to another colluding process. It does require either a coding error in a target process or control of both processes to effectively exfiltrate data, so it is not so powerful in and of itself; it is more a primitive that could be very useful in combination with other primitives to create a more powerful exploit.

Thus, probably the memory sharing mechanism could be improved with the following changes:

Tighter bounds on the data being shared
On the re-map of memory to the original lender's space, the kernel should zero out the nominally unused data in the shared page.
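
As a rough illustration of the second proposal, the scrub could look like the function below. It is a sketch under the assumption that the offset and valid fields describe the only bytes in use, which is not guaranteed for every caller; scrub_outside is a hypothetical name and nothing like it is confirmed to exist in the kernel.

```rust
/// Zero every byte of `page` outside `offset..offset + valid`.
/// Returns false (and leaves the page untouched) if the region does not
/// fit inside the page.
pub fn scrub_outside(page: &mut [u8], offset: usize, valid: usize) -> bool {
    let end = match offset.checked_add(valid) {
        Some(end) if end <= page.len() => end,
        _ => return false,
    };
    page[..offset].fill(0); // bytes before the used region
    page[end..].fill(0); // bytes after the used region
    true
}
```

Any data hidden before or after the declared region would be erased on the way back to the lender, at the cost of also destroying legitimate data that lives outside that region.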

It's not clear if and how any sort of type enforcement can even be done in terms of ensuring that data passed by one process is forced to be treated as the sender's type. In the end, a malicious receiver can always decompose a structure with an unsafe operation into a u8-slice and recompose it into another structure, so the enforcement wouldn't have any essential security advantage. One could argue that it'd be a nice feature for cooperating processes to agree on the type of the data being sent to catch coding errors. However, I'm not sure how that would even be done, or if it would be worth the complexity of execution.

Dec 15 '21 16:12 bunnie

@xobs when you get a chance can you please read the above section and let me know if I've made any mistakes in understanding how the process works? I'd also love to hear your thoughts on whether tighter bounds on memory are possible. I tried mucking with the valid specifier in the Buffer type, and things sort of blew up, and I wasn't quite sure how things went wrong.

Also maybe once the issue has been reviewed a bit it could be something to copy and paste into the Xous Book, if you feel it's noteworthy.

Dec 15 '21 16:12 bunnie

It is noteworthy, and is a fine description of the higher-level view of how we shove Messages between processes.

Note that Buffer is an implementation detail, and so many of the steps aren't necessarily used in all contexts.

For example, when printing characters to the screen when using println!() in libstd, the function writes into a 4kb local buffer that it repeatedly lends: https://github.com/betrusted-io/rust/blob/482ee3fe6ba67680e58a52d254e15740cd81ca3b/library/std/src/sys/xous/stdio.rs#L56-L64

In this case, the valid field is used to indicate how much data is in the buffer, and the offset field is unused.

Scenario (1) is a way to leak memory of one process into another. If you've managed to gain execute permission in a process, you can overwrite the length field to write additional pages. This is a delicate process because the number of valid lengths is finite: It must be page-aligned and must be valid contiguous memory. Note that this assumes you have an execute primitive already.

Scenario (2) is the more likely one, but I'm not sure how that should be handled. It's entirely possible for you to pass a full struct and only give advisory information on the interesting segment of the memory. For example, you may want to draw a primitive on a bitmap, so you give the region of interest in the valid and offset fields. If you were to zero out memory outside of that region, you would invalidate the rest of the image.

Another example is operating on a stream of tagged data such as the bootloader args. To operate on one specific tag you could pass the offset of the tag you're interested in, and specify the tag's length using the valid field. In this case, the Server would read the requested data and should not zero out the rest of the structure.

The valid and offset fields were meant to be advisory and untrusted, but perhaps we should get stronger guarantees on what they do.
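
Because the fields are advisory and untrusted, a receiver that wants to honor them has to bounds-check them against the mapped memory itself. A minimal sketch of that check (region_of_interest is a hypothetical helper, not an existing Xous function):

```rust
/// Return the sub-slice `offset..offset + valid` of the mapped page, or None
/// if the untrusted fields overflow or point outside the mapped memory.
pub fn region_of_interest(page: &[u8], offset: usize, valid: usize) -> Option<&[u8]> {
    // checked_add guards against wrap-around from attacker-chosen values.
    let end = offset.checked_add(valid)?;
    // slice::get returns None instead of panicking when out of range.
    page.get(offset..end)
}
```

The rest of the page is left untouched, which matches the bitmap and tagged-data use cases above.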

Dec 15 '21 16:12 xobs
