IPC issues
- Right now the IPC `allow()` call can succeed, and a notify callback can be triggered, even if the `expose_to()` call fails. This may be fixed in the 2.0 upgrades.
- `client_notify()` can be used to determine how many other processes are on the board. By iterating through service IDs, the error return values leak which apps are present, even if those apps never set up any IPC services.
- Calling `client_notify()` on a different process causes the callee's process to allocate the grant region, consuming application memory that app never intended to spend.
- Currently, scheduling a notify callback may fail for a variety of reasons (the callback queue is full, an allocation fails, the calling process has died, etc.), and neither the caller nor the callee is made aware of the failure (see the TODO in kernel/src/sched.rs).
- Buffers whose length is not a power of two will fail silently on ARM.
- It seems to be the case that App A faulting does not reset the MPU configuration of app B, even if app A shared a buffer with app B. (I haven't verified this experimentally, just tried to read through the fault code -- Hudson)
If we re-implement IPC to use message passing rather than shared memory, we can revisit the MPU implementation (re: #1532).
I was playing with the IPC API and noticed that applications fault due to memory protection if the shared buffers are not a multiple of 64 bytes in size. The buffer's memory region does not seem to be mapped.
I cannot figure out the reason for this; all of the API calls return success.
If this is a requirement, maybe the kernel should check for it and refuse non-conforming buffers?
While debugging tock/tock#2906 I noticed a more intricate issue involving app load order: what appears to be a racy relationship between the IPC service registering its callback and the client notifying the service. This is in the context of a failing rot13_{service,client} CI test. While debugging, I changed the apps' order in the TBF list and inserted printf statements. With `rot13_service` and `rot13_client` installed in this order, the normal (good) case seems to be:
1. `rot13_client` performs initialization and all of its IPC calls, including `ipc_notify_service`, in one go, which enqueues a task in the target process' queue.
2. `rot13_service` performs initialization and the call to `ipc_register_service_callback`, which registers its callback.
3. The kernel processes the task enqueued in (1), which calls into `IPC::schedule_upcall` and ultimately schedules the proper upcall on the `rot13_service` process.
4. The IPC service callback scheduled in (3) is invoked by the kernel scheduler.
Now, if the kernel were to execute (3) before (2), the IPC task is converted into a callback (`FunctionCall`) before the service has had a chance to subscribe, and this function call is dropped because it is a null callback. Even if it weren't scheduled, this pending upcall would be removed from the service's queue as soon as the service subscribes. One can trigger this behavior through a combination of load order, app behavior, and scheduler behavior, but it is most obvious when inserting a simple printf statement at the beginning of the service's main. With the printf in place, notifying the service silently fails if the client is started first, whereas if the service is started first it will likely succeed, depending on scheduler behavior. Here is an example system call trace of this happening (the service prints "Rot13 service!"):
System call trace:
- Process 0: rot13_service
- Process 1: rot13_client
Verilated LiteX+VexRiscv: initialization complete, entering main loop.
[1] function_call @0x88088(0x88060, 0x40010000, 0x173c, 0x40010000)
[1] memop(0, 0x40010a40) = Success
[1] memop(10, 0x40010800) = Success
[1] memop(11, 0x40010a40) = Success
[1] read-only allow(0x10000, 0, @0x8ae5c, 0x19) = AllowReadOnlySuccess(0x0, 0)
[1] cmd(0x10000, 1, 0x0, 0x0) = SuccessU32(0)
[1] read-only allow(0x10000, 0, @0x0, 0x0) = AllowReadOnlySuccess(0x8ae5c, 25)
[1] remove_pending_upcalls[0x10000:0] = 0 upcall(s) removed
[1] subscribe(0x10000, 0, @0x88108, 0x40010a00) = SubscribeSuccess(0x0, 0)
[1] read-write allow(0x10000, 0, @0x40010a00, 0x40) = AllowReadWriteSuccess(0x0, 0)
[1] cmd(0x10000, 2, 0x0, 0x0) = Success
[1] yield. which: 1
[0] function_call @0x80088(0x80060, 0x40008000, 0x16dc, 0x40008000)
[0] memop(0, 0x400089cc) = Success
[0] memop(10, 0x40008800) = Success
[0] memop(11, 0x400089cc) = Success
[0] memop(1, 0x0) = SuccessU32(1073777100)
[0] memop(1, 0x1b4) = SuccessU32(1073777100)
[0] memop(1, 0x408) = SuccessU32(1073777536)
[0] memop(1, 0x18) = SuccessU32(1073778568)
[0] memop(1, 0x18) = SuccessU32(1073778592)
[0] read-only allow(0x1, 1, @0x40008fa8, 0xf) = AllowReadOnlySuccess(0x0, 0)
[0] remove_pending_upcalls[0x1:1] = 0 upcall(s) removed
[0] subscribe(0x1, 1, @0x81d14, 0x0) = SubscribeSuccess(0x0, 0)
[0] cmd(0x1, 1, 0xf, 0x0) = Success
Rot13 service!
[0] schedule[0x1:1] @0x81d14(0xf, 0x0, 0x0, 0x0) = Ok(())
[0] yield. which: 1
[0] schedule[0x10000:0] @0x0(0x1, 0x40, 0x40010a00, 0x0) = Ok(())
[0] function_call @0x81d14(0xf, 0x0, 0x0, 0x0)
[0] read-only allow(0x10000, 0, @0x81e08, 0x19) = AllowReadOnlySuccess(0x0, 0)
[0] cmd(0x10000, 1, 0x0, 0x0) = SuccessU32(0)
[0] read-only allow(0x10000, 0, @0x0, 0x0) = AllowReadOnlySuccess(0x81e08, 25)
[0] remove_pending_upcalls[0x10000:0] = 0 upcall(s) removed
[0] subscribe(0x10000, 0, @0x80108, 0x0) = SubscribeSuccess(0x0, 0)
[0] yield. which: 1
We currently have no way to tell whether notifying an IPC service (i.e., scheduling a callback in the service's process) actually worked. While the source code used to suggest that we might have a way of identifying a null callback (`EOFF`), `Process::enqueue_task` does not actually tell us whether the enqueued IPC task will result in a callback being scheduled or not. This usually makes sense, and is a logical consequence of an architecture in which a null callback is treated exactly the same as a proper "do nothing" callback, but here it is not good.
To add something to point (6) of the list of current IPC issues: restarting an application resets `app_break`, `kernel_memory_break`, and `allow_high_water_mark`. The result is that an application (A) can:
- Make a very large `malloc()`, causing a `brk` system call to claim all remaining grant/heap memory for itself.
- Allow this buffer to another process (call it B).
- Restart itself in place (e.g., by faulting), which resets the watermarks.
- Make syscalls to capsules, causing grant allocations that exceed the total allocated in its previous life.
This will result in application B having direct access to kernel data structures. Spicy. This can be leveraged even from safe userspace Rust.