l4v
l4v copied to clipboard
MCS replies from kernel need a thorough review
The PR seL4/seL4#243 fixes an issue in PageGetAddr
and SchedContextYield
+ SchedContextConsumed
. The underlying problem in that issue is that in the (already verified) PageGetAddr
we do not use the spec convention for passing back a reply from the user to the kernel.
That reply-from-kernel convention in the spec (and Haskell) is that the the perform*
functions return a list, which contains the reply, and handleInvocation
then sets message info etc correspondingly and also handles the thread state correctly.
Whoever wrote PageGetAddr
in C didn't know about that protocol and got the thread state update wrong. Whoever did the spec+proof update didn't know the protocol either and modelled explicit setting of message registers and message info (IPC buffer etc). In X64 you can see an attempt to get the thread state, presumably to later set it, which then probably didn't work in CRefine. In any case, we end up with quite a bit of effort in the spec to get to the result that both spec and C are wrong (correctly modelling each other). I'm working on fixing this for PageGetAddr
, which is reasonably small and should also make the proofs a bit nicer.
Why does this affect MCS: in looking at the fix for SchedContextYield
+ SchedContextConsumed
I realised that the protocol for replies to the user is completely broken/absent in most new MCS calls. This is pretty bad. It likely means that we have bugs for correct thread state setting, which the proofs will not find, because there is no property that enforces this, only that protocol. As a user, you only notice these if you are actually trying to read the reply from the kernel, which apparently does not happen that often (nobody noticed it until that PR above).
So. I think what we need to do is look at this fairly deeply in both C and all spec levels and refactor the code to actually follow the reply-from-kernel protocol. This is going to be a bit hairy, because MCS sets message registers and IPC buffer content often fairly deep in the function call graph. For instance in setConsumed
in schedContextCompleteYieldTo
. Basically, any time an IPC buffer pointer is passed around in a call that is not an IPC is a red flag for this protocol, and there are quite a few of them. Another complication is that a lot of this reply message setting is conditional, i.e. mostly that list that the top-level invocation should return is empty, but sometimes it is not.
I don't think we can leave it as is. I also don't think we can easily move the generation of the messages to later points higher in the call graph. Not entirely sure what a good solution is. Passing the message (which is usually one word) back up the call graph is fine in the spec, but could become awkward and unidiomatic in C, where it does make more sense to just write to the buffer directly. But this brings it out of sync with the spec if the spec follows the protocol (which in turn is the only thing that guards against the thread state bugs we see in the PR).
As a user, you only notice these if you are actually trying to read the reply from the kernel, which apparently does not happen that often (nobody noticed it until that PR above).
The libsel4 invocation stubs do not check the message info register at all, they just assume everything worked, so I'm not surprised it hasn't been noticed before.
The libsel4 invocation stubs do not check the message info register at all, they just assume everything worked, so I'm not surprised it hasn't been noticed before.
I do remember a similar such case a few years ago and was wondering why it took so long to notice it. That explains it.