copilot Concurrency considerations in generated code

The generated C code takes some care to copy the values of all the "external" streams from shared global variables to private globals before beginning the remainder of its work. This is intended as a defense against concurrent or interrupt-handling code modifying the variables. According to the Copilot 3 manual:

Copying of the external variables is necessary to make the code reentrant: it can continue
its original execution after being interrupted. If this was not done, the execution of a monitor
might be interrupted and resumed after variables used by the monitor have been updated.
This can cause the same conceptual value to be different within one execution step, breaking
the assumption of causality, which is disastrous from the point of view of reliability.

This copying step is necessary, but not sufficient. At a minimum, the external global variables should be marked volatile, which indicates to the compiler that loads and stores involving these variables must not be hoisted, omitted or delayed. Probably these initial copies should also (optionally) be wrapped in a critical section to allow operating environments to mask interrupts or take a mutex while the copies are occurring.

I suggest we add an option to CSettings, which, when set, will wrap this initial copying process in functions such as step_start_critical_section() and step_end_critical_section() which are intended to be implemented by the environment to take whatever steps might necessary to ensure reliable operation.

Sep 21 '21 18:09 robdockins

@robdockins Frank is going to take a look at this. It may not make it into the very next release (3.6, to be frozen in 1 week), but I'm guessing we'll have dealt with this for good by the one release after the next one (3.7).

Oct 01 '21 00:10 ivanperez-keera

@fdedden Frank, were you ever able to look into this?

I wonder if the compiler is adding any optimizations when volatile is not present. I mean, the variable is copied before it is used. Unless it inlines the copy such that it happens inside a call to the verification function that will actually use the value, I don't see what else could happen.

It's not an unreasonable requirement of Copilot's API that all input variables should remain untouched until the call to step() finishes. Under that assumption (which is not yet there), I don't see how volatile could change anything.

It would be awesome to see an example of code where not having volatile makes a difference.

Jan 13 '22 13:01 ivanperez-keera

For my part, I'm more worried about the external interface code that links against the generated code, as opposed to the generated code itself. However, on low optimization in particular, I'd be surprised to see the compiler do anything untoward given that the globals are declared external, so perhaps the volatile keyword isn't necessary after all. I'd have to read the C standard pretty carefully to figure out if it makes a difference in this case.

On the other hand, I think the additional stubs for mutual exclusion are pretty important for systems where it matters.

Jan 13 '22 19:01 robdockins

I've been thinking about this a bit more.

It is clear to me that, if the compiler interleaves checks of monitors with copying inputs, then the meaning of the expressions monitored could change (handlers could change them).

So I agree that addressing this might be a good idea, even if we have yet to find an actual example that breaks this.

I'm not sure if having Copilot call functions like step_start_critical_section() is the way to go. After all, Copilot is only executed by explicit demand from a caller, so the point where step_start_critical_section() would be called would be immediately after the user executes step(). Assuming that calls to handlers and copying of data is not interleaved by the compiler, currently, a user could even do:

spec = do
  trigger "step_end_critical_section" true []
  <...other triggers...>

That could call the function step_end_critical_section before any monitors are executed but after any copying. (*Technically, in Copilot triggers are executed in inverse order and that should change IMO, but the point remains.)

So maybe volatile is the only thing we "need" after all.

To make an approach like the one in that code snippet not a hack, the order of execution of monitors should be properly documented (it should be mentioned in Copilot's operational semantics), the way to produce this effect should also be documented (tutorial?), and might even be abstracted away.

Apr 15 '22 22:04 ivanperez-keera

Agreed that the calling context could ensure that mutual exclusion is enabled before calling step. If the order of the triggers was reliable and documented, I also agree that a trigger function could be used to end the critical section. Overall, I think the resulting system would be more fragile than if there were dedicated calls, but it could indeed work.

In particular, I would argue that an explicit end_critical_section call is better than using a special trigger function to achieve the same effect from an aesthetic and separation-of-concerns point of view. I also think it's also better from a verification point of view. If we wanted to verify that the generated C code only accesses the volatile globals during the critical section, then we have to communicate to the verifier what the extent of that critical section is; using a special trigger function for the purpose complicates the story somewhat.

A thought just occurred to me. Another option would be to split the step function into two pieces; one that just does the copying step, and another that does the state updates and calls trigger functions. The the calling environment could ensure mutual exclusion however it likes during the first "prestep" call. I don't know offhand if this has any particular advantages.

Apr 15 '22 23:04 robdockins

[...] we have to communicate to the verifier what the extent of that critical section is

I agree. It may be better to have a dedicated construct for this.

Another option would be to split the step function into two pieces

I think this complicates things for users. Now the contract requires that they call two functions instead of one.

At this point, my main thought is to try to keep the complexity as low as reasonably possible, both in Copilot's API, in the generated code, and in the implementation.

Imagining that we execute a call to end_critical_section after copying, I'm wondering if there's a simple, clean way to let users customize that if needed, but no require it. Perhaps we can use weak symbols for that. That would keep all existing user code working, require no changes in the API, and not much additional logic in Copilot (and the logic would have no branching or conditions). So long as users can define the end_critical_section to do what they want (without touching the generated code), and your verifier can deal with it, maybe it's a simple enough solution.

Apr 15 '22 23:04 ivanperez-keera

(I said the above without even knowing if this would be generally understood by C99-compliant compilers.)

Apr 15 '22 23:04 ivanperez-keera

Ok, think of using the following:

Main program (test.c):

void end_critical_section() __attribute__((weak));

int main()
{
  if (end_critical_section) end_critical_section();
}

Auxiliary module (aux.c):

#include <stdio.h>

void end_critical_section()
{
  printf("Hello world\n");
}

If we only link test.c, it does not execute the auxiliary function:

$ gcc test.c -o main
$ ./main
$

If we link it all together, then it does:

$ gcc test.c aux.c -o main 
$ ./main 
Hello world

I just checked with GCC and it seems to accept this with -std=c99 enabled.

Thoughts? How does it influence your verification efforts?

EDIT: Cleaning; typos

Apr 15 '22 23:04 ivanperez-keera

That's an interesting idea, and I think it could work. I don't know that we have experimented with how weak symbols interact with our verification setup. I'll try it out and see what happens.

Apr 18 '22 15:04 robdockins

A little reading leads me to believe that weak symbols are not a truly portable feature, but are relatively well supported by GCC and clang. There do seem to be quite a few caveats and some differences in interpretation by system linkers. I'll leave it up to you if that is sufficient for your purposes.

Apr 18 '22 16:04 robdockins

We had a chat about this as well just now.

One of the things I dislike about my proposal of using weak symbols is that the user gets no compile-time indication of whether their code is being picked up and will be called as intended. If they make a mistake (e.g., typo in the function name end_critical_section), they'll get no warning or error from the compiler. (It's a double edged sword, they don't have to define it, but they lose static assurance.)

Apr 18 '22 18:04 ivanperez-keera

Agreed, weak symbols are less fool-proof.

Apr 18 '22 21:04 robdockins

copilot copilot copied to clipboard

Concurrency considerations in generated code

copilot
copilot copied to clipboard