Explicit Atomics Needed In FEB Implementation
The current implementation of FEB synchronization depends on the guarantees of x86 loads and stores instead of using explicit atomics. For example, running the hello_world_multi test with ThreadSanitizer enabled reports numerous data races. One example:
https://github.com/sandialabs/qthreads/blob/d6ce514a70c65b74c5e04906615ec51c7f288e0f/src/feb.c#L470
https://github.com/sandialabs/qthreads/blob/d6ce514a70c65b74c5e04906615ec51c7f288e0f/src/feb.c#L1258
Another example, in aligned_writeFF_basic: the code at https://github.com/sandialabs/qthreads/blob/d6ce514a70c65b74c5e04906615ec51c7f288e0f/src/feb.c#L1155 can be run by different threads simultaneously, which results in a racy write to the same address regardless of what we're doing with memory barriers.
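To illustrate why barriers alone don't help with the writeFF case: under the C11 memory model, two unsynchronized plain stores to the same location are a data race even if each one is surrounded by fences. Below is a minimal sketch of making the store itself atomic. `demo_word_t`, `demo_write_ff`, and `demo_read` are hypothetical names for illustration only, not the qthreads API, and the wait-for-full step of a real writeFF is omitted. The final value is still whichever writer lands last, but the behavior is defined.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical payload word for illustration; qthreads has its own
 * word type and FEB bookkeeping, which are not modeled here. */
typedef _Atomic(uint64_t) demo_word_t;

/* writeFF-style store sketch (assumption: the full/empty check is
 * handled elsewhere). A relaxed atomic store keeps concurrent writers
 * from racing in the C11 sense; it does not order them against each
 * other, so the last store to land wins. */
static void demo_write_ff(demo_word_t *dest, uint64_t src) {
    atomic_store_explicit(dest, src, memory_order_relaxed);
}

/* Matching relaxed load for any thread that reads the same word. */
static uint64_t demo_read(demo_word_t *src) {
    return atomic_load_explicit(src, memory_order_relaxed);
}
```

Relaxed ordering is used here only because the sketch models the payload store in isolation; any ordering relative to the full/empty status would need stronger ordering on that status word.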
Alright, the hello_world_multi failure seems to have been resolved by a different PR. The writeFF test deliberately creates a data race, so we're planning to suppress that one. Closing this.
Reopening this one. I missed a few things that need to be atomic because there are cases that only get hit with the distrib and sherwood schedulers.
Examples:
Read: https://github.com/sandialabs/qthreads/blob/85c5389c4a89707515ea8a6a83eaf4defe7e6927/src/qthread.c#L493
Write: https://github.com/sandialabs/qthreads/blob/85c5389c4a89707515ea8a6a83eaf4defe7e6927/src/feb.c#L177
This one looks like insufficient memory fencing for FEBs. There was probably some synchronization inside the nemesis scheduler that was covering it up.
Read: https://github.com/sandialabs/qthreads/blob/85c5389c4a89707515ea8a6a83eaf4defe7e6927/test/basics/hello_world.c#L39
Write: https://github.com/sandialabs/qthreads/blob/85c5389c4a89707515ea8a6a83eaf4defe7e6927/src/feb.c#L1021
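For reference, here is a rough sketch of the ordering a read/write pair like this needs: the writer publishes the payload and then sets the status with release ordering, and the reader checks the status with acquire ordering before touching the payload. The `demo_feb_t` type and the `demo_*` functions are made up for illustration and do not reflect how feb.c actually stores its status bits.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical FEB-like cell: a status flag plus a plain payload. */
typedef struct {
    atomic_int full;  /* 1 = full, 0 = empty */
    uint64_t value;   /* plain (non-atomic) payload */
} demo_feb_t;

static void demo_feb_init(demo_feb_t *f) {
    atomic_init(&f->full, 0);
    f->value = 0;
}

/* Writer side: store the payload first, then mark the cell full.
 * The release store keeps the payload write from being reordered
 * after the status update. */
static void demo_fill(demo_feb_t *f, uint64_t v) {
    f->value = v;
    atomic_store_explicit(&f->full, 1, memory_order_release);
}

/* Reader side: non-blocking check. The acquire load pairs with the
 * release store above, so a reader that sees full == 1 also sees the
 * payload written before it. Returns false if the cell is empty. */
static bool demo_try_read_ff(demo_feb_t *f, uint64_t *out) {
    if (!atomic_load_explicit(&f->full, memory_order_acquire)) {
        return false;
    }
    *out = f->value;
    return true;
}
```

On x86 the plain-load/plain-store version of this often happens to work because of the hardware's strong ordering, which is consistent with the race only showing up under the sanitizer or with a different scheduler.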
I think these were all fixed by https://github.com/sandialabs/qthreads/pull/250, though the Sherwood scheduler hasn't been fully cleared of sanitizer errors yet.