Inadequate Atomic Guarantees In QTHREAD_FASTLOCK_LOCK
QTHREAD_FASTLOCK_LOCK and the related functions need to be rewritten to use explicit atomic reads and writes instead of relying on the atomicity that x86 happens to provide for plain loads and stores. For example, there's currently a race condition between
https://github.com/sandialabs/qthreads/blob/d6ce514a70c65b74c5e04906615ec51c7f288e0f/src/feb.c#L986 and https://github.com/sandialabs/qthreads/blob/d6ce514a70c65b74c5e04906615ec51c7f288e0f/src/qthread.c#L652.
Note: this race shows up readily in the hello_world_multi test.
The same kind of race also shows up inside the hash map when running the syncvar_prodcons test with ThreadSanitizer enabled. Specifically:
https://github.com/sandialabs/qthreads/blob/d6ce514a70c65b74c5e04906615ec51c7f288e0f/src/hashmap.c#L412 and https://github.com/sandialabs/qthreads/blob/d6ce514a70c65b74c5e04906615ec51c7f288e0f/src/hashmap.c#L219
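
To make the proposed direction concrete, here is a minimal sketch of a spinlock written with explicit C11 atomics. It is not the existing qthreads code and not a drop-in replacement: the names `sketch_lock_t`, `sketch_lock_init`, `sketch_lock_try`, `sketch_lock_acquire`, and `sketch_lock_release` are hypothetical, and the sketch assumes a compiler with `<stdatomic.h>` support. The point is only that every access to the lock word goes through an atomic operation with a stated memory order, so correctness does not depend on how a particular architecture treats plain loads and stores.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lock type for illustration; not the qthreads fastlock. */
typedef struct {
    atomic_int state; /* 0 = unlocked, 1 = locked */
} sketch_lock_t;

static inline void sketch_lock_init(sketch_lock_t *l) {
    assert(l != NULL);
    atomic_init(&l->state, 0);
}

/* Try to take the lock once. Returns true if this caller now owns it.
 * The acquire ordering keeps accesses in the critical section from being
 * reordered before the exchange. */
static inline bool sketch_lock_try(sketch_lock_t *l) {
    return atomic_exchange_explicit(&l->state, 1, memory_order_acquire) == 0;
}

/* Spin until the lock is taken. While waiting, poll with a relaxed atomic
 * load (not a plain read) so the wait loop itself is free of data races. */
static inline void sketch_lock_acquire(sketch_lock_t *l) {
    while (!sketch_lock_try(l)) {
        while (atomic_load_explicit(&l->state, memory_order_relaxed) != 0) {
            /* busy-wait */
        }
    }
}

/* Release the lock. The release ordering makes writes from the critical
 * section visible to the next thread that acquires the lock. */
static inline void sketch_lock_release(sketch_lock_t *l) {
    atomic_store_explicit(&l->state, 0, memory_order_release);
}
```

Any field that is read outside the lock while another thread writes it under the lock would need the same treatment (an atomic type plus explicit loads and stores). Otherwise ThreadSanitizer will still report the access as a data race, even if the lock itself is correct.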