
atomic_ref based atomics are too strong

Open bernhardmgruber opened this issue 1 year ago • 2 comments

The CPU atomic implementation based on std::atomic_ref uses sequentially consistent memory ordering, which is a stronger guarantee than that of its CUDA counterparts, which are weakly ordered and always require explicit fences. Therefore, the CPU atomics should also be weakened to a relaxed memory order, potentially improving performance on CPUs.

bernhardmgruber avatar Sep 19 '23 17:09 bernhardmgruber

I think on x86 they are the same; IIRC it is only ARM and Power that have weaker atomics.

fwyzard avatar Sep 19 '23 20:09 fwyzard

From Hans Boehm's talk at CppCon:

image

We can still avoid a fence on x86 with weakly ordered atomics.

bernhardmgruber avatar Sep 20 '23 08:09 bernhardmgruber