synchronization-benchmarks icon indicating copy to clipboard operation
synchronization-benchmarks copied to clipboard

Use proper CAS instructions

Open apellegr opened this issue 2 years ago • 3 comments

In include/atomics.h, shouldn't this:

        asm volatile(
        "       mov     %[old], %[exp]\n"
        "       cas   %[old], %[val], %[ptr]\n"
        : [old] "=&r" (old), [ptr] "+Q" (*(unsigned long *)ptr)
        : [exp] "Lr" (exp), [val] "r" (val)
        : );

Be:

        asm volatile(
        "       mov     %[old], %[exp]\n"
        "       casal   %[old], %[val], %[ptr]\n"
        : [old] "=&r" (old), [ptr] "+Q" (*(unsigned long *)ptr)
        : [exp] "Lr" (exp), [val] "r" (val)
        : );

apellegr avatar Dec 21 '23 20:12 apellegr

cas64_acquire_release uses this version of cas. Is there a reason to not use this where the al version is needed?

lucasclucasdo avatar Dec 22 '23 04:12 lucasclucasdo

Fair enough. So why are we not using the cas_aquire_release on the cas_ref lock code (eg line 46 and 64), instead of the CAs without acquire and release semantics? val = cas64(lock, val, old);

apellegr avatar Dec 22 '23 04:12 apellegr

cas_lockref is a simplified representation of Linux kernel lockrefs which uses cmpxchg64_relaxed that boils down to a barrierless cas on AArch64 (I believe)

This is a lockable refcount that is mostly only used as a refcount that is infrequently locked and so doesn’t usually need barriers for correct behavior. When locking is actually performed the code falls back on spinlock wrapped modifications to ensure correct ordering. cas_lockref is only trying to increment and decrement the lockable refcount, not lock it.

lucasclucasdo avatar Dec 22 '23 04:12 lucasclucasdo