ekg-core Atomic C functions are not atomic.

hs_atomic_read and hs_atomic_write aren't atomic at all on machines without strong memory models (total read and store ordering). To see an example of the kind of havoc this bad assumption can cause, see this (thankfully now fixed) GHC issue https://gitlab.haskell.org/ghc/ghc/-/issues/15449

Here's the aarch64 code that GCC 8.2 yields for these functions: https://godbolt.org/z/giVdtx No barriers or acq/rel lda/sta emitted...

I'm curious why C functions are used for these atomic operations at all. Are these functions really that much faster than atomicModifyIORef?

Jun 25 '20 16:06 TravisWhitaker

The Atomic values that we operate on with these C functions are in a ForeignPtr https://github.com/tibbe/ekg-core/blob/f8d26b9a3806125694c25b9459ec0c4a0b2e87ce/Data/Atomic.hs#L22

Since IORef is just a MutVar#, using these C functions might even be slower than an IORef.

I'd be happy to take a crack at moving over to IORef, but perhaps there's something I'm missing (or I'm standing on the grave of some really tricky bug...).

Jun 25 '20 16:06 TravisWhitaker

IIRC I measured it and it was much faster. In particular a MutVar# update implies allocation for the heap value it refers to.

Jun 25 '20 21:06 tibbe

Yikes, for some reason I thought that a MutVar's contents would be unboxed if the contained type was unboxed, but it seems that's not the case: https://gitlab.haskell.org/ghc/ghc/-/blob/master/includes/rts/storage/Closures.h#L181

MutableByteArray# has the required atomic operations, though. So we could sprinkle __sync__synchronize() around, or give MutableByteArray# a spin.

Jun 25 '20 23:06 TravisWhitaker

@tibbe is there any interest in fixing this bug (via #42 or otherwise)? As-is, ekg is totally broken on aarch64.

CC @23Skidoo

Mar 24 '21 23:03 TravisWhitaker

I'll look into cutting new releases during the weekend.

Mar 25 '21 07:03 23Skidoo

I’d definitely like to get more eyes on #42 before it lands in a release. I’ve been using it since I opened the PR, but my use cases might not hit certain issues.

Mar 25 '21 17:03 TravisWhitaker