haskell-lockfree icon indicating copy to clipboard operation
haskell-lockfree copied to clipboard

CAS fails with newly created ticket

Open Lysxia opened this issue 7 years ago • 8 comments

Here I create a new IORef, get a ticket and immediately compare-and-swap. I expect that to succeed and thus the resulting boolean to be True, but it is actually False, regardless of optimization levels. The following features seem relevant, and removing any of them results in the output being True again:

  • the newtype T
  • binding the initial constant (to be put in the IORef) as an INLINE variable (surprisingly, either removing the INLINE pragma or inlining zero by hand make the result True)
  • forcing the ticket before the CAS.

What is wrong here?

import Data.IORef
import Data.Atomics

newtype T = T Int

zero :: T
zero = T 0
{-# INLINE zero #-}

main = do
  x <- newIORef zero
  ticket <- readForCAS x
  (b, _) <- ticket `seq` casIORef x ticket (T 1)
  print b

Lysxia avatar Jan 21 '18 20:01 Lysxia

I noticed it returns False for both ghc 8.0 and 8.6, and latest atomic-primops. Didn't try others.

jberryman avatar Aug 11 '19 23:08 jberryman

Also, maybe https://github.com/rrnewton/haskell-lockfree/issues/53 ?

jberryman avatar Aug 11 '19 23:08 jberryman

Forcing a ticket really never makes sense the way tickets are defined. The strictness analyzer can get hold of this and say

  T (I# tick) <- readForCAS x
  (b, _) <- casIORef x (I# tick) (T 1)

All this unsafeCoerce# business is very flaky. It makes much more sense to use lazy (see #79). The only way I can see to avoid trouble when someone forces tickets is to make Ticket a datatype instead of a newtype, but then users had better be careful to be strict enough to make sure they unbox, or risk CAS failures because of allocation and so on. Sigh.

treeowl avatar Sep 22 '20 20:09 treeowl

I'm really quite convinced that making Ticket a datatype is the correct solution. I've opened #86 to do that. Yes, if things don't unbox that creates a risk of CAS failing, but even that won't be nearly as catastrophic as how they can fail today.

treeowl avatar Mar 03 '23 05:03 treeowl

Yes, if things don't unbox that creates a risk of CAS failing

Does that mean that something like atomicModifyIORefCAS Might Loop forever?

jberryman avatar Mar 13 '23 14:03 jberryman

@jberryman That's always a risk in a lock-free context—there's no fairness guarantee. But in practice it shouldn't. Even if code takes a little longer than it should between read and CAS, it should eventually go through.

treeowl avatar Mar 13 '23 16:03 treeowl

But I interpreted what you wrote to mean that if atomicModifyIORefCAS say is implemented in such a way that Ticket is not unboxed, then it would always loop forever as the CAS can never succeed, in other words would be a completely broken implementation. is that right?

jberryman avatar Mar 13 '23 17:03 jberryman

No, that's not right. The only risk if the ticket is not unboxed is that there will be a longer delay between the read and the CAS, since a (two word) heap object will be allocated and the read value stored in and then retrieved from it. In a high-contention environment, that could cause excessive CAS misses, but it shouldn't cause permanent failure.

treeowl avatar Mar 13 '23 18:03 treeowl