async
async copied to clipboard
Should thread IDs be stored weakly?
GHC.Conc.Sync
currently says
-- ToDo: data ThreadId = ThreadId (Weak ThreadId#)
-- But since ThreadId# is unlifted, the Weak type must use open
-- type variables.
That seems unlikely to happen, especially now that we have threadStatus
(which needs to be able to determine how a thread died).
However, the condition that comment waits for has arrived—we can now have values of type Weak# ThreadId#
, which are just slightly more efficient than the Weak# ThreadId
values we could have before. No, I don't know why that was considered important. Anyway, the question arises as to whether Async
s should hold their ThreadId#
s weakly. That would mean doing something like
data WeakThreadId = WeakThreadId
{ tid_number :: !Word64
, tid_weak :: Weak# ThreadId#
}
The thread ID number is needed to support compareAsyncs
and the Ord (Async a)
instance.
Making this change would necessitate an API change: the asyncThreadId
function (defined as a field accessor) would have to be removed, in favor of something like
asyncThreadIdMaybe :: Async a -> IO (Maybe ThreadId)
One big question in my mind: garbage collecting Weak#
s can be pretty expensive. Is releasing threads earlier worth that cost?
My guess is it probably isn't worth it. A completed ThreadId
doesn't take much space.
@teofilC points out that in certain cases failing to release threads prevents deadlock detection. I have no idea how that works, so can't comment on the significance. See this comment. BTW, the whole situation around "indefinitely blocked on MVar/STM" seems very complicated and subtle, and I don't even know where to find good documentation. Can you point me in the right direction?
There's a bit of documentation here https://hackage.haskell.org/package/base-4.18.0.0/docs/Control-Concurrent.html#g:14
In particular it's a debugging feature so it doesn't have a well-defined semantics. To have a semantics would mean specifying the garbage collector, the code generator, the simplifier... basically everything, because it depends on exactly which closures contain which references and how the GC traces things.
The emergent behaviour is subtle and surprising, indeed I've been surprised by it myself on many occasions. That's obviously not great, but I also don't have any good ideas for how to make it better, and it is still a useful feature.