mono icon indicating copy to clipboard operation
mono copied to clipboard

Add checks for OS/coop lock mixing to checked builds

Open CoffeeFlux opened this issue 6 years ago • 2 comments

Currently, we have a mix of OS and coop-aware locks used throughout the runtime, which can lead to deadlocks. The scenario where that happens is described here, the important parts being:

On second thought, OS locks are fine.... provided you can prove there's no safepoint while you hold the lock

What's bad is if we have a lock hierarchy with an OS lock taken before a coop lock

This seems reasonable to ensure in checked builds, and could potentially rid us of a class of rare, inconsistent deadlocks with no performance implications for normal builds.

CoffeeFlux avatar Feb 18 '20 18:02 CoffeeFlux

I gave it a try in https://github.com/mono/mono/pull/18995 by instrumenting OS locks to use the coop state machine's "no safepoints" bit to flag when an OS lock is locked. The code ended up being a bit intrusive because of recursive mutexes and the need to keep track of which call to lock is responsible for flipping the state bit.

In the end we ended up not merging the code, but we could revisit in the future

lambdageek avatar Mar 06 '20 17:03 lambdageek

If we believe that specifically "locked OS lock ; locked coop lock" is a bad combination that is what we want to protect against, we could try another approach.

Avoid the coop state machine and use a TLS counter when taking an OS lock, and check whether it is > 0 when taking a coop lock.

We wouldn't need a cookie in the OS lock for recursive locking - we could just increment the TLS counter. There will still be complications with condition variables, fast paths via trylock, however, but those are not as bad.

lambdageek avatar Mar 06 '20 17:03 lambdageek