eio icon indicating copy to clipboard operation
eio copied to clipboard

Make Waiters lock-free

Open talex5 opened this issue 3 years ago • 3 comments

Waiters currently needs a mutex to protect the list of waiting fibers. This prevents us from using it in signal handlers or GC finalizers (see #374). We can't use Lf_queue here because we need to support cancellation (removing waiters from the middle of the queue).

https://arxiv.org/pdf/2111.12682.pdf ("A formally-verified framework for fair synchronization in Kotlin coroutines") could be a good solution, and will probably be faster too.

talex5 avatar Dec 05 '22 17:12 talex5

Instead of making it lock-free, I added a separate Cells module for lock-free users. Promise, Condition, Semaphore and 0-capacity Streams now use that.

The remaining uses are:

  • [x] Switch (which could probably use Single_waiter instead).
  • [ ] Mutex (should be easy to update to Cells).
  • [ ] Streams with non-0-capacity (non-trivial, but should be possible to switch).

talex5 avatar Feb 02 '23 12:02 talex5

Looking at the Mutex API, the protect argument of use_rw seems unnecessary.

-Mutex.use_rw ~protect:false mutex thunk
+Mutex.use_rw mutex thunk
-Mutex.use_rw ~protect:true mutex thunk
+Mutex.use_rw mutex (Cancel.protect thunk)

I guess the intention with the API is to force the user to think about cancellation.

The names, use_rw and use_ro chosen for the operations are not perhaps the most "comprehensive". It is quite possible to have a writing critical section that is still safe to cancel. What is needed is that state is consistent at the points where cancellation might happen.

For example, the common pattern using Condition.await

Mutex.lock mutex;
Fun.protect ~finally:(fun () -> Mutex.unlock mutex) begin fun () ->
  (* modify or examine state, never enter scheduler *)
  while (* some condition *) do
    (* await unlocks the mutex, so state needs to be consistent at this point *)
    Condition.await condition mutex
  done
  (* modify or examine state, never enter scheduler *)
end

should be safe as long as the Condition.await is the only point that enters the scheduler and might raise the cancellation exception.


As background information for others, I've been looking into the synchronization structures of Eio to see how one could implement them using compositional lock-free algorithms (using kcas).

polytypic avatar Apr 13 '23 13:04 polytypic

Yes, it's trying to get users to think about two things:

  • Whether to poison the mutex if an unexpected exception occurs.
  • Whether to allow cancellation.

I'm not super happy about the names either, especially as it makes it sound like a RW lock.

talex5 avatar Apr 14 '23 10:04 talex5