Michael-Scott queue: safe and unsafe versions
This PR follows up on PR #122: the optimizations and changes made there are split in two, to provide both a safe Michael-Scott queue and an "unsafe" one (which uses `Obj.magic`).
TODO:
- [x] For now, the `dscheck` tests for the unsafe version get segmentation faults.
- [x] Documentation
@polytypic: The dscheck tests of the unsafe version result in a segmentation fault. I encountered the same issue with PR #122 after rebasing on main and adding the few necessary changes. Could you have a look?
@polytypic: this PR is ready for review. Could you have a look?
Looking at the "safe" version, the push logic:
```ocaml
(* ... *)
let push { tail; _ } value =
  let rec find_tail_and_enq curr_end node =
    if not (Atomic.compare_and_set curr_end Nil node) then
      match Atomic.get curr_end with
      | Nil -> find_tail_and_enq curr_end node
      | Next (_, n) -> find_tail_and_enq n node
  in
  let new_tail = Atomic.make Nil in
  let newnode = Next (value, new_tail) in
  let old_tail = Atomic.get tail in
  find_tail_and_enq old_tail newnode;
  if not (Atomic.compare_and_set tail old_tail new_tail) then
    fix_tail tail new_tail
```
doesn't have some of the (entirely safe) optimizations from the "unsafe" version:
```ocaml
(* ... *)
let push { tail; _ } value =
  let (Next _ as new_node : (_, [ `Next ]) Node.t) = Node.make value in
  let old_tail = Atomic.get tail in
  let link = Node.as_atomic old_tail in
  if Atomic.compare_and_set link Nil new_node then
    Atomic.compare_and_set tail old_tail new_node |> ignore
  else
    let backoff = Backoff.once Backoff.default in
    push tail link new_node backoff
```
In particular, notice how the "unsafe" version does not call any recursive functions in the fast path:
```ocaml
let push { tail; _ } value =
  let (Next _ as new_node : (_, [ `Next ]) Node.t) = Node.make value in
  let old_tail = Atomic.get tail in
  let link = Node.as_atomic old_tail in
  (* first we try the fast path *)
  if Atomic.compare_and_set link Nil new_node then
    Atomic.compare_and_set tail old_tail new_node |> ignore
  else
    (* failed, so we do the slow path *)
```
It should be possible to incorporate this optimization into the "safe" version and it might speed it up a tiny bit.
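A sketch of how the fast path could be folded into the safe version. The type and helper names below are illustrative, and `fix_tail` is a plausible simplification rather than necessarily the PR's exact code:

```ocaml
(* Illustrative types following the safe version's shape: the queue's
   tail points at the link cell of the (presumed) last node. *)
type 'a node = Nil | Next of 'a * 'a node Atomic.t
type 'a t = { head : 'a node Atomic.t Atomic.t; tail : 'a node Atomic.t Atomic.t }

let create () =
  let link = Atomic.make Nil in
  { head = Atomic.make link; tail = Atomic.make link }

(* Plausible stand-in for the PR's fix_tail: retry moving the tail
   pointer forward as long as new_tail is still the last link. *)
let rec fix_tail tail new_tail =
  let old_tail = Atomic.get tail in
  if Atomic.get new_tail == Nil
     && not (Atomic.compare_and_set tail old_tail new_tail)
  then fix_tail tail new_tail

let push { tail; _ } value =
  let new_tail = Atomic.make Nil in
  let new_node = Next (value, new_tail) in
  let old_tail = Atomic.get tail in
  (* Fast path: the tail pointer is usually accurate, so first try a
     single CAS on the link it points at. *)
  if Atomic.compare_and_set old_tail Nil new_node then
    Atomic.compare_and_set tail old_tail new_tail |> ignore
  else begin
    (* Slow path: walk forward to the real end, as in the safe version. *)
    let rec find_tail_and_enq curr_end node =
      if not (Atomic.compare_and_set curr_end Nil node) then
        match Atomic.get curr_end with
        | Nil -> find_tail_and_enq curr_end node
        | Next (_, n) -> find_tail_and_enq n node
    in
    find_tail_and_enq old_tail new_node;
    if not (Atomic.compare_and_set tail old_tail new_tail) then
      fix_tail tail new_tail
  end
```

When uncontended, this performs exactly one CAS on the link plus one CAS on the tail, and the recursive traversal is only entered after a failed CAS.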
About the missing safe optimizations: I added the fast path in the push function. However, the benchmarks suggest that adding backoff (either in `fix_tail` or `find_tail_and_enq`) does not improve performance (at least on my computer). Did I miss other safe optimizations?
Hmm... One could also leave the backoff out. It is possible that the backoff helps on other CPU (micro)architectures and we can add that later.
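For reference, threading a backoff through the safe slow path could look roughly like the following. The `Backoff` module below is a hypothetical stand-in for Saturn's backoff (the real one performs randomized exponential spin waits), included only to keep the sketch self-contained:

```ocaml
(* Hypothetical stand-in for Saturn's Backoff module, just for this sketch. *)
module Backoff = struct
  type t = int
  let default = 1
  let once t =
    (* spin proportionally to the current count, then double it *)
    for _ = 1 to t do Domain.cpu_relax () done;
    t * 2
end

type 'a node = Nil | Next of 'a * 'a node Atomic.t

(* Safe slow path with a backoff threaded through: back off only when
   the CAS failed due to contention on the same link, not when merely
   walking forward over already-linked nodes. *)
let rec find_tail_and_enq backoff curr_end node =
  if not (Atomic.compare_and_set curr_end Nil node) then
    match Atomic.get curr_end with
    | Nil ->
        let backoff = Backoff.once backoff in
        find_tail_and_enq backoff curr_end node
    | Next (_, n) -> find_tail_and_enq backoff n node
```

Whether this helps is contention- and hardware-dependent, which is consistent with the benchmarks above showing no improvement on one machine.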
Hi there! While I understand and appreciate the hard work that went into optimizing the queue, what is the recommended alternative for grabbing the elements of the queue now that snapshots are gone? A pop loop followed by a push loop?
As a side note, the removal of the snapshot mechanism wasn't just about optimization.
Here is a puzzle about the snapshot mechanism. The code below uses the previous 0.4.1 version of this library, which is the only published version of this library that has the MS queue with the snapshot mechanism and no (additional) space leaks. Let's first define a module alias:
```ocaml
module Q = Saturn.Queue
```
Here is a function that converts a "snapshot" into a list:
```ocaml
let[@tail_mod_cons] rec to_list s =
  match Q.next s with
  | None -> []
  | Some (v, s) -> v :: to_list s
```
Question A: What is the return value of the program below?
```ocaml
let q = Q.create () in
let s = Q.snapshot q in
Q.push q 101;
ignore (Q.pop q);
to_list s
```
Question B: What is the return value of the program below?
```ocaml
let q = Q.create () in
Q.push q 101;
let s = Q.snapshot q in
ignore (Q.pop q);
Q.push q 42;
ignore (Q.pop q);
to_list s
```
<details>
<summary>Click to see answer and explanation</summary>

The first program returns `[]` and the second returns `[101; 42]`. Note that there is no point in time at which the queue in the second program contains both `101` and `42`.

To me, a "snapshot" should be an immutable instant. But that is not how the mechanism worked. Instead, `snapshot` simply returned a pointer to the first non-dummy node (if any) of the queue, and `next` then dereferenced the (mutable) pointer to the next node.

This meant that if the queue happened to be empty, the snapshot would be empty. If the queue happened to be non-empty, the snapshot would contain all elements pushed to the queue at that point and potentially all the elements pushed to the queue in the future.

This behaviour of the snapshot mechanism is, of course, not limited to simple sequential programs: concurrent pushes to the queue are also potentially visible through the "snapshot".

</details>
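The mechanism can be modeled in a few lines. The following is a simplified, sequential sketch (not the actual 0.4.1 implementation, which also involves a dummy node and atomic head/tail updates), but it reproduces the answers to both questions:

```ocaml
type 'a node = Nil | Next of 'a * 'a node Atomic.t

(* head and tail hold the link cells before the first element and
   after the last element, respectively *)
type 'a t = { mutable head : 'a node Atomic.t; mutable tail : 'a node Atomic.t }

let create () =
  let link = Atomic.make Nil in
  { head = link; tail = link }

let push q v =
  let link = Atomic.make Nil in
  Atomic.set q.tail (Next (v, link));
  q.tail <- link

let pop q =
  match Atomic.get q.head with
  | Nil -> None
  | Next (v, n) -> q.head <- n; Some v

(* A "snapshot" is merely the node currently at the head... *)
let snapshot q = Atomic.get q.head

(* ...and next follows the mutable links, so pushes that happen after
   the snapshot was taken remain visible through it. *)
let next = function
  | Nil -> None
  | Next (v, link) -> Some (v, Atomic.get link)
```

In question A, `snapshot` captures the immutable `Nil` value, so the later push is invisible; in question B it captures the node holding `101`, whose mutable link later comes to point at `42`.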
Good to know! Given the nature of a concurrent queue, I was never expecting strict control over the interleaving of calls, but this is clearly the opposite of what was advertised.
I think it'd be nice to have a way to get through the elements in the queue without removing them, basically a deeper peek API.
I might see if I can propose a PR for that.
Note that providing a snapshot operation can be easier for some other lock-free queue approaches.
The "lazy" queue, for example, makes it particularly easy.
The approach using two (essentially immutable) stacks also makes it relatively easy to take a snapshot of the queue. The Picos project has an implementation of a two-stack multi-producer, multi-consumer queue that should be relatively easy to extend with a snapshot operation.
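For instance, here is a sequential sketch of the two-stack idea (not the Picos implementation): because the two lists are immutable, capturing them yields a true point-in-time snapshot, unaffected by later operations. A lock-free version would keep the pair behind a single `Atomic.t` updated with `compare_and_set`.

```ocaml
(* Two-list queue: elements are pushed onto `back` and popped from
   `front`; when `front` runs out, `back` is reversed into it. *)
type 'a t = { mutable front : 'a list; mutable back : 'a list }

let create () = { front = []; back = [] }

let push q v = q.back <- v :: q.back

let pop q =
  match q.front with
  | v :: f -> q.front <- f; Some v
  | [] ->
      (match List.rev q.back with
       | [] -> None
       | v :: f ->
           q.back <- [];
           q.front <- f;
           Some v)

(* The lists themselves are immutable, so this pair is a genuine
   point-in-time snapshot: later pushes and pops cannot change it. *)
let snapshot q = (q.front, q.back)

let snapshot_to_list (front, back) = front @ List.rev back
```

Taking the snapshot is O(1); only converting it to a list costs O(n), and that cost is paid outside the queue's critical operations.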
Also, the queue provided with Kcas has snapshot functionality.