eio
eio copied to clipboard
Nested Fiber.any race condition
Hey all, thanks for the awesome work on Eio!
I have a weird race condition with nested Fiber.any and I don't quite understand it.
Here is a simplified example of the code that reproduces the situation:
Eio_main.run (fun env ->
let promise_r, promise_u = Eio.Promise.create () in
let res =
Eio.Fiber.any
~combine:(fun x y -> x + y)
[
(fun () ->
(* function 1 *)
(* let the second function start *)
Eio.Time.sleep env#clock 1.0;
Eio.Promise.resolve promise_u 1;
Printf.printf "Promise resolved\n%!";
10);
(fun () ->
Eio.Fiber.any
~combine:(fun x y -> x + y)
[
(fun () ->
(* function 2 *)
Eio.Time.sleep env#clock 10.;
2);
(fun () ->
(* function 3 *)
let x = Eio.Promise.await promise_r in
Printf.printf "Promise returned\n%!";
x);
]);
]
in
Printf.printf "Result: %d\n" res)
The output from this is:
Promise resolved
Promise returned
Result: 10
Although, what I would expect from this is Result: 11. From my understanding, when the promise is resolved and "function 1" returns and other function are being checked if they are "unlocked". The "function 3" is unlocked and it returns, so it would seem like the nested Fiber.any should return in a whole and combine on the first Fiber.any should be invoked with values 10 and 1.
But seems like the second Fiber.any is cancelled and only the first function actually returns.
Can someone explain what's happening here?
It's due to Fiber.any explicitly checking whether it's been cancelled before returning:
https://github.com/ocaml-multicore/eio/blob/8f7f82d2c12076af8e9b8b365c58ebadaa963b8c/lib_eio/core/fiber.ml#L152-L154
I think it might be reasonable to change that to:
match !r, Cancel.get_error parent_c with
| OK r, _ -> r
| New, Some ex -> raise ex
That would prioritise not losing a calculated value, rather than cancelling quickly.
This patch did solve our problem @talex5. Do you think there are any negative externalities of this change?
I don't think so.
Originally, Fiber.any didn't have a combine argument, so it was always expected that it was OK to throw away either result, and aborting early made sense. But with combine that's no longer true.