Sharing a Uring from a Parent to a Child Process
Hi!
Whilst trying to add some tests for an OCaml library I ran into this weird bug with Eio_linux. The smallest reproduction I have right now is the following code:
let test fs _ =
  ignore (Eio.Path.(load (fs / "./README.md")))

let suite fs =
  let open OUnit2 in
  "Tess" >:: (test fs)

let () =
  Eio_linux.run @@ fun env ->
  OUnit2.run_test_tt_main (suite env#fs)
On my machine this program hangs.
% uname -rv
6.6.82 #1-NixOS SMP PREEMPT_DYNAMIC Sun Mar 9 08:55:04 UTC 202
Switching to Eio_posix and everything is good again.
I've been trying to debug this and so far I have the following findings:
- We get stuck in a Uring.cancel that can never get placed on the queue as there is no space.
- The Uring submission queue seems to be completely full. Perhaps if I left it running long enough it would complete, but something is going wrong and that shouldn't happen. This could be an ocaml-uring bug. By completely full, it seems sqe_ready is reporting 4294967165, which doesn't seem right either; perhaps there is some corruption going on there.
Happy to help debug this if there are any pointers :)
I added a couple of printfs to liburing and got this right at the start:
__io_uring_flush_sq: tail=1 head=3
__io_uring_submit(3, -2, 0, 0)
(submitted is unsigned, but I printed it as signed)
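To make the signed/unsigned confusion concrete, here is a minimal sketch of the arithmetic. It assumes the queue counters are 32-bit unsigned values (which the 4294967165 figure suggests) and a 64-bit OCaml, where a native int can hold them. The helper names as_u32 and as_i32 are made up for illustration and are not part of any library.

```ocaml
(* Reinterpret a (possibly negative) int as an unsigned 32-bit value.
   Assumes a 64-bit OCaml, where native ints are 63 bits wide. *)
let as_u32 n = n land 0xFFFF_FFFF

(* Reinterpret an unsigned 32-bit value as a signed 32-bit value. *)
let as_i32 n = if n >= 0x8000_0000 then n - 0x1_0000_0000 else n

let () =
  (* Values from the printf output above: tail=1, head=3. *)
  let tail = 1 and head = 3 in
  Printf.printf "tail - head as u32: %d\n" (as_u32 (tail - head));
  (* The sqe_ready value reported earlier in the thread. *)
  Printf.printf "4294967165 as i32: %d\n" (as_i32 4294967165)
```

So a tail that sits behind the head yields a huge "ready" count rather than a negative one, which would fit the queue looking permanently full.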
Probably best to step through it with gdb and see what's happening.
Also, perhaps ounit is forking subprocesses? That might be causing odd effects.
Indeed, the default behaviour for ounit2 is to use Unix.fork. If you pass -runner sequential to the executable, everything works. Is this expected behaviour? I'm pretty sure I remember people saying you can't use Unix.fork in multi-domain programs, but this program isn't multi-domain as far as I know.
A smaller repro with just Uring
let queue_read uring fd buf =
  let req = `R in
  let job = Uring.read ~file_offset:Optint.Int63.zero uring fd buf req in
  job, req

let () =
  let uring = Uring.create ~queue_depth:64 () in
  let read_to_never_complete, read_request_to_never_complete =
    match queue_read uring Unix.stdin (Cstruct.create 1) with
    | Some j, req -> j, req
    | None, _ -> assert false
  in
  let rec cancel () =
    match Uring.cancel uring read_to_never_complete read_request_to_never_complete with
    | Some _ -> Format.printf "Job cancelled\n%!"
    | None ->
      Format.printf "Cancel not queued\n%!";
      let _ : int = Uring.submit uring in
      Format.printf "SQE READY: %i\n%!" (Uring.sqe_ready uring);
      cancel ()
  in
  let rd, wr = Unix.pipe () in
  match Unix.fork () with
  | 0 ->
    (* Do something to the Uring to change the head and tail of the queue *)
    let fd = Unix.openfile Sys.argv.(1) [ Unix.O_RDONLY ] 0o644 in
    let job, _req = queue_read uring fd (Cstruct.create 1) in
    let job2, _req = queue_read uring fd (Cstruct.create 1) in
    assert (Option.is_some job && Option.is_some job2);
    let i = Uring.submit uring in
    assert (i = 3); (* job, job2 and read_to_never_complete *)
    let _ : int = Unix.write_substring wr "1" 0 1 in
    ()
  | _ ->
    (* Wait for the child and then try to cancel read_to_never_complete *)
    let buf = Bytes.create 1 in
    let _ = Unix.read rd buf 0 1 in
    Format.printf "Got: %s" (Bytes.to_string buf);
    cancel ()
(I don't have the ability to move this issue over to ocaml-uring)
Having both child and parent use the ring after forking seems very likely to fail. Is ounit doing that? Typically the child needs to avoid using the same resources as the parent (e.g. buffered readers), and must exit with Unix._exit to avoid cleaning up the parent's resources.
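As a rough illustration of that pattern, here is a sketch using only the Unix library (OCaml 4.12 or later for Unix._exit). The helper name run_in_child and the exit codes 125 and 255 are arbitrary choices for this example, not anything ounit or Eio does.

```ocaml
(* Run [f] in a forked child and return its exit code to the parent.
   Requires the unix library. *)
let run_in_child (f : unit -> int) : int =
  (* Flush first so buffered output is not duplicated in the child. *)
  flush stdout;
  flush stderr;
  match Unix.fork () with
  | 0 ->
    (* Child: [f] should only touch resources it creates itself,
       never a ring or buffered reader inherited from the parent. *)
    let code = try f () with _ -> 125 in
    (* _exit skips at_exit handlers, so the parent's cleanup code
       does not run a second time in the child. *)
    Unix._exit code
  | pid ->
    let rec wait () =
      match Unix.waitpid [] pid with
      | _, Unix.WEXITED code -> code
      | _, (Unix.WSIGNALED _ | Unix.WSTOPPED _) -> 255
      | exception Unix.Unix_error (Unix.EINTR, _, _) -> wait ()
    in
    wait ()
```

A child that needs io_uring would then create its own ring inside f rather than reuse the one it inherited.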
Should we just take a hammer to this failure case by registering a pthread_atfork handler in the uring bindings to invalidate the ring in the child process? It seems reasonable to insist that any child processes re-initialise a fresh ring to the kernel rather than inherit the parent one.
They mmap the uring here: https://github.com/axboe/liburing/blob/master/test/across-fork.c
Yes, seems reasonable to warn about child processes using the parent's ring by default. Might be a bit of work though; you'd probably need to keep a C linked-list of urings to invalidate or something.
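A cheaper approximation that avoids the C-side bookkeeping might be to remember the creating process's pid and check it on every use. The sketch below is only an assumption about how that could look at the OCaml level: the owned type, own, get and Used_after_fork are hypothetical names and not part of ocaml-uring. Unlike a pthread_atfork handler it costs a getpid call per access.

```ocaml
(* Hypothetical wrapper tying a value (e.g. a ring) to the process
   that created it. Requires the unix library. *)
type 'a owned = { value : 'a; owner_pid : int }

exception Used_after_fork of { created_in : int; used_in : int }

(* Record the current pid alongside the wrapped value. *)
let own (value : 'a) : 'a owned = { value; owner_pid = Unix.getpid () }

(* Return the wrapped value, or raise if we are in a different
   process from the one that called [own]. *)
let get (t : 'a owned) : 'a =
  let pid = Unix.getpid () in
  if pid <> t.owner_pid then
    raise (Used_after_fork { created_in = t.owner_pid; used_in = pid })
  else t.value
```

With this, a forked child that touches the inherited value gets an exception straight away instead of silently corrupting the shared queue state.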