eio
eio copied to clipboard
Eio.Flow.read hangs on closed flow
Not sure if I'm doing something wrong, but I've run into the following issue (on the Luv backend).
A fiber waiting on a Eio.Flow.read
before another fiber closes it will hang forever, at least for a client type Eio.Net.connect
flow.
I've managed to trim it down to the following case:
reproduce.ml
open Eio.Std
let main env =
let run_client ~net ~addr =
traceln "client on";
Switch.run (fun sw ->
let flow = Eio.Net.connect ~sw net addr in
traceln "client <-> server";
let read_loop () =
let b = Cstruct.create 4 in
traceln "client will be stuck after this";
while true do
Eio.Flow.read flow b |> ignore;
traceln "unreachable"
done
in
let d = "Hello from client" in
traceln "client -> %S" d;
Eio.Flow.copy_string d flow;
Fiber.fork ~sw read_loop;
Eio.Flow.close flow);
traceln "client off"
in
let run_server socket =
traceln "server on";
Switch.run (fun sw ->
Eio.Net.accept_sub socket ~sw
(fun ~sw:_ flow _addr ->
traceln "server <-> client";
let b = Buffer.create 100 in
Eio.Flow.copy flow (Eio.Flow.buffer_sink b);
traceln "server <[EOF]- %S" (Buffer.contents b))
~on_error:(traceln "Error handling connection: %a" Fmt.exn));
traceln "server off"
in
let net = Eio.Stdenv.net env in
let addr = `Tcp (Eio.Net.Ipaddr.V4.loopback, 9999) in
Switch.run @@ fun sw ->
let server = Eio.Net.listen net ~sw ~reuse_addr:true ~backlog:5 addr in
Fiber.both (fun () -> run_server server) (fun () -> run_client ~net ~addr)
Where the expected output would be:
+server on
+client on
+client <-> server
+client -> "Hello from client"
+server <-> client
+client will be stuck after this
+server <[EOF]- "Hello from client"
+server off
*hangs forever*
Indeed, it seems that libuv does not call the read callback in this case:
let or_fail = function
| Ok x -> x
| Error e -> failwith (Luv.Error.strerror e)
let () =
let a, b = Luv.TCP.socketpair `STREAM 0 |> or_fail in
let tcp_a = Luv.TCP.init () |> or_fail in
let tcp_b = Luv.TCP.init () |> or_fail in
Luv.TCP.open_ tcp_a a |> or_fail;
Luv.TCP.open_ tcp_b b |> or_fail;
Luv.Stream.read_start tcp_a (fun _ -> print_endline "read a ready!");
print_endline "close";
Luv.Handle.close tcp_a ignore;
let t = Luv.Timer.init () |> or_fail in
Luv.Timer.start t 1000 (fun () -> print_endline "timer done") |> or_fail;
ignore (Luv.Loop.run () : bool)
Output is just "close/timer done" (no "read a ready").
strace shows:
epoll_ctl(5, EPOLL_CTL_DEL, 3, 0x7ffd474327f4) = -1 ENOENT (No such file or directory)
close(3) = 0
It would be nice if this worked, but (depending on what you're trying to do) it's probably cleaner to replace the close with shutdown, e.g.
Eio.Flow.shutdown flow `Send
Then the server will read end-of-file and display the message from the client, then close its own connection, which will cause the client to read end-of-file. If the client wants to cancel its own read, it can use Eio's cancellation (Switch.fail
, etc) rather than relying on closing the FD to do it.
Thanks, using shutdown
instead does seem to sidestep the issue 👍
Funny thing, for reference, the source of my problem ended up being a Tls module loosely based on yours: https://gitlab.com/talex5/gemini-eio/-/blob/master/src/tls_eio.ml#L128
Replacing that close
call with a shutdown `Send
made everything work 🙌
Funny thing, for reference, the source of my problem ended up being a Tls module loosely based on yours:
Hmm, yes, that Tls wrapper should probably now implement Flow.two_way
and provide a shutdown
operation that sends the close notify and then calls shutdown
on the underlying flow.
Then Tls_eio.close
can just close the underlying flow immediately
In fact, I think continuing to wait is the correct behaviour. close
is an operation on the file descriptor, not on the file (socket) itself. The socket itself is shut down when there are no more users of it, but the read is a valid user.
Closing, as not a bug.
Note: there's now also a proper tls-eio package (https://github.com/mirleft/ocaml-tls/pull/451).