liburing
Socket close in combination with a read/recv is not happening.
What is the expected behavior when a close on a socket happens and there is a pending read/recv?
In my case, I have a io_uring recv on a socket that is waiting for input.
I'm closing the socket; either I call close(fd) directly or I call close through io_uring; in both cases, the close is completed with res=0. So no errors.
I'm observing 2 problems:
- The io_uring recv call doesn't complete with an error. It just stalls. I guess I need to flush out these calls with an IORING_OP_ASYNC_CANCEL?
- No FIN/RST packet is sent by the socket, so the other side isn't notified.
When I don't issue the io_uring recv call, closing either through io_uring or using an explicit close(fd) works perfectly fine and I can see the FIN packet being sent (wireshark).
So it seems that the io_uring recv call is preventing the socket from terminating. Is this expected behavior?
I'm using: Linux 6.4.11-arch2-1
Yes, it is expected. You can either cancel inflight requests or do shutdown(2), which is always recommended.
Note: the same goes for all files, not only sockets. Regrettably, there is no generic shutdown(2) for non-socket files.
The longer story: it comes down to file refcounting. close(2) puts one ref, but each io_uring request holds a ref of its own, so the socket isn't actually destroyed until all requests have completed or been cancelled. The same goes for recv(2) and the other traditional variants, which take a ref for the duration of the call.
So you first issue a shutdown, which forces all requests for that socket to complete (with some error). Once all the requests have returned, the next step is to issue an io_uring close?
I just implemented the shutdown->close flow and it seems to work.
Thanks a lot. I have been banging my head on this for the last few days.
I can see something similar here, on Netty https://github.com/netty/netty-incubator-transport-io_uring/issues/194#issue-1593643674
The issued close won't complete until a poll remove is issued. When that happens, the close completes first, followed by the "errored" poll-in completion (the effect of the poll remove).
Without the poll remove, the close never completes.
What does shutdown() do to io_uring flying requests?
Same thing it does in non-io_uring cases, it cancels them.
I have been annoyed by this phenomenon when the remote TCP endpoint has either gracefully shut down its side or sent a RST segment.
When that happens, I close the socket and create a new one to reconnect. From that point on, io_uring sometimes returns CQEs that relate to the previous socket incarnation. I can detect those old CQEs by the user data they carry and silently discard them, and it is all fine.
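One way to do that kind of filtering is to stamp a per-connection generation counter into user_data. The sketch below is only an illustration of the idea: the 32/32 bit split, the struct and every function name are hypothetical, not anything liburing provides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: high 32 bits = connection generation, low 32 bits = operation tag. */
static inline uint64_t make_user_data(uint32_t generation, uint32_t op)
{
    return ((uint64_t)generation << 32) | op;
}

static inline uint32_t user_data_generation(uint64_t user_data)
{
    return (uint32_t)(user_data >> 32);
}

static inline uint32_t user_data_op(uint64_t user_data)
{
    return (uint32_t)user_data;
}

/* Hypothetical per-connection state kept by the application. */
struct connection {
    int fd;
    uint32_t generation; /* bumped every time the socket is replaced */
};

/* Call after closing the old socket and creating the new one. */
static inline void connection_reconnected(struct connection *c, int new_fd)
{
    assert(new_fd >= 0);
    c->fd = new_fd;
    c->generation++;
}

/* True if the CQE was submitted against a previous socket incarnation. */
static inline bool cqe_is_stale(const struct connection *c, uint64_t user_data)
{
    return user_data_generation(user_data) != c->generation;
}
```

Every SQE would get `make_user_data(conn.generation, op)` as its user_data, and the completion loop would drop any CQE for which `cqe_is_stale()` returns true before dispatching on `user_data_op()`.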
Now, I am entertaining the idea of experimenting with IORING_OP_SENDMSG_ZC and IORING_RECVSEND_FIXED_BUF.
Suppose that there are flying submissions for which IORING_CQE_F_NOTIF has not been seen yet.
While in that state, the user closes the socket.
Is it safe to unregister and dispose of the buffer used by flying IORING_OP_SENDMSG_ZC submissions after the close() call?
No. And it wouldn't be safe even after a shutdown; it's safe only after getting an IORING_CQE_F_NOTIF (if the first CQE had been marked with F_MORE).
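To make that rule concrete, here is a small bookkeeping sketch for a single zero-copy send. It only looks at the CQE flags; the struct and function names are hypothetical, and the fallback defines are there only in case the installed kernel headers predate these flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <linux/io_uring.h>

/* Fallbacks for older kernel headers; values match the kernel UAPI. */
#ifndef IORING_CQE_F_MORE
#define IORING_CQE_F_MORE (1U << 1)
#endif
#ifndef IORING_CQE_F_NOTIF
#define IORING_CQE_F_NOTIF (1U << 3)
#endif

/* Hypothetical per-send state kept by the application. */
struct zc_send {
    bool got_result;  /* the first CQE (the send result) has been seen */
    bool buffer_free; /* the buffer may now be reused or unregistered */
    int res;          /* result carried by the first CQE */
};

/*
 * Feed every CQE that belongs to this send (res and flags from the CQE).
 * Returns true once the buffer is safe to release.
 */
static bool zc_send_on_cqe(struct zc_send *s, int res, unsigned int flags)
{
    if (flags & IORING_CQE_F_NOTIF) {
        /* Notification CQE: the kernel is done with the buffer. */
        assert(s->got_result);
        s->buffer_free = true;
    } else {
        /* First CQE: carries the send result. */
        s->got_result = true;
        s->res = res;
        if (!(flags & IORING_CQE_F_MORE))
            s->buffer_free = true; /* no notification will follow */
    }
    return s->buffer_free;
}
```

Closing or shutting down the socket does not appear anywhere in this logic, which is the point: the buffer's lifetime is tied to the notification CQE, not to the state of the socket.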