Zero copy file transferring (via FileRegion) is disabled for IoUring transport.
Hello dear Vert.x team.
As far as I can see, file region support is disabled in IoUringTransport:
public class IoUringTransport implements Transport {
    ...
    @Override
    public boolean supportFileRegion() {
        return false;
    }
    ...
which means that Http1xServerResponse.sendFile will always fall back to the read/write model instead of zero copy.
Since Netty directly supports file regions over io_uring (via IoUringFileRegion), I think it would be a good idea to enable this support in Vert.x as well. What do you think?
P.S. I tried to "hack" the transport so that this getter returns true (via Unsafe), and IoUringFileRegion started to work, but with a constant FD leak: the descriptors were never closed. At first I thought it was a bug in IoUringFileRegion, but reactor-netty also uses IoUringFileRegion and does not leak. That means my hack was not enough, and FileRegion closure/release also needs to be handled somewhere. Given your extensive experience with Netty and DefaultFileRegion, I expect handling IoUringFileRegion will not be a big problem for you.
Thank you for bringing such perfect software as Vert.x to the JVM community)
@doxlik have you tried modifying your own build of Vert.x to make this work? I can see that io_uring file regions are created by AbstractIoUringStreamChannel#filterOutboundMessage, which automatically turns a default file region into an io_uring file region when enabled.
I'm also wondering whether the Vert.x transport should return IoUring.isSpliceSupported() instead.
@vietj Hi, thank you for the investigation.
I think I found the issue. I will try to clarify each point; if anything is unclear, feel free to ask.
- Unfortunately, due to lack of time, I initially made it work via an unsafe hack (forcing the method to return true) instead of doing a proper local build.
- The functionality itself actually works automatically for DefaultFileRegion: Netty internally turns it into IoUringFileRegion. However, the main problem is an FD leak. (P.S. I found the cause, described below.)
- Regarding IoUring.isSpliceSupported(): this is indeed a good idea. As far as I understand, if we report file-region support but splice is not supported, it will lead to an error. So this condition looks correct for the final implementation.
P.S. In my environment it did not fail because splice support was auto-detected by Netty and was available on my system.
Main point
While simply changing IoUringTransport#supportFileRegion to return IoUring.isSpliceSupported() does enable the zero-copy path, there is still a constant FD leak. After long debugging, I found the issue here in IoUringFileRegion:
final class IoUringFileRegion implements FileRegion {
    ...
    final DefaultFileRegion fileRegion;
    ...
    IoUringFileRegion(DefaultFileRegion fileRegion) {
        this.fileRegion = fileRegion;
    }
    ...
    @Override
    public boolean release() {
        if (fileRegion.release()) {
            closePipeIfNeeded();
            return true;
        }
        return false;
    }
    private void closePipeIfNeeded() {
        if (pipe != null) {
            closeSilently(pipe[0]);
            closeSilently(pipe[1]);
        }
    }
It closes pipes only if fileRegion.release() returns true.
During debugging I observed it returning false, so no pipes are closed and the FDs leak.
Then I executed the request again. Before the if, I manually called fileRegion.release() in the debugger and it returned false. Later, when Netty called fileRegion.release(), it returned true and the FDs were successfully closed. This suggests the refCount of the original file region is incremented by one more than necessary, so the first release attempt does not actually free resources.
I believe the root cause is in VertxConnection#sendFileRegion:
private void sendFileRegion(FileChannel fc, long offset, long length, ChannelPromise writeFuture) {
    if (length < MAX_REGION_SIZE) {
        FileRegion region = new DefaultFileRegion(fc, offset, length);
        // Retain explicitly this file region so the underlying channel is not closed by the NIO channel when it
        // as been sent as the caller can need it again
        region.retain();
        writeToChannel(region, writeFuture);
    } else {
        ChannelPromise promise = chctx.newPromise();
        FileRegion region = new DefaultFileRegion(fc, offset, MAX_REGION_SIZE);
        // Retain explicitly this file region so the underlying channel is not closed by the NIO channel when it
        // as been sent as we need it again
        region.retain();
        ...
    }
}
This retain() keeps the original region alive and prevents the correct release. I do see the comments, and this behavior does not reproduce for NIO and Epoll, only for io_uring. Most likely Netty handles the extra refCount internally in DefaultFileRegion, but that does not help IoUringFileRegion: its release() is called only once and does nothing, because the original file region's release() returns false at that point.
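To make the suspected interaction easier to follow, here is a minimal model of it that does not use Netty at all. CountedRegion and PipeOwningRegion are hypothetical stand-ins for DefaultFileRegion and IoUringFileRegion, and the sketch assumes the wrapper's release() is called exactly once after the write, as observed in the debugger. It only illustrates the reference-counting logic, not the real classes.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class RefCountLeakDemo {
    /** Releases the wrapper once and reports whether the (simulated) pipe was closed. */
    static boolean pipeClosedAfterSingleRelease(boolean extraRetain) {
        CountedRegion region = new CountedRegion();   // refCnt starts at 1
        if (extraRetain) {
            region.retain();                          // models the retain() in sendFileRegion
        }
        PipeOwningRegion wrapper = new PipeOwningRegion(region);
        wrapper.release();                            // assumption: released exactly once
        return wrapper.pipeClosed;
    }

    public static void main(String[] args) {
        System.out.println("without extra retain: " + pipeClosedAfterSingleRelease(false)); // true
        System.out.println("with extra retain:    " + pipeClosedAfterSingleRelease(true));  // false
    }
}

/** Hypothetical stand-in for a ref-counted file region (not Netty's DefaultFileRegion). */
class CountedRegion {
    private final AtomicInteger refCnt = new AtomicInteger(1);

    void retain() {
        refCnt.incrementAndGet();
    }

    /** Returns true only when the count reaches zero. */
    boolean release() {
        return refCnt.decrementAndGet() == 0;
    }
}

/** Hypothetical stand-in for the wrapper that owns the pipe FDs. */
class PipeOwningRegion {
    private final CountedRegion delegate;
    boolean pipeClosed;

    PipeOwningRegion(CountedRegion delegate) {
        this.delegate = delegate;
    }

    /** Same shape as the release() quoted above: close the pipe only if the delegate was freed. */
    boolean release() {
        if (delegate.release()) {
            pipeClosed = true; // stands in for closePipeIfNeeded()
            return true;
        }
        return false;
    }
}
```

With the extra retain, the single release only drops the count from 2 to 1, so the branch that closes the pipe is never taken. That matches the leak seen with io_uring.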
Proposed steps
- Make IoUringTransport#supportFileRegion return IoUring.isSpliceSupported().
- Avoid fileRegion.retain() in VertxConnection#sendFileRegion for IoUringTransport, otherwise the FD leak persists.
P.S. I monitored open FDs using:
watch -n 1 "lsof -p $PID -Fn | grep '^n' | wc -l"
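If running lsof is not convenient, the same number can probably be sampled from inside the JVM. The sketch below is an assumption about how one might do that, not something used in the investigation above. It relies on the JDK's com.sun.management.UnixOperatingSystemMXBean, which is only available on Unix-like platforms, and OpenFdProbe is a hypothetical helper name. The count covers all descriptors of the process, not only pipes.

```java
import com.sun.management.UnixOperatingSystemMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

final class OpenFdProbe {
    /** Returns the number of open file descriptors of this JVM, or -1 if the platform does not expose it. */
    static long openFdCount() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            return ((UnixOperatingSystemMXBean) os).getOpenFileDescriptorCount();
        }
        return -1L; // not a Unix-like JVM, or the bean is unavailable
    }

    public static void main(String[] args) {
        System.out.println("open fds: " + openFdCount());
    }
}
```

Logging this value before and after a batch of sendFile requests should show whether the count keeps growing.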