msquic
Linux datapath does not activate GRO although available
Describe the bug
When using the Linux datapath datapath_epoll.c on a system that supports both UDP GSO and GRO, the feature test in CxPlatDataPathCalculateFeatureSupport only activates send segmentation and falls back to "normal" recvmmsg for the receive side.
Hardcoding the support for receive coalescing in Datapath->Features shows that GRO does indeed work on the system and significantly improves achievable goodput (5.2 GBit/s vs. 8.3 GBit/s during an 8 GiB file transfer on the test system described below).
The problem seems to be that no UDP_GRO control message is received after sending a message with GSO to the loopback interface. This is possibly related to https://github.com/microsoft/msquic/issues/3865, where the test using the loopback interface also led to problems.
Affected OS
- [ ] Windows
- [X] Linux
- [ ] macOS
- [ ] Other (specify below)
Additional OS information
Debian 11 with Linux kernel 5.10.0-23-amd64
MsQuic version
v2.2
Steps taken to reproduce bug
- Use MsQuic's default choice for GSO/GRO support and evaluate performance during a file transfer
- Hardcode support for receive coalescing, e.g., add the following line here:
  Datapath->Features |= CXPLAT_DATAPATH_FEATURE_RECV_COALESCING;
- Repeat the measurement and compare performance
Expected behavior
MsQuic should activate receive coalescing in both cases, leading to no performance differences.
Actual outcome
Performance increases significantly when receive coalescing is enforced.
Additional details
Test system:
- Two machines running Debian 11 with Linux kernel 5.10.0-23-amd64
- NICs are Intel E810-C running on ICE driver v1.12.6
- GSO and GRO are activated through ethtool for both the link to the other node and lo
the feature test in CxPlatDataPathCalculateFeatureSupport only activates send segmentation and falls back to "normal" recvmmsg for the receive side.
What is failing in that function?
There is not really an error in the function, but it does not activate receive coalescing (on my test system) although the feature is supported. Thus, performance is worse than it could be.
I noticed that this line is not reached, so the test with the loopback interface seems to not generate the UDP_GRO control message MsQuic is searching for.
Without that, we cannot know how to segment a GRO receive. So if that's not getting delivered, we shouldn't use the feature.
I'll need your help to understand a bit more what's going on. Does GRO not work over loopback, but somehow work over your external NIC?
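For readers following along, here is a minimal sketch of what "searching for the UDP_GRO control message" can look like after a recvmsg call. It is not MsQuic's code; the helper name gro_segment_size is made up for illustration, and the fallback values for UDP_GRO and SOL_UDP are the ones from the Linux headers, defined here only in case the libc headers lack them. It assumes the kernel delivers the segment size as an int.

```c
#include <netinet/in.h>
#include <netinet/udp.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>   /* struct msghdr, CMSG_* macros */

#ifndef UDP_GRO
#define UDP_GRO 104       /* value from linux/udp.h; older headers may lack it */
#endif
#ifndef SOL_UDP
#define SOL_UDP 17
#endif

/* Hypothetical helper: returns the segment size carried in a UDP_GRO
 * control message, or 0 if the received message has none (in which case
 * the payload cannot be split and must be treated as one datagram). */
static int gro_segment_size(struct msghdr *msg)
{
    struct cmsghdr *c;
    for (c = CMSG_FIRSTHDR(msg); c != NULL; c = CMSG_NXTHDR(msg, c)) {
        if (c->cmsg_level == SOL_UDP && c->cmsg_type == UDP_GRO) {
            int size = 0;
            /* Assumption: the payload is an int; memcpy avoids alignment issues. */
            memcpy(&size, CMSG_DATA(c), sizeof(size));
            return size;
        }
    }
    return 0;
}
```

If this returns 0 for every message in the loopback test, the feature check has nothing to go on, which matches the behavior described above.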
Unfortunately, I'm not sure why the control message is not getting delivered in the loopback test. GRO does work with my external NIC and is also producing the correct control messages.
Do you have a suggestion on how to handle this scenario then? AFAIU, there isn't a good way to know without trying and failing if GRO is supported. That's why we have the current logic, but it assumed support was interface agnostic. So we test loopback on start up to see if things work. If that doesn't work, but some other interfaces do, I'm not sure what to do. We can't easily test an external NIC without a peer.
But support is not interface agnostic, is it? AFAIU, this was also the problem with this issue regarding GSO: https://github.com/microsoft/msquic/issues/3865
I have not tested this, but is the test for GRO really necessary? As long as the UDP_GRO macro is defined, the recv call should only coalesce messages if this is supported. Otherwise, it should only yield single messages.
A possible problem is, however, that then we can't receive multiple messages with recvmmsg, as we don't fall back.
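To make the "coalesce only if supported, otherwise single messages" idea concrete, here is a rough sketch of splitting one receive buffer into datagrams. The struct and function names are hypothetical and this is not how MsQuic does it. The assumption is that a segment size of 0 means no UDP_GRO control message was attached, so the whole buffer is one datagram.

```c
#include <stddef.h>

/* Hypothetical descriptor for one datagram inside a receive buffer. */
struct segment {
    size_t offset;
    size_t length;
};

/* Splits a receive of total_len bytes into datagrams of segment_size bytes
 * (the last one may be shorter). segment_size == 0 is treated as "not
 * coalesced": the whole buffer is reported as a single datagram.
 * Writes at most max_out entries and returns how many were written. */
static size_t split_gro_buffer(size_t total_len, size_t segment_size,
                               struct segment *out, size_t max_out)
{
    size_t count = 0;
    size_t off;
    if (total_len == 0 || max_out == 0) {
        return 0;
    }
    if (segment_size == 0) {
        segment_size = total_len;   /* fallback: one datagram */
    }
    for (off = 0; off < total_len && count < max_out; off += segment_size) {
        size_t remaining = total_len - off;
        out[count].offset = off;
        out[count].length = remaining < segment_size ? remaining : segment_size;
        count++;
    }
    return count;
}
```

With this shape, a 3000-byte read with a reported segment size of 1200 yields three datagrams (1200, 1200, 600 bytes), and a read without the control message passes through unchanged.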
I don't have a good solution, nor the ability to test all these different scenarios right now. Feel free to propose a PR if you have an idea on how to go forward, and we can let the automation try it out, and ask a few others to do so as well.
I won't be able to look into that in the next days, but I will let you know if I find a solution :+1:
In Tailscale we trust getsockopt() for UDP_GRO to confirm the kernel has support. If we don't get an error (EINVAL IIRC if a socket option is invalid) we proceed with setsockopt(). Technically we could condense that to just setsockopt() with no get, but broken apart is helpful for unrelated reasons. This is all to say, is the loopback testing necessary if it can lead to false negatives? I may be missing background/history on why loopback testing was chosen to start.
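A rough C sketch of that getsockopt()-then-setsockopt() probe, for illustration only. It is neither Tailscale's nor MsQuic's code, and probe_udp_gro is a made-up name. Any error is treated as "unsupported" rather than matching a specific errno, since I'm not certain which one older kernels return. The UDP_GRO and SOL_UDP fallback values are the Linux header values, defined only if the libc headers lack them.

```c
#include <netinet/in.h>
#include <netinet/udp.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef UDP_GRO
#define UDP_GRO 104       /* value from linux/udp.h; older headers may lack it */
#endif
#ifndef SOL_UDP
#define SOL_UDP 17
#endif

/* Hypothetical probe: returns 1 if the kernel accepts UDP_GRO on a UDP
 * socket, 0 if it rejects the option, -1 if no socket could be created. */
static int probe_udp_gro(void)
{
    int supported = 0;
    int value = 0;
    socklen_t len = sizeof(value);
    int fd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
    if (fd < 0) {
        return -1;
    }
    /* Step 1: does the kernel know the option at all? */
    if (getsockopt(fd, SOL_UDP, UDP_GRO, &value, &len) == 0) {
        /* Step 2: can it actually be enabled on this socket? */
        int on = 1;
        supported = (setsockopt(fd, SOL_UDP, UDP_GRO, &on, sizeof(on)) == 0);
    }
    close(fd);
    return supported;
}
```

Note that this only confirms the kernel understands the option. It says nothing about whether a given interface will actually coalesce, so the receive path still has to cope with reads that carry no UDP_GRO control message.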