msquic
Possibility for datagrams to be dropped when not able to send
Description
When adding a token bucket filter (TBF) and sending more data than the filter allows, I found that there was a lot of delay between sending and receiving, and that the delay kept growing the longer the transfer ran. After some discussion, it was established that msquic buffers datagrams when it can't send them immediately. With this PR, it is now possible to drop them instead. I added a flag to QUIC_SEND_FLAGS: "QUIC_SEND_FLAG_DGRAM_CANCEL_ON_BLOCKED". When this flag is set, the send path checks whether there are queued datagrams waiting to be sent and, if there are, drops (cancels) them.
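A minimal sketch of the behaviour described above, written as a standalone model rather than as msquic code. All `Sketch`/`SKETCH_` names, the queue layout, and the queue capacity are hypothetical and exist only for illustration; the only value borrowed from this PR is the flag bit (128, i.e. 0x0080). It assumes that datagrams queued with the flag are cancelled when the sender is blocked, while unflagged datagrams stay queued in order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for this sketch; these are not the msquic definitions. */
#define SKETCH_FLAG_NONE              0x0000u
#define SKETCH_FLAG_CANCEL_ON_BLOCKED 0x0080u
#define SKETCH_QUEUE_CAPACITY         16

typedef struct {
    uint32_t Id;     /* caller-chosen identifier */
    uint32_t Flags;  /* send flags supplied when the datagram was queued */
} SKETCH_DATAGRAM;

typedef struct {
    SKETCH_DATAGRAM Items[SKETCH_QUEUE_CAPACITY];
    size_t Count;     /* datagrams currently waiting to be sent */
    size_t Canceled;  /* running total of cancelled datagrams */
} SKETCH_QUEUE;

/* Append a datagram to the pending queue. Returns 0 if the queue is full. */
static int SketchEnqueue(SKETCH_QUEUE* Queue, uint32_t Id, uint32_t Flags)
{
    assert(Queue != NULL);
    if (Queue->Count == SKETCH_QUEUE_CAPACITY) {
        return 0;
    }
    Queue->Items[Queue->Count].Id = Id;
    Queue->Items[Queue->Count].Flags = Flags;
    Queue->Count++;
    return 1;
}

/*
 * Model of the "blocked" path: every queued datagram carrying the
 * cancel-on-blocked flag is dropped; the others keep their relative order.
 * Returns the number of datagrams dropped by this call.
 */
static size_t SketchCancelBlocked(SKETCH_QUEUE* Queue)
{
    size_t Kept = 0;
    size_t Dropped = 0;
    assert(Queue != NULL);
    for (size_t i = 0; i < Queue->Count; i++) {
        if (Queue->Items[i].Flags & SKETCH_FLAG_CANCEL_ON_BLOCKED) {
            Dropped++;
        } else {
            Queue->Items[Kept++] = Queue->Items[i];
        }
    }
    Queue->Count = Kept;
    Queue->Canceled += Dropped;
    return Dropped;
}
```

In this model an application would set the flag only on datagrams that lose their value when delayed (for example live media samples) and leave it off for datagrams that are still useful if they arrive late.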
Testing
Do any existing tests cover this change?
I don't think so.
Are new tests needed?
Maybe...
Documentation
Is there any documentation impact for this change?
Just the new flag : "QUIC_SEND_FLAG_DGRAM_CANCEL_ON_BLOCKED"
Some more info
These are the one-way delays of some datagrams sent through a TBF after sending more than the filter allows; see the images.
Without flag (before patch):
With flag:
The values have the form "{packet number}, {one-way delay in ms}". The values with the flag are about the same as those I get with UDP (which normally drops packets when it can't send them).
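To illustrate why the delay keeps growing without the flag, here is a small model under simplifying assumptions: the TBF is approximated by a link that can emit one datagram every `DrainIntervalMs` with no burst allowance, and the sender offers one datagram every `SendIntervalMs`. The function names and the numbers used with them are hypothetical; they are not measurements from this PR.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Queueing delay (ms) of the datagram at position `Index` (0-based) when
 * every datagram is buffered until the link is free (FIFO, never dropped).
 */
static uint64_t SketchBufferedDelayMs(
    uint64_t Index, uint64_t SendIntervalMs, uint64_t DrainIntervalMs)
{
    uint64_t LinkFreeAt = 0;
    uint64_t Delay = 0;
    assert(DrainIntervalMs > 0);
    for (uint64_t i = 0; i <= Index; i++) {
        uint64_t Arrival = i * SendIntervalMs;
        uint64_t Departure = Arrival > LinkFreeAt ? Arrival : LinkFreeAt;
        Delay = Departure - Arrival;
        LinkFreeAt = Departure + DrainIntervalMs;
    }
    return Delay;
}

/*
 * Number of the first `Count` datagrams that get through when a datagram is
 * dropped instead of waiting for a busy link. Delivered datagrams never
 * queue, so their queueing delay is zero in this model.
 */
static uint64_t SketchDeliveredWithDrop(
    uint64_t Count, uint64_t SendIntervalMs, uint64_t DrainIntervalMs)
{
    uint64_t LinkFreeAt = 0;
    uint64_t Delivered = 0;
    assert(DrainIntervalMs > 0);
    for (uint64_t i = 0; i < Count; i++) {
        uint64_t Arrival = i * SendIntervalMs;
        if (Arrival >= LinkFreeAt) {
            Delivered++;
            LinkFreeAt = Arrival + DrainIntervalMs;
        }
    }
    return Delivered;
}
```

With a 10 ms send interval and a 25 ms drain interval, the buffered delay in this model grows by 15 ms per datagram, whereas the drop variant delivers fewer datagrams but none of them waits in a queue.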
Codecov Report
Attention: Patch coverage is 90.90909% with 2 lines in your changes missing coverage. Please review.
Project coverage is 86.06%. Comparing base (9610803) to head (0090326). Report is 5 commits behind head on main.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/core/datagram.c | 90.47% | 2 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## main #4320 +/- ##
==========================================
+ Coverage 85.71% 86.06% +0.34%
==========================================
Files 56 56
Lines 17378 17400 +22
==========================================
+ Hits 14896 14975 +79
+ Misses 2482 2425 -57
Yeah, we will need to add some tests, as well as make the minor edits to the docs.
You will also need to run .\scripts\generate-dotnet.ps1 to update the .NET files.
The changes look good, but how should we add a test to verify them? One minimal option is to update spinquic.cpp to ensure it uses this new flag (it should just require changing one number). But ideally, it'd be nice to have a functional test case.
Maybe we should just queue more than we can send, i.e., call DatagramSend in a loop until we go past the MTU, which is 1500 over Ethernet or Wi-Fi (so maybe 2 datagrams of a thousand bytes, then multiple datagrams that we can drop) or, afaik, ~65000 for localhost (2 datagrams of 45,000 bytes, then multiple datagrams that we can drop)?
I don't know if this would work. Otherwise, we could create a new CI job that runs a QUIC server with special traffic control rules (i.e., a bandwidth limit) and checks that the packets we allowed to be dropped are dropped and the others are not. However, the output would differ from run to run and wouldn't give us a general picture.
I think a test that queues up more than 10 1000-byte datagrams before it starts a connection, and then calls connection start might trigger the discard notification.
Quick question on this PR: some tests have failed (because they took too long to run or because their connection was interrupted?). Is there anything more I need to do? For example, do I have to fix the failed tests? I saw that the same tests fail for everyone, so I'm not sure.
I'm not sure what happened, but it looks like you pushed a lot of unexpected new files.
Other than merging, I did nothing else :/
I think the problem is the msquicdocs folder shouldn't be in the main branch. Can you please remove that?
Can you please update the Rust bindings?
--- a/src/ffi/linux_bindings.rs
+++ b/src/ffi/linux_bindings.rs
@@ -390,6 +390,7 @@ pub const QUIC_SEND_FLAGS_QUIC_SEND_FLAG_DGRAM_PRIORITY: QUIC_SEND_FLAGS = 8;
pub const QUIC_SEND_FLAGS_QUIC_SEND_FLAG_DELAY_SEND: QUIC_SEND_FLAGS = 16;
pub const QUIC_SEND_FLAGS_QUIC_SEND_FLAG_CANCEL_ON_LOSS: QUIC_SEND_FLAGS = 32;
pub const QUIC_SEND_FLAGS_QUIC_SEND_FLAG_PRIORITY_WORK: QUIC_SEND_FLAGS = 64;
+pub const QUIC_SEND_FLAGS_QUIC_SEND_FLAG_CANCEL_ON_BLOCKED: QUIC_SEND_FLAGS = 128;
pub type QUIC_SEND_FLAGS = ::std::os::raw::c_uint;
pub const QUIC_DATAGRAM_SEND_STATE_QUIC_DATAGRAM_SEND_UNKNOWN: QUIC_DATAGRAM_SEND_STATE = 0;
pub const QUIC_DATAGRAM_SEND_STATE_QUIC_DATAGRAM_SEND_SENT: QUIC_DATAGRAM_SEND_STATE = 1;
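As a quick sanity check on the value chosen in the diff above, the following sketch mirrors the flag values shown there (8, 16, 32, 64, 128) under hypothetical `SKETCH_` names and checks that the new flag is a single bit that can be OR-ed with the existing ones. The helper functions are illustrative only and are not part of msquic or its bindings.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the bindings diff; the SKETCH_ names are hypothetical. */
enum {
    SKETCH_SEND_FLAG_DGRAM_PRIORITY    = 8,
    SKETCH_SEND_FLAG_DELAY_SEND        = 16,
    SKETCH_SEND_FLAG_CANCEL_ON_LOSS    = 32,
    SKETCH_SEND_FLAG_PRIORITY_WORK     = 64,
    SKETCH_SEND_FLAG_CANCEL_ON_BLOCKED = 128
};

/* True when exactly one bit is set, i.e. the value is usable as a bit flag. */
static int SketchIsSingleBit(uint32_t Flag)
{
    return Flag != 0 && (Flag & (Flag - 1)) == 0;
}

/* True when every bit of `Flag` is present in `Flags`. */
static int SketchHasFlag(uint32_t Flags, uint32_t Flag)
{
    return (Flags & Flag) == Flag;
}

/* Add one single-bit flag to a flag set. */
static uint32_t SketchCombine(uint32_t Flags, uint32_t Flag)
{
    assert(SketchIsSingleBit(Flag));
    return Flags | Flag;
}
```

Because 128 shares no bits with the other values listed in the diff, combining it with, say, the datagram priority flag leaves both independently testable.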
Sorry for the spam. I think I also needed to add the bindings to the win file (some tests were failing).