s2n-quic
Experiencing lower throughput the higher the latency
Problem:
As part of my thesis, I'm using a self-written perf client (available here), based on s2n-quic, with s2n-quic-qns as the server, in a testbed setup with a configurable bandwidth limit and link delay at a router. I experience a reduction in throughput the higher the configured delay, whereas MsQuic's secnetperf tool doesn't suffer this reduction.
While limited to 200 Mbit/s at 20 ms delay, I'm able to achieve ~190 Mbit/s throughput; at 250 ms delay, the throughput falls below 20 Mbit/s on average.
I'm also logging RecoveryMetrics, and the CWND behavior looks odd to me: it ramps up to an upper limit and then doesn't seem to exhibit the CUBIC behavior I would have expected.
Testbed
Note that the bandwidth limit is imposed on the path to the receiver, while the delay is applied in the direction from receiver to sender.
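For reference, a minimal sketch of how such an asymmetric setup can be configured with tc netem on the router (the interface names eth0/eth1 and the exact values are illustrative, not the precise testbed configuration):

```
# Towards the receiver: bandwidth limit only
tc qdisc replace dev eth0 root netem rate 200Mbit limit 1000

# Back towards the sender: delay only
tc qdisc replace dev eth1 root netem delay 250ms limit 10000
```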
Visualization 20ms
(Please disregard the empty graphs in some of the figures; they are only populated in some other diagram configurations if data was available.)
My client: [graph omitted]
MsQuic: [graph omitted]
Visualization 50ms
My client: [graph omitted]
MsQuic: [graph omitted]
Visualization 250ms
My client: [graph omitted]
MsQuic: [graph omitted]
Solution:
I don't have a solution for the issue yet. I'm looking for feedback on whether this is known / expected behavior at the moment, whether this is user error on my part, and how I can help troubleshoot it.
To rule out an error in my client, an official perf client implementation might be helpful.
- Does this change what s2n-quic sends over the wire? Probably not
- Does this change any public APIs? Probably not
Requirements / Acceptance Criteria:
- RFC links: None
- Related Issues: None
- Will the Usage Guide or other documentation need to be updated? Unknown
- Testing: A performance dashboard like MsQuic's WAN Perf would be nice
Thank you for the issue! I am able to reproduce this locally with tc qdisc, as well as with our Monte Carlo simulation. I'm going to look into it and figure out what's happening.
Thank you very much for looking into it :)
Sorry for the delayed response; it's been a busy couple of weeks.
So I actually think this behavior is to be expected. As RTT increases, the required amount of buffering on the sender also increases. We have mechanisms in place to prevent the sender from buffering too much and running out of memory, and they appear to be working as intended. In the presence of buffering limits, the relationship between throughput and RTT ends up fitting a throughput = RTT^-2 curve: [graph omitted]
This means every time RTT doubles, throughput is halved. It's also good to keep in mind that all of the queues along the network path need to grow in order to sustain throughput at higher RTT values. In fact, the netem documentation calls this out:
Rate control
The netem qdisc can control bandwidth using the rate feature. Be careful to ensure that the netem qdisc limit is large enough to include packets in both the emulated link and the emulated buffer:
```
DELAY_MS=40
RATE_MBIT=10
BUF_PKTS=33
BDP_BYTES=$(echo "($DELAY_MS/1000.0)*($RATE_MBIT*1000000.0/8.0)" | bc -q -l)
BDP_PKTS=$(echo "$BDP_BYTES/1500" | bc -q)
LIMIT_PKTS=$(echo "$BDP_PKTS+$BUF_PKTS" | bc -q)
tc qdisc replace dev eth0 root netem delay ${DELAY_MS}ms rate ${RATE_MBIT}Mbit limit ${LIMIT_PKTS}
```
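Applying the same arithmetic to the scenario in this issue (200 Mbit/s at 250 ms delay; 1500-byte packets assumed, as in the example above) suggests roughly how large the netem limit has to be before the queue itself stops being the bottleneck:

```
# BDP = rate * delay = (200e6 / 8) B/s * 0.25 s = 6.25 MB ≈ 4167 full-size packets
DELAY_MS=250
RATE_MBIT=200
echo "(($DELAY_MS/1000)*($RATE_MBIT*1000000/8))/1500" | bc -l   # ≈ 4166.67 packets
```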
In #1345, I added a perf client implementation to s2n-quic-qns with the option of specifying these buffer limits: https://github.com/aws/s2n-quic/blob/d6d144675f0237928690fff20ad9ef57547c36fc/quic/s2n-quic-qns/src/perf.rs#L101-L109
These values can be increased, which should remove the buffering limits as the bottleneck in your scenario.
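For example (an illustrative, abridged invocation; the flags and values mirror the commands used later in this thread, where --max-throughput is in Mbit/s and --expected-rtt in ms):

```
# ~10x headroom over the actual 200 Mbit/s / 250 ms link, so the
# internal buffer limits should no longer be the bottleneck
s2n-quic-qns perf client --ip 10.0.2.1 --port 1337 \
    --max-throughput 2000 --expected-rtt 2500
```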
Thanks for the detailed reply, it's been pretty busy here as well. I'll probably run the experiments again next week with your new client and report back :)
Sorry for the delay, I was busy with other areas of my thesis before getting back to experiments.
I have now re-run the experiments with s2n-quic-qns (built from the v1.8.0 tag).
At an RTT of 250 ms, throughput has improved over the previous experiment results, but still doesn't reach the level of msquic (~110 Mbit/s vs ~175 Mbit/s), even if the s2n-quic-qns limits are set to heavily oversized values (2000 Mbit/s, 2500 ms RTT for a 200 Mbit/s, 250 ms RTT link).
The queue limits are only reached in the msquic case, so I assume they don't limit the performance of s2n-quic.
The following data is from a run with the aforementioned oversized limit values: 200mbit, limit 1000 on the link towards the receiver; 200ms, limit 10000 on the link back towards the sender.
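As a rough sanity check (assuming the 3.75MB column in the sender output below is bytes in flight): 3.75 MB per ~260 ms round trip works out to roughly 14.4 MB/s ≈ 115 Mbit/s, which is in the same ballpark as the ~108 Mbit/s reported, so the sender seems limited by how much data it keeps in flight rather than by the emulated link.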
Graphs s2n-quic-qns: [graphs omitted]
Graphs secnetperf (msquic): [graphs omitted]
Output s2n-quic-qns
Sender
Command
s2n-quic/target/release/s2n-quic-qns perf client --ip 10.0.2.1 --port 1337 --ca ~/certs/echo.test.crt --server-name echo.test --send 10000000000 --stats --max-throughput 2000 --expected-rtt 2500
Output
650.68Kbps 0bps 52.42KB 51.52KB 0 29 421.818ms 333ms 333ms 0
11.83Mbps 0bps 813.44KB 812.54KB 0 275 272.407ms 277.402ms 267.225051ms 0
92.70Mbps 0bps 7.62MB 3.81MB 0 1621 266.747ms 276.2ms 257.789101ms 0
108.29Mbps 0bps 7.62MB 3.75MB 0 2131 276.886ms 264.502ms 259.277586ms 0
108.30Mbps 0bps 7.62MB 3.75MB 0 2116 9.803ms 267.049ms 259.330847ms 0
108.28Mbps 0bps 7.62MB 3.75MB 0 2106 9.951ms 267.096ms 259.545684ms 0
108.31Mbps 0bps 7.62MB 3.75MB 0 2116 9.899ms 266.884ms 259.382665ms 0
108.29Mbps 0bps 7.62MB 3.75MB 0 2116 9.906ms 266.977ms 259.621258ms 0
108.29Mbps 0bps 7.62MB 3.75MB 0 2106 12.342ms 266.925ms 259.528944ms 0
108.30Mbps 0bps 7.62MB 3.75MB 0 2109 15.422ms 268.166ms 259.648085ms 0
108.37Mbps 0bps 7.62MB 3.75MB 0 2110 18.966ms 269.139ms 259.776501ms 0
107.32Mbps 0bps 7.62MB 3.75MB 0 2123 1.109ms 269.217ms 259.85786ms 0
106.26Mbps 0bps 7.62MB 3.75MB 0 2111 269.576ms 262.501ms 258.922982ms 0
108.29Mbps 0bps 7.62MB 3.75MB 0 2126 269.057ms 263.521ms 259.022261ms 0
108.28Mbps 0bps 7.62MB 3.75MB 0 2121 268.686ms 264.604ms 259.219563ms 0
108.30Mbps 0bps 7.62MB 3.75MB 0 2113 268.361ms 266.632ms 259.380142ms 0
108.31Mbps 0bps 7.62MB 3.75MB 0 2129 279.952ms 270.326ms 259.98938ms 0
108.30Mbps 0bps 7.62MB 3.75MB 0 2123 268.996ms 266.515ms 259.4332ms 0
108.30Mbps 0bps 7.62MB 3.75MB 0 2124 269.82ms 261.338ms 258.978134ms 0
108.29Mbps 0bps 7.62MB 3.75MB 0 2120 12.822ms 261.18ms 258.927295ms 0
108.30Mbps 0bps 7.62MB 3.75MB 0 2133 12.75ms 261.214ms 258.968515ms 0
108.29Mbps 0bps 7.62MB 3.75MB 0 2119 12.869ms 262.213ms 258.963896ms 0
108.27Mbps 0bps 7.62MB 3.75MB 0 2122 12.818ms 262.543ms 259.049482ms 0
108.29Mbps 0bps 7.62MB 3.75MB 0 2122 17.127ms 262.563ms 259.024144ms 0
108.36Mbps 0bps 7.62MB 3.75MB 0 2113 20.163ms 262.647ms 259.205205ms 0
108.29Mbps 0bps 7.62MB 3.75MB 0 2120 22.55ms 263.646ms 259.104162ms 0
108.29Mbps 0bps 7.62MB 3.75MB 0 2112 22.495ms 264.64ms 259.178687ms 0
107.40Mbps 0bps 7.62MB 3.75MB 0 2119 1.095ms 264.672ms 259.174221ms 0
107.06Mbps 0bps 7.62MB 3.75MB 0 2128 1.006ms 266.127ms 259.583296ms 0
107.37Mbps 0bps 7.62MB 3.75MB 0 2123 7.849ms 265.966ms 259.546613ms 0
108.30Mbps 0bps 7.62MB 3.75MB 0 2131 9.869ms 267.181ms 259.454615ms 0
108.28Mbps 0bps 7.62MB 3.75MB 0 2127 12.688ms 267.013ms 259.572776ms 0
108.29Mbps 0bps 7.62MB 3.75MB 0 2132 15.665ms 268.079ms 259.656536ms 0
107.90Mbps 0bps 7.62MB 3.75MB 0 2107 1.11ms 268.666ms 259.687545ms 0
106.81Mbps 0bps 7.62MB 3.75MB 0 2113 1.167ms 269.025ms 259.686919ms 0
107.14Mbps 0bps 7.62MB 3.75MB 0 2129 282.42ms 266.583ms 259.29443ms 0
108.29Mbps 0bps 7.62MB 3.75MB 0 2128 2.532ms 267.186ms 259.496406ms 0
108.30Mbps 0bps 7.62MB 3.75MB 0 2103 5.763ms 268.062ms 259.463661ms 0
108.30Mbps 0bps 7.62MB 3.75MB 0 2127 9.227ms 268.157ms 259.609704ms 0
108.27Mbps 0bps 7.62MB 3.75MB 0 2130 11.94ms 268.137ms 259.673189ms 0
107.05Mbps 0bps 7.62MB 3.75MB 0 2112 1.153ms 273.378ms 260.297187ms 0
106.52Mbps 0bps 7.62MB 3.75MB 0 2118 18.318ms 273.446ms 260.370063ms 0
108.38Mbps 0bps 7.62MB 3.75MB 0 2108 21.338ms 273.513ms 260.185845ms 0
108.28Mbps 0bps 7.62MB 3.75MB 0 2132 24.981ms 274.33ms 260.444256ms 0
108.29Mbps 0bps 7.62MB 3.75MB 0 2127 3.099ms 273.826ms 260.529012ms 0
107.03Mbps 0bps 7.62MB 3.75MB 0 2123 1.067ms 273.888ms 260.357198ms 0
106.53Mbps 0bps 7.62MB 3.75MB 0 2120 9.379ms 273.935ms 260.410432ms 0
108.29Mbps 0bps 7.62MB 3.75MB 0 2122 12.367ms 273.335ms 260.414502ms 0
108.28Mbps 0bps 7.62MB 3.75MB 0 2129 15.96ms 274.824ms 260.59376ms 0
108.30Mbps 0bps 7.62MB 3.75MB 0 2109 19.031ms 274.796ms 260.606268ms 0
107.27Mbps 0bps 7.62MB 3.75MB 0 2125 1.097ms 273.373ms 260.357615ms 0
107.04Mbps 0bps 7.62MB 3.75MB 0 2128 1.123ms 273.837ms 260.445559ms 0
107.53Mbps 0bps 7.62MB 3.75MB 0 2140 282.303ms 273.825ms 260.38255ms 0
108.28Mbps 0bps 7.62MB 3.75MB 0 2134 282.731ms 273.353ms 260.418966ms 0
108.30Mbps 0bps 7.62MB 3.75MB 0 2129 283.849ms 274.32ms 260.434611ms 0
108.29Mbps 0bps 7.62MB 3.75MB 0 2129 4.036ms 274.869ms 260.575592ms 0
108.29Mbps 0bps 7.62MB 3.75MB 0 2131 7.095ms 273.382ms 260.40819ms 0
108.28Mbps 0bps 7.62MB 3.75MB 0 2119 10.761ms 273.357ms 260.336991ms 0
107.86Mbps 0bps 7.62MB 3.75MB 0 2129 1.109ms 273.336ms 260.373043ms 0
106.92Mbps 0bps 7.62MB 3.75MB 0 2109 1.09ms 273.832ms 260.493883ms 0
Receiver
Command
s2n-quic/target/release/s2n-quic-qns perf server --port 1337 --certificate ~/certs/echo.test.crt --private-key ~/certs/echo.test.key --stats --max-throughput 2000 --expected-rtt 2500
Output
Server listening on port 1337
0bps 0bps 0B 0B 0 0 0ns 0ns 0ns 0
0bps 0bps 0B 0B 0 1 0ns 0ns 0ns 0
0bps 0bps 0B 0B 0 0 0ns 0ns 0ns 0
0bps 650.63Kbps 14.52KB 2.35KB 0 96 24.906ms 333ms 333ms 0
0bps 10.14Mbps 14.52KB 2.70KB 0 1544 14.041ms 254.675ms 251.738763ms 0
0bps 85.98Mbps 28.25KB 14.15KB 0 14958 12.933ms 268.7ms 255.164378ms 0
0bps 107.79Mbps 28.25KB 14.04KB 0 18889 15.969ms 260.641ms 255.222235ms 0
0bps 108.18Mbps 28.25KB 14.07KB 0 18947 19.78ms 262.614ms 255.440541ms 0
0bps 107.91Mbps 28.25KB 14.02KB 0 18918 22.185ms 262.698ms 255.31445ms 0
0bps 107.78Mbps 28.25KB 14.02KB 0 17885 1.358ms 261.591ms 255.246114ms 0
0bps 107.76Mbps 28.25KB 13.97KB 0 18767 4.661ms 261.685ms 255.381599ms 0
0bps 107.82Mbps 28.25KB 14.02KB 0 18876 7.757ms 262.617ms 255.416375ms 0
0bps 107.83Mbps 28.30KB 14.12KB 0 17931 11.072ms 261.622ms 255.508324ms 0
0bps 107.76Mbps 28.30KB 14.12KB 0 18801 14.539ms 261.752ms 255.270263ms 0
0bps 108.28Mbps 28.30KB 14.03KB 0 19009 18.879ms 260.568ms 255.292108ms 0
0bps 108.29Mbps 28.30KB 14.02KB 0 19002 20.459ms 260.46ms 254.957184ms 0
0bps 107.89Mbps 28.30KB 14.07KB 0 18910 23.096ms 260.694ms 255.320609ms 0
0bps 107.78Mbps 28.30KB 14.07KB 0 18897 2.223ms 260.509ms 255.273123ms 0
0bps 107.78Mbps 28.30KB 14.02KB 0 18012 5.393ms 260.535ms 255.303699ms 0
0bps 107.77Mbps 28.30KB 14.02KB 0 17925 8.719ms 260.55ms 254.949658ms 0
0bps 107.74Mbps 28.30KB 14.02KB 0 17920 12.057ms 263.569ms 255.566237ms 0
0bps 107.82Mbps 28.30KB 14.02KB 0 18471 19.704ms 263.562ms 255.355214ms 0
0bps 108.30Mbps 28.30KB 13.97KB 0 19015 19.823ms 263.656ms 255.364806ms 0
0bps 108.07Mbps 28.30KB 14.08KB 0 18981 10.313ms 263.73ms 255.576281ms 0
0bps 107.72Mbps 28.30KB 13.97KB 0 18929 13.241ms 263.796ms 255.461171ms 0
0bps 107.72Mbps 28.30KB 14.02KB 0 18938 16.328ms 260.792ms 254.991823ms 0
0bps 107.70Mbps 28.30KB 13.97KB 0 18551 19.365ms 260.607ms 255.097139ms 0
0bps 107.67Mbps 28.30KB 14.07KB 0 17969 23.036ms 260.514ms 255.101772ms 0
0bps 107.89Mbps 28.30KB 14.02KB 0 18964 287.217ms 259.99ms 254.712051ms 0
0bps 108.28Mbps 28.30KB 13.91KB 0 19036 284.647ms 260.454ms 254.938803ms 0
0bps 108.29Mbps 28.30KB 13.97KB 0 19033 284.423ms 260.782ms 254.934796ms 0
0bps 108.28Mbps 28.30KB 13.91KB 0 19035 286.663ms 260.399ms 254.859352ms 0
0bps 108.29Mbps 28.30KB 13.92KB 0 19030 1.271ms 260.605ms 254.797204ms 0
0bps 107.69Mbps 28.30KB 13.97KB 0 18071 4.373ms 260.456ms 255.081798ms 0
0bps 107.70Mbps 28.30KB 13.91KB 0 18909 7.575ms 260.458ms 255.018086ms 0
0bps 107.69Mbps 28.30KB 13.91KB 0 17936 10.727ms 260.68ms 255.015946ms 0
0bps 107.69Mbps 28.30KB 13.91KB 0 18717 13.888ms 260.854ms 254.968045ms 0
0bps 107.69Mbps 28.30KB 13.97KB 0 18897 17.076ms 260.655ms 254.749747ms 0
0bps 107.70Mbps 28.30KB 13.91KB 0 18905 20.169ms 260.489ms 254.886209ms 0
0bps 107.72Mbps 28.30KB 13.91KB 0 18927 23.162ms 260.648ms 255.179399ms 0
0bps 107.70Mbps 28.30KB 13.86KB 0 17964 2.168ms 260.717ms 255.189869ms 0
0bps 107.69Mbps 28.30KB 13.91KB 0 17967 5.52ms 260.996ms 254.940345ms 0
0bps 107.69Mbps 28.30KB 13.91KB 0 18150 8.706ms 260.409ms 254.920119ms 0
0bps 107.69Mbps 28.30KB 13.91KB 0 18921 11.849ms 260.558ms 254.823291ms 0
0bps 107.69Mbps 28.30KB 13.86KB 0 18476 14.946ms 260.478ms 254.940926ms 0
0bps 107.70Mbps 28.30KB 14.07KB 0 18460 18.594ms 258.445ms 254.533498ms 0
0bps 107.72Mbps 28.30KB 13.97KB 0 18447 21.623ms 260.524ms 254.971548ms 0
0bps 107.68Mbps 28.30KB 13.97KB 0 18479 24.797ms 260.551ms 255.183226ms 0
0bps 108.23Mbps 28.30KB 13.91KB 0 18601 278.595ms 260.515ms 254.955696ms 0
0bps 108.29Mbps 28.30KB 13.97KB 0 18620 281.691ms 260.455ms 255.06452ms 0
0bps 108.29Mbps 28.30KB 13.97KB 0 18619 276.68ms 258.319ms 254.771268ms 0
0bps 108.29Mbps 28.30KB 13.97KB 0 18604 280.138ms 260.58ms 254.984038ms 0
0bps 108.28Mbps 28.30KB 13.97KB 0 18597 278.196ms 260.463ms 254.905692ms 0
0bps 108.28Mbps 28.30KB 13.91KB 0 18632 278.822ms 258.309ms 254.543648ms 0
0bps 108.29Mbps 28.30KB 13.97KB 0 18646 276.906ms 258.244ms 254.573324ms 0
0bps 108.29Mbps 28.30KB 13.91KB 0 18112 281.774ms 260.516ms 255.507789ms 0
0bps 108.29Mbps 28.30KB 13.97KB 0 18033 279.908ms 260.491ms 255.234825ms 0
0bps 108.28Mbps 28.30KB 13.91KB 0 18022 278.364ms 260.446ms 254.973469ms 0
0bps 108.29Mbps 28.30KB 13.97KB 0 18033 279.689ms 260.58ms 254.938479ms 0
0bps 108.29Mbps 28.30KB 13.91KB 0 18482 275.593ms 260.446ms 255.049725ms 0
0bps 108.29Mbps 28.30KB 13.97KB 0 19005 278.427ms 260.454ms 254.907345ms 0
0bps 108.29Mbps 28.30KB 13.91KB 0 19022 278.565ms 260.353ms 255.011382ms 0
0bps 108.28Mbps 28.30KB 13.97KB 0 19011 9.133ms 260.523ms 255.004752ms 0
Output secnetperf (msquic)
Sender
Command
msquic/artifacts/bin/linux/x64_Release_openssl/secnetperf -test:tput -maxruntime:60000 -target:10.0.2.1 -upload:100000000000
Output
Started!
Result: 1318912000 bytes @ 175854 kbps (60000.217 ms).
Warning: Did not complete all bytes (sent: 1335951360, completed: 1318912000).
App Main returning status 0
Receiver
Command
msquic/artifacts/bin/linux/x64_Release_openssl/secnetperf -bind:10.0.2.1:4433
Output
Started!
It looks like the limits weren't being applied to the remote value in the qns configuration. It should be a lot better after #1447 is merged.
#1447 has been merged, let us know how things look!