neqo
Better algorithm for stream flow control
What we have now is whatever was fastest to implement, rather than a fully thought-out design. We should improve our algorithm when possible.
from https://github.com/mozilla/neqo/pull/726#discussion_r440584070:
We have an estimate of BDP we might use to tune this. We should aim for BDP*2 at a minimum if we have an algorithm like this, otherwise we will need more frequent sending of updates.
Status quo neqo
Static connection flow-control limit:
https://github.com/mozilla/neqo/blob/4fc4d16744ede5b099dd5bb4bfd5922d07deb046/neqo-transport/src/connection/params.rs#L21
Static stream receive window limit:
https://github.com/mozilla/neqo/blob/c004359a817ffdea33394e94944d2f882e7e78af/neqo-transport/src/recv_stream.rs#L33
Static stream send window limit:
https://github.com/mozilla/neqo/blob/c004359a817ffdea33394e94944d2f882e7e78af/neqo-transport/src/send_stream.rs#L36
Other implementations
- Chromium:
- limits: https://github.com/chromium/chromium/blob/904ed34564cd5482d88cfc96abf7187a1455fa16/net/quic/quic_context.cc#L20-L22
- Google Quiche:
- auto-tuning: https://github.com/google/quiche/blob/64cdc52a285d82dfd8756e0e30d15c2d08df6081/quiche/quic/core/quic_flow_controller.cc#L122-L186
- limits: https://github.com/google/quiche/blob/792c6e7d8658d7b7b27f780efec71c67930b4c9f/quiche/quic/core/quic_constants.h#L82-L86
- quic-go:
- auto-tuning: https://github.com/quic-go/quic-go/blob/e48e1d465dffafc1cfd55dc6a4e4642087dcd5eb/internal/flowcontrol/base_flow_controller.go#L95-L118
- limits: https://github.com/quic-go/quic-go/blob/b096e94092e97b998e396c5c67735b97e403edd3/interface.go#L291-L294
Potential solution
Implement auto-tuning based on Google's QUIC design document:
Algorithm
- As above, a flow control window update is triggered when: available window < max window size / 2, where available window = max receive window offset - bytes consumed
- To realize auto-tuning, add the following logic just before issuing a window update:
  - Keep track of the time interval between subsequent window updates; call this since_last_update.
  - if (since_last_update < RTT * trigger factor) then max window size = MIN(max window size * increase factor, upper bound)
    - trigger factor is 2
    - increase factor is 2
- As above, the window update sets:
  max receive window offset += (max window size - available window)
https://docs.google.com/document/d/1F2YfdDXKpy20WVKJueEf4abn_LVZHhMUMS5gX6Pgjl4/edit#heading=h.j3nq4bn2eaud
Used in quic-go.
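The steps above can be sketched in Rust roughly as follows. This is an illustrative sketch of the design-document algorithm, not neqo's `fc.rs`; the struct and field names are made up for this example:

```rust
use std::time::{Duration, Instant};

/// Illustrative receiver-side flow-control state with auto-tuning,
/// following the Google QUIC design document (names are hypothetical).
struct AutoTunedWindow {
    max_window_size: u64,         // current max window size
    upper_bound: u64,             // hard cap on max window size
    max_recv_offset: u64,         // max receive window offset advertised so far
    bytes_consumed: u64,          // bytes the application has consumed
    last_update: Option<Instant>, // time of the previous window update
}

impl AutoTunedWindow {
    const TRIGGER_FACTOR: u32 = 2;
    const INCREASE_FACTOR: u64 = 2;

    /// Returns a new max receive window offset to advertise, if an update is due.
    fn maybe_update(&mut self, now: Instant, rtt: Duration) -> Option<u64> {
        let available = self.max_recv_offset - self.bytes_consumed;
        // Update is triggered when less than half the window remains.
        if available >= self.max_window_size / 2 {
            return None;
        }
        // Auto-tuning: if updates are issued faster than TRIGGER_FACTOR * RTT,
        // the window is too small for the path's BDP, so grow it (capped).
        if let Some(last) = self.last_update {
            if now.duration_since(last) < rtt * Self::TRIGGER_FACTOR {
                self.max_window_size =
                    (self.max_window_size * Self::INCREASE_FACTOR).min(self.upper_bound);
            }
        }
        self.last_update = Some(now);
        // Advance the offset so a full max_window_size is available again.
        self.max_recv_offset += self.max_window_size - available;
        Some(self.max_recv_offset)
    }
}
```

Note that the doubling only kicks in from the second update onward, since the first update has no prior timestamp to compare against.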
I also implemented this in the past for a multiplexer on top of TCP: https://github.com/libp2p/rust-yamux/pull/176.
Alternatives
More research needed. Pointers very much appreciated.
I believe that a better approach would be not to double the window, but to increase the window by an amount equal to the amount consumed since the last increase, if the time taken to consume the window is less than one RTT (increased by a small fudge factor to allow slack for reporting delays). This is similar to the increase in Reno congestion control. The effect then would be that the window increases less rapidly and maybe more often, but it would end up with a commitment that is closer to a 2x factor of BDP, rather than being 4x or more.
> I believe that a better approach would be not to double the window, but to increase the window by an amount equal to the amount consumed since the last increase if the time taken to consume the window is less than one RTT
Given that the premise ("if the time taken to consume the window is less than one RTT") requires the window to be fully consumed, the "amount consumed" will equal the window, so the increase will equal the window, and thus the new window will be twice as large as the current one. How is that different from doubling the window as proposed above?
I might be confusing (a) the increase of the available window and (b) the increase of the maximum window size above.
@martinthomson would you mind rephrasing your proposal above, with the terminology used in fc.rs? For what it is worth, here is the implementation of the Google QUIC algorithm:
https://github.com/mozilla/neqo/blob/0ad1b77f9483568570c99a04fbdb4329ff6f50c9/neqo-transport/src/fc.rs#L393-L405
Sure. The goal is to increase the value of `self.max_active` based on perceived changes in the BDP, as observed by the throughput on the affected stream. Therefore, I suggest that if the rate at which `self.retired` increases (that is, the change in that value, divided by the time elapsed) exceeds some function of `self.max_active / path.rtt`, then we can increase `self.max_active` by the amount that `self.retired` has increased.

This will approximately lead to a doubling, as `self.max_active` determines how much data can be outstanding, but that doubling might happen on a shorter timescale than an RTT, meaning that rate increases could exceed one doubling per RTT. The same is true of what you proposed as well, just that this version will cap out at a lower value for `self.max_active`.
Note that we could increase only based on the amount by which the increase exceeds previous expectations, which would lead to a closer adaptation, but would be more reliant on ACK rate than the simple scheme.
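If I follow the proposal, it could be sketched roughly like this. The field names `max_active` and `retired` mirror fc.rs, but everything else (the struct, `max_allowed`, the bookkeeping fields, and the exact threshold) is my hypothetical reading, not an actual implementation:

```rust
use std::time::{Duration, Instant};

/// Hypothetical sketch of the Reno-like increase described above.
struct ReceiverFc {
    max_active: u64,             // window size we are willing to advertise
    max_allowed: u64,            // hard cap on max_active
    retired: u64,                // total bytes consumed by the application
    retired_at_last_check: u64,  // snapshot of `retired` at the last check
    last_check: Instant,         // time of the last check
}

impl ReceiverFc {
    /// Called just before a window update is issued.
    fn maybe_grow(&mut self, now: Instant, rtt: Duration) {
        let consumed = self.retired - self.retired_at_last_check;
        let elapsed = now.duration_since(self.last_check);
        // Rate at which `retired` has increased, in bytes per second.
        let rate = consumed as f64 / elapsed.as_secs_f64();
        // Threshold: one full window per RTT. A small fudge factor could
        // relax this to allow slack for reporting delays.
        let threshold = self.max_active as f64 / rtt.as_secs_f64();
        if rate > threshold {
            // Grow by the amount consumed since the last check (capped),
            // rather than doubling outright.
            self.max_active = (self.max_active + consumed).min(self.max_allowed);
        }
        self.retired_at_last_check = self.retired;
        self.last_check = now;
    }
}
```

When the window is the bottleneck, `consumed` over one check interval approaches `max_active`, so the increase still approximates a doubling per interval, just anchored to what was actually consumed.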