neqo Implement RACK variation for QUIC

Highly WIP: Early Draft, already working, but not necessary final version. Not necessary to read through yet, only if you are interested. Submitting already, because I probably only continue next week.

This is a modified version of RACK for QUIC. The high level is working, but the algorithm differs a bit from the RFC due to more information available in QUIC than TCP.

What it does:

when packet reordering is detected: adjust the reo_wnd_mult to high enough to not detect the same amount of packet reordering as packet loss next time
after a loss was detected reduce when entering CongestionAvoidance reduce reo_wnd_persist by one. and set reo_wnd_mult to the next non-zero entry of reo_wnd_mult.

I am still considering to make the implementation "dumper" and maybe simpler/more readable by following the RFC more closely.

reo_wnd_mult += 1 should only be called once per RTT -> need state for that

Still TODO:

proper variables names (not those from the RFC)
not sure if returning bool from on_packets_acked is fine
debug prints instead of println for print statements that should stay

Nov 02 '23 18:11 mb

Differences to TCP RACK (of next uploaded version):

initial timeout at 9/8 RTT instead of 5/4 RTT to keep the
reorder_window_mult adds fractions of 1/8 RTT instead of 1/4 RTT for no particular reason (except that it make the code more consistent to the initial timeout of 9/8 RTT
out of order packets, that weren't causing spurious retransmits still reset the reorder_window_persist to 16. TCP doesn't have the necessary information to do this
reorder_window_mult is set high enough to prevent a spurious retransmit next time instead of just increasing by one. Can be done, because we have more RTT estimates for packets, that would have been spuriously retransmitted in TCP.

Nov 14 '23 16:11 mb

Tagging @larseggert for thoughts. (Looks like there is a rebase to manage here.)

Nov 20 '23 23:11 martinthomson

RACK and/or other CC improvements are attractive areas of work, but I think we first need to be sure that the underlying machinery (timers, buffers, loss recovery logic, etc.) is working as intended, and that is probably easiest with a standard RFC9002 implementation. We also need a robust performance testing setup, because otherwise we won't really be able to quantify the improvements that proposed changes like this one will bring.

In other words, I think we should put this and similar improvements on the back burner and first get the plumbing fixed and performance scaffolding done.

Nov 21 '23 07:11 larseggert

neqo neqo copied to clipboard

Implement RACK variation for QUIC

neqo
neqo copied to clipboard