LibreQoS
ack-filtering and diffserv
We've done a limited deployment of ack-filter in both directions, at a cost of about 3% CPU. It's difficult to observe an actual difference in network behavior from doing this, but overall packet drop rates increased to nearly 1%, and that doesn't count what the CPE was already dropping. If anyone has suggestions for how to evaluate network behavior with ack-filter on versus off, let me know.
Tracking meaningful drops then becomes necessary, as ack_drops, congestive drops, and ECN marks are quite different things. In OpenWrt we ended up using an inverse log scale to show these, since ack_drops is easily two orders of magnitude bigger than drops, and marks are (currently) two orders of magnitude smaller than drops. I am quite surprised at how much RFC3168 marking is going on.
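One way to keep counters that differ by orders of magnitude visible on the same chart is a log transform. A minimal sketch of the idea (not the actual OpenWrt graphing code; the function name and bar height are illustrative):

```python
import math

def log_scale(count, max_count, height=100):
    """Map a counter onto a 0..height bar using a log scale, so values
    spanning several orders of magnitude (ack_drops vs. marks) remain
    visible on the same chart."""
    if count <= 0 or max_count <= 1:
        return 0
    return round(height * math.log1p(count) / math.log1p(max_count))

# Counters of very different magnitude, as in the stats below:
stats = {"ack_drop": 100673, "drops": 1656, "marks": 17}
peak = max(stats.values())
for name, count in stats.items():
    print(f"{name:>8}: {log_scale(count, peak)}")
```

On a linear scale the 17 marks would be invisible next to 100673 ack_drops; on the log scale all three bars remain distinguishable.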
Historically I thought ack-drops would only be useful on the uplink of a highly asymmetric connection: https://blog.cerowrt.org/post/ack_filtering/ - but after working out how htb + cake does things, we can easily accrue enough duplicate packets in both directions to be worth dropping some, and for all I know that helps a bit with packet-limited FIFOs elsewhere.
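The core observation behind ack-filtering can be sketched simply: a later pure ACK with a higher cumulative acknowledgment number makes an earlier queued one redundant. A simplified illustration, not cake's actual implementation (the real filter also has to respect SACK blocks, ECE/window updates, and sequence-number wraparound in more detail):

```python
def seq_after(a, b):
    """True if 32-bit sequence number a is strictly after b,
    handling wraparound (serial-number arithmetic)."""
    return ((a - b) & 0xFFFFFFFF) < 0x80000000 and a != b

def filter_redundant_acks(queued_acks):
    """Given pure ACKs queued for one flow (oldest first), keep only those
    not superseded by a later ACK with an equal-or-higher cumulative
    ack number."""
    kept = []
    for i, ack in enumerate(queued_acks):
        superseded = any(seq_after(later, ack) or later == ack
                         for later in queued_acks[i + 1:])
        if not superseded:
            kept.append(ack)
    return kept

# Older and duplicate ACKs behind a newer cumulative ACK get dropped:
print(filter_redundant_acks([1000, 2000, 2000, 3000]))  # -> [3000]
```

This is why the savings show up in both directions once enough ACKs accumulate in a queue: each burst of queued ACKs collapses to the newest one.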
In this example there's also a surprising number of packets marked for the voice class. (Observing backlog and drops there would certainly be good, and I would worry if marks and/or ack_drops started showing up there.) Note that for a wifi wireless ISP I generally recommend putting everything into the best effort class (WMM on, but not paying attention to diffserv markings); in the home, whether to respect the markings depends on the gear. Is anyone using diffserv effectively?
qdisc cake c7ee: parent 3:13d7 bandwidth unlimited diffserv4 triple-isolate nonat nowash ack-filter split-gso rtt 100ms raw overhead 0
Sent 717984865 bytes 3542201 pkt (dropped 102329, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
memory used: 292608b of 15140Kb
capacity estimate: 0bit
min/max network layer size: 60 / 1494
min/max overhead-adjusted size: 60 / 1494
average network hdr offset: 14
Bulk Best Effort Video Voice
thresh 0bit 0bit 0bit 0bit
target 5ms 5ms 5ms 5ms
interval 100ms 100ms 100ms 100ms
pk_delay 109us 8us 1us 9us
av_delay 1us 0us 0us 1us
sp_delay 0us 0us 0us 0us
backlog 0b 0b 0b 0b
pkts 62 3642202 36 2230
bytes 3720 726930243 2312 378530
way_inds 0 105838 0 0
way_miss 56 25340 33 244
way_cols 0 0 0 0
drops 0 1656 0 0
marks 0 17 0 0
ack_drop 0 100673 0 0
sp_flows 0 1 0 1
bk_flows 0 1 0 0
un_flows 0 0 0 0
max_len 60 1494 98 590
quantum 1514 1514 1514 1514
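For comparing runs with the filter on and off, the per-tin counters in a dump like the one above can be pulled out with a small parser. A sketch, assuming the four-tin diffserv4 layout (Bulk, Best Effort, Video, Voice) shown in these stats:

```python
def parse_tin_row(line):
    """Split a cake per-tin stats row like 'ack_drop 0 100673 0 0'
    into its counter name and the four diffserv4 tin values."""
    name, *values = line.split()
    return name, [int(v) for v in values]

# Numeric rows taken from the dump above:
stats_lines = [
    "pkts       62   3642202      36    2230",
    "drops       0      1656       0       0",
    "marks       0        17       0       0",
    "ack_drop    0    100673       0       0",
]
counters = dict(parse_tin_row(l) for l in stats_lines)
besteffort = {k: v[1] for k, v in counters.items()}
# ack_drop dwarfs congestive drops, which in turn dwarf ECN marks:
print(besteffort)
```

Feeding only the numeric rows keeps the parser trivial; the header and the byte/delay rows would need their own handling.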
With cake diffserv4 we have had a great experience, particularly for gaming and conference calls. The stats below are from a high-usage client from 4AM to 9AM. I'll try to get some stats after a whole day of usage.
qdisc cake b5a3: parent 1:b bandwidth unlimited diffserv4 triple-isolate nonat nowash no-ack-filter split-gso rtt 100ms raw overhead 0
Sent 21704637347 bytes 14945635 pkt (dropped 102, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
memory used: 4813060b of 15140Kb
capacity estimate: 0bit
min/max network layer size: 60 / 1514
min/max overhead-adjusted size: 60 / 1514
average network hdr offset: 14
Bulk Best Effort Video Voice
thresh 0bit 0bit 0bit 0bit
target 5ms 5ms 5ms 5ms
interval 100ms 100ms 100ms 100ms
pk_delay 17us 39us 10us 24us
av_delay 0us 2us 1us 0us
sp_delay 0us 0us 0us 0us
backlog 0b 0b 0b 0b
pkts 6 14890201 55416 114
bytes 390 21694203280 10569929 14260
way_inds 0 2249944 0 0
way_miss 6 32395 368 18
way_cols 0 0 0 0
drops 0 102 0 0
marks 0 1 0 0
ack_drop 0 0 0 0
sp_flows 1 4 0 1
bk_flows 0 0 0 0
un_flows 0 0 0 0
max_len 90 1514 1514 590
quantum 1514 1514 1514 1514
With regard to ack-filtering, we don't use it currently (I feared it would mess with gaming users), but I'm open to enabling it to test it out. Our DL/UL ratio is usually 5:1, such as for 100/20Mbps plans, so I'm not sure whether ack-filtering's benefits would show up in our case.
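Whether ack-filtering pays off at 5:1 can be estimated with back-of-the-envelope arithmetic. A sketch, under rough assumptions I'm supplying here (~1514-byte data frames, ~66-byte pure ACKs, one ACK per two data segments; real traffic mixes will differ):

```python
def ack_upload_share(down_mbps, up_mbps, data_bytes=1514, ack_bytes=66,
                     segs_per_ack=2):
    """Estimate the fraction of the uplink consumed by the ACK stream
    for a saturated downlink. All defaults are rough approximations."""
    data_pps = down_mbps * 1e6 / 8 / data_bytes        # data packets/s down
    ack_bps = data_pps / segs_per_ack * ack_bytes * 8  # ACK bits/s up
    return ack_bps / (up_mbps * 1e6)

# 100/20 Mbps plan: ACKs for a saturated downlink take roughly a tenth
# of the uplink, so the bandwidth savings are modest at 5:1 compared to,
# say, a cable or DSL link with a 20:1 or worse ratio.
print(f"{ack_upload_share(100, 20):.1%}")
```

This matches the historical intuition from the cerowrt post above that the filter matters most on highly asymmetric links, though the queue-deduplication effect applies at any ratio.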
As you can see, diffserv is not being used all that much between those hours. :) I am mostly looking at stats at the peak hours of 9AM and 8PM, and also have flent servers driving periodic tests from a remote site over the network, experimenting with BBR in particular.
The long-term idea I have for ack-filtering is to hook that code to do something pping-like, tracking actual RTTs of real traffic. It already does all the needed decoding in the qdisc, so instead of sniffing the whole network for it (pping), or doing it in the kernel via eBPF (ebpf pping), doing it in pure C as a cake option (ack-monitor) would be a useful diagnostic tool that could be applied, or not, to individual customers.
But the first step was seeing how much CPU it ate, and whether it did any harm or had any benefit.
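The pping-like idea can be sketched in a few lines: timestamp each outgoing data segment by the cumulative ack number that would cover it, and when that ACK comes back, the elapsed time is an RTT sample. A toy model of the concept, not cake or pping code (class and method names are mine):

```python
import time

class AckRttMonitor:
    """Toy pping-like RTT estimator: timestamp outgoing data segments by
    the ack number that would cover them, then match returning
    cumulative ACKs to produce RTT samples."""

    def __init__(self):
        self.pending = {}  # expected ack number -> send timestamp

    def on_data(self, seq, length, now=None):
        """Record the send time of a data segment [seq, seq+length)."""
        self.pending[seq + length] = now if now is not None else time.monotonic()

    def on_ack(self, ack, now=None):
        """Return an RTT sample if this ACK matches a recorded segment."""
        now = now if now is not None else time.monotonic()
        sent = self.pending.pop(ack, None)
        return None if sent is None else now - sent

mon = AckRttMonitor()
mon.on_data(seq=1000, length=500, now=0.000)
print(mon.on_ack(1500, now=0.042))  # one RTT sample: 0.042
```

A real implementation would also have to bound the pending table, handle retransmissions (which poison samples), and deal with cumulative ACKs that skip past recorded boundaries.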
ack-filtering should not mess with gaming users, at least not those playing FPS shooters, which generally run over UDP. The side effect of ack-filtering is providing slightly less granular acks to the TCP sender on the other side, which has to smooth the ack arrivals against its estimate of the actual bandwidth.
BBR, on the other hand, doesn't care all that much about acks. It responds to delay pretty well, which is what FQ automagically gives it.
So I would not worry about the effects on gaming users. My other major concern with enabling ack-filtering is that it requires inspecting the packet fairly deeply, and I worry about gear (like MikroTik) that doesn't give good statistics as to how successfully it can reach into various encapsulations.
3% seems very reasonable for the benefit of eventually being able to track RTTs. And it's way easier to integrate.
Gotcha. I will try it out on my deployment.
We seem to have an outstanding bug with Cambium (#154), but aside from preserving this discussion, I'm closing this.