ack-filtering and diffserv

Open dtaht opened this issue 2 years ago • 5 comments

We've done a limited deployment of ack-filter in both directions, at a cost of about 3% CPU. It's difficult to observe an actual difference in network behavior from doing this, but overall packet drop rates increased to nearly 1%, and that doesn't count what the CPE was already dropping. If anyone has suggestions on how to evaluate network behavior with ack-filter on or off, let me know.

Tracking meaningful drops then becomes necessary, however, as ack_drops are pretty different from congestive drops and marks. We ended up using (in openwrt) an inverse log scale to show these, as ack_drops is easily 2 orders of magnitude bigger than drops, and marks (currently) 2 orders of magnitude less than drops. I am quite surprised at how much RFC3168 marking is going on.

Historically I thought the ack-drops would only be useful on the uplink of a highly asymmetric connection: https://blog.cerowrt.org/post/ack_filtering/ - but after working out how htb + cake does things, well, we can easily accrue enough duplicate packets in both directions to be worthy of dropping some, and for all I know that helps a bit with packet-limited fifos elsewhere.
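To make the idea concrete, here is a minimal sketch of what an ack filter does when a new pure ACK arrives for a flow that already has older pure ACKs queued. It is not cake's actual implementation: the `Packet` fields and the function names are hypothetical, sequence-number wraparound is ignored, and anything carrying data or SACK blocks is simply left alone.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Packet:
    flow: tuple            # hypothetical flow key, e.g. (src, sport, dst, dport)
    ack: int               # cumulative ACK number
    payload_len: int = 0   # bytes of TCP payload
    has_sack: bool = False


def is_pure_ack(p: Packet) -> bool:
    # Only ACKs with no payload and no SACK blocks are candidates for filtering.
    return p.payload_len == 0 and not p.has_sack


def enqueue_with_ack_filter(queue: deque, pkt: Packet) -> int:
    """Append pkt to queue. If pkt is a pure ACK, first drop older queued
    pure ACKs of the same flow that acknowledge strictly less data.
    Returns the number of ACKs dropped."""
    dropped = 0
    if is_pure_ack(pkt):
        kept = deque()
        for old in queue:
            # Strictly-less comparison keeps duplicate ACKs (same ack number),
            # since those carry loss information for the sender.
            if old.flow == pkt.flow and is_pure_ack(old) and old.ack < pkt.ack:
                dropped += 1
            else:
                kept.append(old)
        queue.clear()
        queue.extend(kept)
    queue.append(pkt)
    return dropped
```

The newer cumulative ACK covers everything the dropped ones did, which is why a backlog of ACKs in either direction can be thinned without the sender losing information about delivered bytes.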

In this example there's also a surprising number of packets marked for the voice class (certainly observing backlog and drops here would be good, and I would worry if marks and/or ack_drops started showing up there). I note that for a wifi wireless ISP I generally recommend putting everything into the best effort class (wmm on but not paying attention to diffserv markings); in the home, whether to respect the markings depends on the gear. Is anyone using diffserv effectively?

qdisc cake c7ee: parent 3:13d7 bandwidth unlimited diffserv4 triple-isolate nonat nowash ack-filter split-gso rtt 100ms raw overhead 0 
 Sent 717984865 bytes 3542201 pkt (dropped 102329, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0
 memory used: 292608b of 15140Kb
 capacity estimate: 0bit
 min/max network layer size:           60 /    1494
 min/max overhead-adjusted size:       60 /    1494
 average network hdr offset:           14

                   Bulk  Best Effort        Video        Voice
  thresh           0bit         0bit         0bit         0bit
  target            5ms          5ms          5ms          5ms
  interval        100ms        100ms        100ms        100ms
  pk_delay        109us          8us          1us          9us
  av_delay          1us          0us          0us          1us
  sp_delay          0us          0us          0us          0us
  backlog            0b           0b           0b           0b
  pkts               62      3642202           36         2230
  bytes            3720    726930243         2312       378530
  way_inds            0       105838            0            0
  way_miss           56        25340           33          244
  way_cols            0            0            0            0
  drops               0         1656            0            0
  marks               0           17            0            0
  ack_drop            0       100673            0            0
  sp_flows            0            1            0            1
  bk_flows            0            1            0            0
  un_flows            0            0            0            0
  max_len            60         1494           98          590
  quantum          1514         1514         1514         1514
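For tracking ack_drop separately from congestive drops and marks, a small parser over this per-tin table is enough. The sketch below assumes the diffserv4 layout shown above (a header row starting with `Bulk`, tin names separated by two or more spaces); `parse_cake_tins` and `drop_breakdown` are hypothetical helper names, and the embedded sample is a trimmed copy of the table above.

```python
import re

SAMPLE = """\
                   Bulk  Best Effort        Video        Voice
  thresh           0bit         0bit         0bit         0bit
  pkts               62      3642202           36         2230
  drops               0         1656            0            0
  marks               0           17            0            0
  ack_drop            0       100673            0            0
"""


def parse_cake_tins(text):
    """Return {tin_name: {stat_name: int}} for the integer-valued rows of
    the per-tin table. Rows with units (0bit, 5ms, 109us) are skipped."""
    names = None
    tins = {}
    for line in text.splitlines():
        stripped = line.strip()
        if names is None:
            # Header row: tin names are separated by runs of 2+ spaces.
            if stripped.startswith("Bulk"):
                names = re.split(r"\s{2,}", stripped)
                tins = {name: {} for name in names}
            continue
        fields = stripped.split()
        if len(fields) != len(names) + 1:
            continue
        key, values = fields[0], fields[1:]
        if all(v.isdigit() for v in values):
            for name, value in zip(names, values):
                tins[name][key] = int(value)
    return tins


def drop_breakdown(tin):
    """Express drops, marks and ack_drop as a share of the tin's pkts counter."""
    pkts = tin["pkts"]
    return {k: (tin[k] / pkts if pkts else 0.0)
            for k in ("drops", "marks", "ack_drop")}


if __name__ == "__main__":
    for name, stats in parse_cake_tins(SAMPLE).items():
        print(name, drop_breakdown(stats))
```

Keeping the three counters apart matters because they differ by orders of magnitude, so summing them into one "drops" figure hides the congestive signal.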

dtaht avatar Sep 04 '22 14:09 dtaht

With cake diffserv4 we have had a great experience, particularly for gaming and conference calls. The stats below are from a high-usage client between 4 AM and 9 AM. I'll try to get some stats after a whole day of usage.

qdisc cake b5a3: parent 1:b bandwidth unlimited diffserv4 triple-isolate nonat nowash no-ack-filter split-gso rtt 100ms raw overhead 0 
 Sent 21704637347 bytes 14945635 pkt (dropped 102, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0
 memory used: 4813060b of 15140Kb
 capacity estimate: 0bit
 min/max network layer size:           60 /    1514
 min/max overhead-adjusted size:       60 /    1514
 average network hdr offset:           14

                   Bulk  Best Effort        Video        Voice
  thresh           0bit         0bit         0bit         0bit
  target            5ms          5ms          5ms          5ms
  interval        100ms        100ms        100ms        100ms
  pk_delay         17us         39us         10us         24us
  av_delay          0us          2us          1us          0us
  sp_delay          0us          0us          0us          0us
  backlog            0b           0b           0b           0b
  pkts                6     14890201        55416          114
  bytes             390  21694203280     10569929        14260
  way_inds            0      2249944            0            0
  way_miss            6        32395          368           18
  way_cols            0            0            0            0
  drops               0          102            0            0
  marks               0            1            0            0
  ack_drop            0            0            0            0
  sp_flows            1            4            0            1
  bk_flows            0            0            0            0
  un_flows            0            0            0            0
  max_len            90         1514         1514          590
  quantum          1514         1514         1514         1514

With regard to ack-filtering, we don't use it currently (I feared it would mess with gaming users), but I'm open to enabling it to test it out. Our DL/UL ratio is usually 5:1, such as for 100/20Mbps plans, so I'm not sure whether ack-filtering's benefits would show up in our case.

rchac avatar Sep 04 '22 14:09 rchac

As you can see, diffserv is not being used all that much between those hours. :) I am mostly looking at stats at the peak hours of 9 AM and 8 PM, and also have flent servers driving periodic tests from a remote site over the network, experimenting with BBR in particular.

dtaht avatar Sep 04 '22 15:09 dtaht

The idea I have long term for ack-filtering is to hook that call to do something that is pping-like, tracking actual RTTs of real traffic. It does all the needed decoding in the qdisc, and instead of sniffing the whole network for it (pping), and/or doing it in kernel (ebpf pping), doing it in pure C as a cake option (ack-monitor) would be a useful diagnostic tool that could be applied or not to individual customers.
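As a rough illustration of the pping-like part, passive RTT sampling can be done by matching TCP timestamp options: remember when a TSval was first seen going one way, and take a sample when the same value comes back as TSecr in the other direction. This is only a sketch of the general technique, not the proposed ack-monitor option; the `TsRttTracker` class and its `observe` signature are hypothetical, and a real tracker would also age out stale entries.

```python
class TsRttTracker:
    """Passive RTT sampler based on TCP timestamp (TSval/TSecr) matching."""

    def __init__(self):
        # (src, dst, tsval) -> time this TSval was first seen at the monitor
        self._first_seen = {}

    def observe(self, src, dst, tsval, tsecr, now):
        """Feed one packet's endpoints and timestamp option.

        src and dst are endpoint identifiers such as (ip, port) tuples.
        Returns an RTT sample in seconds (monitor -> src -> monitor leg as
        seen from dst's original packet), or None if nothing matched."""
        # Record only the first sighting of each TSval in this direction.
        self._first_seen.setdefault((src, dst, tsval), now)
        # Does this packet echo a TSval we saw going the other way?
        # pop() ensures each TSval yields at most one sample.
        sent = self._first_seen.pop((dst, src, tsecr), None)
        return None if sent is None else now - sent
```

Usage would be one `observe` call per decoded TCP packet; since the ack filter already parses the TCP header and options, the extra decoding cost should be small, though that is an assumption until it is measured.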

But first up was seeing how much cpu it ate, and if it did any harm or had any benefit.

dtaht avatar Sep 04 '22 15:09 dtaht

ack-filtering should not mess with gaming users, at least not those playing FPS games, which generally run over UDP. The side effect of ack-filtering is that the TCP sender on the other side gets slightly less robust acks, and it has to smooth the ack arrivals against its estimate of the actual bandwidth.

BBR on the other hand, doesn't care all that much about acks. It responds to delay pretty well, which is what FQ will automagically give it.

So I would not be worried about the effects on gaming users. My other major concern with enabling ack-filtering is that it requires inspecting the packet fairly deeply, and I worry about gear (like mikrotik) that doesn't give good statistics as to how successfully it can reach into various encapsulations.

dtaht avatar Sep 04 '22 15:09 dtaht

> The idea I have long term for ack-filtering is to hook that call to do something that is pping-like, tracking actual RTTs of real traffic. It does all the needed decoding in the qdisc, and instead of sniffing the whole network for it (pping), and/or doing it in kernel (ebpf pping), doing it in pure C as a cake option (ack-monitor) would be a useful diagnostic tool that could be applied or not to individual customers.
>
> But first up was seeing how much cpu it ate, and if it did any harm or had any benefit.

3% seems very reasonable for the benefit of being able to eventually track RTTs. And way easier to integrate.

> So I would not be worried about the effects on gaming users. My other major concern with enabling ack-filtering is that it requires inspecting the packet fairly deeply, and I worry about gear (like mikrotik) that doesn't give good statistics as to how successfully it can reach into various encapsulations.

Gotcha. I will try it out on my deployment.

rchac avatar Sep 04 '22 16:09 rchac

We seem to have an outstanding bug with cambium (#154), but aside from preserving this discussion, I'm closing this.

dtaht avatar Nov 13 '22 17:11 dtaht