
Flamethrower 0.11.0 sometimes fails to send TCP queries on FreeBSD to BIND 9.11

Open Mno-hime opened this issue 3 years ago • 1 comments

Flamethrower 0.11.0 sometimes fails to send TCP queries on FreeBSD 12.2 to BIND 9.11. This started for us with Flamethrower 0.11.0; version 0.10 from FreeBSD ports was fine. BIND 9.16 and 9.17 are fine as query targets, and the problem does not happen on Linux. Here is the original issue in ISC GitLab. The culprit seems to be commit a7b83e428575004bdac1b8ea431e040414a280fc (identified with git bisect); with the following change to flame/tokenbucket.h the problem went away:

@@ -25,6 +25,7 @@ public:
     {
         if (_token_wallet < tokens) {
             if (_last_fill_ms.count() == 0) {
+                _token_wallet = _rate_qps;
                 _last_fill_ms = now_ms;
             } else if (now_ms > _last_fill_ms) {
                 auto elapsed_ms = (now_ms - _last_fill_ms).count();
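
To make the effect of the added line easier to see, here is a minimal, hypothetical sketch of a token bucket built around the branch shown in the diff. It is not the Flamethrower source: the class name SketchTokenBucket, the prefill_on_first_use switch, the refill arithmetic and demo_first_call are assumptions made for illustration, and only the member names _token_wallet, _rate_qps and _last_fill_ms come from the diff. It shows only that, without the added line, the very first consume() call is refused because the wallet starts empty; it does not establish why that matters for TCP against BIND 9.11 on FreeBSD.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical sketch, NOT the actual flame/tokenbucket.h.
class SketchTokenBucket
{
public:
    // prefill_on_first_use is an illustrative switch: true mimics the
    // patched behaviour, false mimics the behaviour before the patch.
    SketchTokenBucket(double rate_qps, bool prefill_on_first_use)
        : _rate_qps(rate_qps)
        , _prefill(prefill_on_first_use)
    {
    }

    // Returns true and removes the tokens if enough are available.
    bool consume(std::uint64_t tokens, std::chrono::milliseconds now_ms)
    {
        if (_token_wallet < tokens) {
            if (_last_fill_ms.count() == 0) {
                if (_prefill) {
                    // The line added by the patch: start with a full wallet.
                    _token_wallet = _rate_qps;
                }
                _last_fill_ms = now_ms;
            } else if (now_ms > _last_fill_ms) {
                // Assumed refill rule: rate_qps tokens per 1000 ms elapsed.
                auto elapsed_ms = (now_ms - _last_fill_ms).count();
                _token_wallet += _rate_qps * static_cast<double>(elapsed_ms) / 1000.0;
                _last_fill_ms = now_ms;
            }
            if (_token_wallet < tokens) {
                return false;
            }
        }
        _token_wallet -= static_cast<double>(tokens);
        return true;
    }

private:
    double _rate_qps;
    bool _prefill;
    double _token_wallet = 0.0;
    std::chrono::milliseconds _last_fill_ms{0};
};

// Usage sketch: the first call only succeeds when the wallet is pre-filled.
inline void demo_first_call()
{
    using std::chrono::milliseconds;
    SketchTokenBucket patched(100.0, true);
    SketchTokenBucket unpatched(100.0, false);
    assert(patched.consume(1, milliseconds(5000)));
    assert(!unpatched.consume(1, milliseconds(5000)));
}
```

In this sketch the unpatched bucket recovers as soon as time advances, so it does not by itself reproduce the permanent stall reported here; it only isolates what the one-line change does on first use.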

Reproducer

Start named from any recent BIND 9.11 version: named -f -c named.conf.

Start Flamethrower instances:

/usr/local/bin/flame --dnssec -P udp -F inet -Q 10000 -p 5300 -v 99 10.53.0.3 > flame.udp.4 &
/usr/local/bin/flame --dnssec -P udp -F inet6 -Q 10000 -p 5300 -v 99 [fd92:7065:b8e:ffff::3] > flame.udp.6 &
/usr/local/bin/flame --dnssec -P tcp -F inet -Q 10000 -p 5300 -v 99 10.53.0.3 > flame.tcp.4 &
/usr/local/bin/flame --dnssec -P tcp -F inet6 -Q 10000 -p 5300 -v 99 [fd92:7065:b8e:ffff::3]  > flame.tcp.6 &

After some time, kill all Flamethrower instances with killall flame.

Grep for the total queries sent and received in the output files. The sent count is not always zero, but one in five TCP instances fails like this and does not recover:

$ grep ^total flame.*.*
flame.tcp.4:total sent  : 0
flame.tcp.4:total rcvd  : 0
flame.tcp.6:total sent  : 0
flame.tcp.6:total rcvd  : 0
flame.udp.4:total sent  : 80820
flame.udp.4:total rcvd  : 80803
flame.udp.6:total sent  : 80820
flame.udp.6:total rcvd  : 80777

flame.tcp.4:

--class: "IN"
--dnssec: true
--help: false
--qps-flow: null
--targets: null
--version: false
-F: "inet"
-M: "GET"
-P: "tcp"
-Q: "10000"
-R: false
-T: "A"
-b: null
-c: "10"
-d: "1"
-f: null
-g: "static"
-l: "0"
-n: "0"
-o: null
-p: "5300"
-q: "10"
-r: "test.com"
-t: "3"
-v: "99"
GENOPTS: []
TARGET: "10.53.0.3"
binding to 0.0.0.0
flaming target(s) [10.53.0.3] on port 5300 with 30 concurrent generators, each sending 100 queries every 1000ms on protocol tcp
query generator [static] contains 1 record(s)
rate limit @ 10000 QPS (333.333 QPS per concurrent sender)
0.919358s: send: 0, avg send: 0, recv: 0, avg recv: 0, min/avg/max resp: 0/nan/0ms, in flight: 0, timeouts: 0
1.92136s: send: 0, avg send: 0, recv: 0, avg recv: 0, min/avg/max resp: 0/nan/0ms, in flight: 0, timeouts: 0
2.92128s: send: 0, avg send: 0, recv: 0, avg recv: 0, min/avg/max resp: 0/nan/0ms, in flight: 0, timeouts: 0
3.93132s: send: 0, avg send: 0, recv: 0, avg recv: 0, min/avg/max resp: 0/nan/0ms, in flight: 0, timeouts: 0
4.94233s: send: 0, avg send: 0, recv: 0, avg recv: 0, min/avg/max resp: 0/nan/0ms, in flight: 0, timeouts: 0
5.95264s: send: 0, avg send: 0, recv: 0, avg recv: 0, min/avg/max resp: 0/nan/0ms, in flight: 0, timeouts: 0
6.96256s: send: 0, avg send: 0, recv: 0, avg recv: 0, min/avg/max resp: 0/nan/0ms, in flight: 0, timeouts: 0
7.97184s: send: 0, avg send: 0, recv: 0, avg recv: 0, min/avg/max resp: 0/nan/0ms, in flight: 0, timeouts: 0
8.01863s: send: 0, avg send: 0, recv: 0, avg recv: 0, min/avg/max resp: 0/nan/0ms, in flight: 0, timeouts: 0

------
run id      : 28ebb537ed6abf2b
run start   : 2021-07-20T11:24:44Z
runtime     : 8.02064 s
total sent  : 0
total rcvd  : 0
min resp    : 0 ms
avg resp    : nan ms
max resp    : 0 ms
avg r qps   : 0
avg s qps   : 0
avg pkt     : 0 bytes
tcp conn.   : 45
timeouts    : 0 (nan%) 
bad recv    : 0
net errors  : 0

named configuration files

named.conf:

options {
    listen-on { 10.53.0.3; };
    listen-on-v6 port 5300 { fd92:7065:b8e:ffff::3; };
    port 5300;
    directory "/home/newman/output/ns3";
    allow-recursion { any; };
    query-source address 10.53.0.3;
    pid-file "named.pid";
    recursion yes;
    tcp-clients 50;
    statistics-file "named.stats";
};

view "default" {
    zone "." {
        type hint;
        file "root.hint";
    };
};

root.hint:

$TTL 999999
.                        IN NS  a.root-servers.nil.
a.root-servers.nil.      IN A   10.53.0.1

Mno-hime, Jul 20 '21 13:07

@Mno-hime Thanks for the detailed report! We'll take a look.

weyrick, Jul 20 '21 13:07

BIND 9.11 has been EoL for some time, and this is not an active issue for us anymore. I don't think this issue needs to be kept open.

Mno-hime, Mar 02 '23 09:03