pdns
pdns copied to clipboard
Recursor stops responding after `rec_control reload-lua-config`
- Program: Recursor
- Issue type: Bug report
Short description
Recursor stops responding after rec_control reload-lua-config
Environment
- Operating system: SLES15-SP4
- Software version: PowerDNS Recursor 4.7.1
- Software source: PowerDNS repository, self compiled (build-log)
Steps to reproduce
mkdir -p tmp; cd tmp
> recursor.conf
echo 'dnstapFrameStreamServer( "10.20.30.40:1234" )' > lua.conf
pdns_recursor --daemon=no --config-dir=. --socket-dir=. --local-address=127.0.0.53 --lua-config-file=./lua.conf
dig +tries=1 +time=1 @127.0.0.53 localhost ## WORKS
rec_control --socket-dir=. reload-lua-config
dig +tries=1 +time=1 @127.0.0.53 localhost ## FAILS (timeout)
Expected behaviour
Recursor should never stop responding after reload-lua-config
Actual behaviour
Recursor does not work anymore after reload-lua-config
on circumstances.
Other information
To trigger the error, the FrameStreamServer IP address must not be a local address of the server itself, but a external one. It doesn't matter if the address answers or not.
There is nothing in the journal what indicates an error.
Hello @paddg, thank you for the report. I am unable to reproduce the issue. You mention that you built from source from the pdns repository, specifically recursor version 4.7.0 (I assume that means the rec-4.7.0
branch?). Could you provide the following information?
- The
configure
flags used. - The version of Lua being used.
- If possible, whether the problem also exists on
master
. - If possible, whether the problem also exists on another distribution.
Thanks for investigation, @fredmorcos! I'll try to further isolate the issue so it's hopefully easier to reproduce. Unfortunately I won't be able to do that for a few days. But then I will also answer your questions.
Hello @fredmorcos, I've updated the issue description and the steps to reproduce it. Also I've attached the build-log. I hope this helps you. I've no answer for questions 3 and 4.
If possible, whether the problem also exists on another distribution.
I tried it now on a Ubuntu 20.04 "Focal Fossa" with the 4.7.1 package from https://repo.powerdns.com/. And I was able to reproduce the issue with the described procedure from above.
Were you able to reproduce the issue as well?
Hi @paddg, could you provide the contents of recursor.conf
please? Thank you!
I'm working on reproducing this on Ubuntu 20.04.
could you provide the contents of recursor.conf please?
It is empty. See "Steps to reproduce":
> recursor.conf
I've been trying to reproduce this on Ubuntu 20.04 and on Archlinux to no avail. I tested with the master
branch and the rec-4.7.1
tag. Here are the steps I follow:
$ git clone https://github.com/PowerDNS/pdns.git
$ cd pdns/pdns/recursordist/
$ autoreconf -vi
# Set CFLAGS and CXXFLAGS here to replicate your SLES build as much as possible. And use --with-lua=lua5.3 because it otherwise prefers luajit.
$ CFLAGS='-fmessage-length=0 -grecord-gcc-switches -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-protection -g' CXXFLAGS='-fmessage-length=0 -grecord-gcc-switches -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-protection -g' ./configure --enable-nod --enable-dns-over-tls --with-lua=lua5.3
$ make -j4
$ mkdir tmp
$ cd tmp/
$ touch recursor.conf
$ echo 'dnstapFrameStreamServer( "10.20.30.40:1234" )' > lua.conf
$ sudo ../pdns_recursor --daemon=no --config-dir=. --socket-dir=. --local-address=127.0.0.53 --local-port=54 --lua-config-file=./lua.conf
Jul 26 21:03:01 PowerDNS Recursor 0.0.24187.0.master.g9c9e6a1dc3 (C) 2001-2022 PowerDNS.COM BV
Jul 26 21:03:01 Using 64-bits mode. Built using gcc 12.1.0 on Jul 26 2022 20:54:03 by fred@axon.
Jul 26 21:03:01 PowerDNS comes with ABSOLUTELY NO WARRANTY. This is free software, and you are welcome to redistribute it according to the terms of the GPL version 2.
Jul 26 21:03:01 msg="Enabling IPv4 transport for outgoing queries" subsystem="config" level="0" prio="Notice" tid="0" ts="1658862181.668"
Jul 26 21:03:01 msg="NOT using IPv6 for outgoing queries - add an IPv6 address (like '::') to query-local-address to enable" subsystem="config" level="0" prio="Warning" tid="0" ts="1658862181.668"
Jul 26 21:03:01 msg="Setting access control" subsystem="config" level="0" prio="Info" tid="0" ts="1658862181.669" acl="allow-from" addresses="127.0.0.0/8 10.0.0.0/8 100.64.0.0/10 169.254.0.0/16 192.168.0.0/16 172.16.0.0/12 ::1/128 fc00::/7 fe80::/10"
Jul 26 21:03:01 msg="Will not send queries to" subsystem="config" level="0" prio="Notice" tid="0" ts="1658862181.672" addresses="127.0.0.0/8 10.0.0.0/8 100.64.0.0/10 169.254.0.0/16 192.168.0.0/16 172.16.0.0/12 ::1/128 fc00::/7 fe80::/10 0.0.0.0/8 192.0.0.0/24 192.0.2.0/24 198.51.100.0/24 203.0.113.0/24 240.0.0.0/4 ::/96 ::ffff:0:0/96 100::/64 2001:db8::/32 0.0.0.0 ::"
Jul 26 21:03:01 msg="PowerDNS Recursor itself will distribute queries over threads" subsystem="config" level="0" prio="Notice" tid="0" ts="1658862181.672"
Jul 26 21:03:01 msg="Inserting rfc 1918 private space zones" subsystem="config" level="0" prio="Notice" tid="0" ts="1658862181.672"
Jul 26 21:03:01 msg="Listening for queries" subsystem="config" level="0" prio="Info" tid="0" ts="1658862181.672" address="127.0.0.53:54" protocol="UDP"
Jul 26 21:03:01 msg="Enabled TCP data-ready filter for (slight) DoS protection" subsystem="config" level="0" prio="Info" tid="0" ts="1658862181.672"
Jul 26 21:03:01 msg="Listening for queries" subsystem="config" level="0" prio="Info" tid="0" ts="1658862181.672" address="127.0.0.53:54" protocol="TCP"
Jul 26 21:03:01 msg="Raised soft limit on number of filedescriptors to match max-mthreads and threads settings" subsystem="config" level="0" prio="Warning" tid="0" ts="1658862181.673" limit="4321"
Jul 26 21:03:01 msg="Launching distributor threads" subsystem="config" level="0" prio="Notice" tid="0" ts="1658862181.673" count="1"
Jul 26 21:03:01 msg="Launching worker threads" subsystem="config" level="0" prio="Notice" tid="0" ts="1658862181.673" count="2"
Jul 26 21:03:01 msg="Enabled multiplexer" subsystem="runtime" level="0" prio="Info" tid="0" ts="1658862181.674" name="epoll"
Jul 26 21:03:02 msg="Not validating response for security status update, this is a non-release version" subsystem="housekeeping" level="0" prio="Warning" tid="0" ts="1658862182.866" query="recursor-0.0.24187.0.master.g9c9e6a1dc3.security-status.secpoll.powerdns.com" version="0.0.24187.0.master.g9c9e6a1dc3"
To test:
# In a separate terminal.
$ nmap -A 127.0.0.53 -p 54
Starting Nmap 7.92 ( https://nmap.org ) at 2022-07-26 21:03 CEST
Nmap scan report for 127.0.0.53
Host is up (0.000044s latency).
PORT STATE SERVICE VERSION
54/tcp open domain PowerDNS Recursor 0.0.24187.0.master.g9c9e6a1dc3
| dns-nsid:
| NSID: axon (61786f6e)
| id.server: axon
|_ bind.version: PowerDNS Recursor 0.0.24187.0.master.g9c9e6a1dc3 (built Jul 26 2022 20:54:03 by fred@axon)
Service detection performed. Please report any incorrect results at https://nmap.org/submit/ .
Nmap done: 1 IP address (1 host up) scanned in 39.34 seconds
$ dig +tries=1 +time=1 @127.0.0.53 -p 54 localhost
; <<>> DiG 9.18.5 <<>> +tries +time @127.0.0.53 -p 54 localhost
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 30071
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;localhost. IN A
;; ANSWER SECTION:
localhost. 86400 IN A 127.0.0.1
;; Query time: 0 msec
;; SERVER: 127.0.0.53#54(127.0.0.53) (UDP)
;; WHEN: Tue Jul 26 21:04:39 CEST 2022
;; MSG SIZE rcvd: 54
Reload the configuration:
$ sudo ../rec_control --socket-dir=. reload-lua-config
Reloaded Lua configuration file './lua.conf'
and I get the following in the recursor output:
Jul 26 21:04:59 msg="Received rec_control command via control socket" subsystem="control" level="0" prio="Info" tid="0" ts="1658862299.501" command="reload-lua-config"
Jul 26 21:04:59 Reloaded Lua configuration file './lua.conf', requested via control channel
Trying again:
$ dig +tries=1 +time=1 @127.0.0.53 -p 54 localhost
; <<>> DiG 9.18.5 <<>> +tries +time @127.0.0.53 -p 54 localhost
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 34987
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;localhost. IN A
;; ANSWER SECTION:
localhost. 86328 IN A 127.0.0.1
;; Query time: 0 msec
;; SERVER: 127.0.0.53#54(127.0.0.53) (UDP)
;; WHEN: Tue Jul 26 21:05:51 CEST 2022
;; MSG SIZE rcvd: 54
Build/configure summary on Ubuntu:
configure: Configuration summary
configure: =====================
configure:
configure: PowerDNS Recursor 0.0.24187.0.master.g9c9e6a1dc3 configured with: '--enable-nod' '--enable-dns-over-tls' '--with-lua=lua5.3' 'CFLAGS=-fmessage-length=0 -grecord-gcc-switches -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-protection -g' 'CXXFLAGS=-fmessage-length=0 -grecord-gcc-switches -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-protection -g'
configure:
configure: CC: gcc
configure: CXX: g++ -std=c++17
configure: LD: /usr/bin/ld -m elf_x86_64
configure: CFLAGS: -fPIE -DPIE -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 --param ssp-buffer-size=4 -fstack-protector -Wall -Wextra -Wshadow -Wno-unused-parameter -Wmissing-declarations -Wredundant-decls -fvisibility=hidden -g -O2 -fmessage-length=0 -grecord-gcc-switches -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-protection -g
configure: CPPFLAGS:
configure: CXXFLAGS: -fPIE -DPIE -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 --param ssp-buffer-size=4 -fstack-protector -Wall -Wextra -Wshadow -Wno-unused-parameter -Wmissing-declarations -Wredundant-decls -fvisibility=hidden -g -O2 -fmessage-length=0 -grecord-gcc-switches -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-protection -g
configure: LDFLAGS: -Wl,-z -Wl,relro -Wl,-z -Wl,now
configure: LIBS: -lpthread
configure: BOOST_CPPFLAGS: -pthread
configure:
configure: Features enabled
configure: ----------------
configure: Lua: lua5.3
configure: OpenSSL ECDSA: yes
configure: ed25519: yes
configure: ed448: yes
configure: Protobuf: yes
configure: SNMP: yes
configure: systemd: yes
configure: nod: yes
configure: dnstap: yes
configure: DNS over TLS: yes
configure: OpenSSL: yes
configure: libcurl: yes
configure: Context library: Boost Context
and build/configure summary on Archlinux:
configure: Configuration summary
configure: =====================
configure:
configure: PowerDNS Recursor 0.0.24187.0.master.g9c9e6a1dc3 configured with: '--enable-nod' '--enable-dns-over-tls' '--with-lua=lua5.3' 'CFLAGS=-fmessage-length=0 -grecord-gcc-switches -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-protection -g' 'CXXFLAGS=-fmessage-length=0 -grecord-gcc-switches -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-protection -g'
configure:
configure: CC: gcc
configure: CXX: g++ -std=c++17
configure: LD: /usr/bin/ld -m elf_x86_64
configure: CFLAGS: -fPIE -DPIE -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 --param ssp-buffer-size=4 -fstack-protector -Wall -Wextra -Wshadow -Wno-unused-parameter -Wmissing-declarations -Wredundant-decls -fvisibility=hidden -g -O2 -fmessage-length=0 -grecord-gcc-switches -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-protection -g
configure: CPPFLAGS:
configure: CXXFLAGS: -fPIE -DPIE -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 --param ssp-buffer-size=4 -fstack-protector -Wall -Wextra -Wshadow -Wno-unused-parameter -Wmissing-declarations -Wredundant-decls -fvisibility=hidden -g -O2 -fmessage-length=0 -grecord-gcc-switches -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-protection -g
configure: LDFLAGS: -Wl,-z -Wl,relro -Wl,-z -Wl,now
configure: LIBS:
configure: BOOST_CPPFLAGS: -pthread
configure:
configure: Features enabled
configure: ----------------
configure: Lua: lua5.3
configure: OpenSSL ECDSA: yes
configure: ed25519: yes
configure: ed448: yes
configure: Protobuf: yes
configure: SNMP: yes
configure: systemd: yes
configure: nod: yes
configure: dnstap: yes
configure: DNS over TLS: yes
configure: OpenSSL: yes
configure: libcurl: yes
configure: Context library: Boost Context
I have a few more questions:
- I don't see a
git clone
orgit checkout
command in your build log, are you cloning from Github? - Are you checking out the
rec-4.7.1
tag or building frommaster
? - What does the configure summary look like for you on Ubuntu 20.04?
- Is the Ubuntu installation up to date?
- Which version of Lua is it picking up?
- Could you test with
luajit
on SLES and Ubuntu, please? - Could you follow the steps above and let me know whether the issue still persists?
I don't see a git clone or git checkout command in your build log, are you cloning from Github?
We built with this source code: https://downloads.powerdns.com/releases/pdns-recursor-4.7.1.tar.bz2
Are you checking out the rec-4.7.1 tag or building from master?
No, we built with this source code: https://downloads.powerdns.com/releases/pdns-recursor-4.7.1.tar.bz2
What does the configure summary look like for you on Ubuntu 20.04?
I installed from your Ubuntu repository:
deb [arch=amd64] http://repo.powerdns.com/ubuntu focal-rec-47 main
this binary package:
https://repo.powerdns.com/ubuntu/pool/main/p/pdns-recursor/pdns-recursor_4.7.1-1pdns.focal_amd64.deb
pdns_recursor --version
Jul 27 06:54:34 PowerDNS Recursor 4.7.1 (C) 2001-2022 PowerDNS.COM BV
Jul 27 06:54:34 Using 64-bits mode. Built using gcc 9.4.0 on Jul 7 2022 09:34:01 by root@c79fa81baeba.
Jul 27 06:54:34 PowerDNS comes with ABSOLUTELY NO WARRANTY. This is free software, and you are welcome to redistribute it according to the terms of the GPL version 2.
Jul 27 06:54:34 Features: fcontext libcrypto-ecdsa libcrypto-ed25519 libcrypto-ed448 libcrypto-eddsa lua nod protobuf dnstap-framestream sodium curl DoT scrypt
Jul 27 06:54:34 Configured with: " '--build=x86_64-linux-gnu' '--prefix=/usr' '--includedir=${prefix}/include' '--mandir=${prefix}/share/man' '--infodir=${prefix}/share/info' '--sysconfdir=/etc' '--localstatedir=/var' '--libdir=${prefix}/lib/x86_64-linux-gnu' '--libexecdir=${prefix}/lib/x86_64-linux-gnu' '--disable-maintainer-mode' '--disable-dependency-tracking' '--sysconfdir=/etc/powerdns' '--enable-unit-tests' '--with-libcap' '--with-libsodium' '--enable-dns-over-tls' '--enable-dnstap' '--without-net-snmp' '--disable-silent-rules' '--with-service-user=pdns' '--with-service-group=pdns' '--enable-systemd' '--with-systemd=/lib/systemd/system' '--with-lua=luajit' 'build_alias=x86_64-linux-gnu' 'CFLAGS=-g -O2 -fdebug-prefix-map=/pdns/pdns-recursor-4.7.1=. -fstack-protector-strong -Wformat -Werror=format-security' 'LDFLAGS=-Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-z,now' 'CPPFLAGS=-Wdate-time -D_FORTIFY_SOURCE=2' 'CXXFLAGS=-g -O2 -fdebug-prefix-map=/pdns/pdns-recursor-4.7.1=. -fstack-protector-strong -Wformat -Werror=format-security'"
Is the Ubuntu installation up to date?
Yes.
Which version of Lua is it picking up?
dpkg -l | grep -i lua
ii liblua5.1-0:amd64 5.1.5-8.1build4 amd64 Shared library for the Lua interpreter version 5.1
ii liblua5.1-0-dev:amd64 5.1.5-8.1build4 amd64 Development files for the Lua language version 5.1
ii liblua5.2-0:amd64 5.2.4-1.1build3 amd64 Shared library for the Lua interpreter version 5.2
ii liblua5.2-dev:amd64 5.2.4-1.1build3 amd64 Development files for the Lua language version 5.2
ii liblua5.3-0:amd64 5.3.3-1.1ubuntu2 amd64 Shared library for the Lua interpreter version 5.3
ii libluajit-5.1-2:amd64 2.1.0~beta3+dfsg-5.1build1 amd64 Just in time compiler for Lua - library version
ii libluajit-5.1-common 2.1.0~beta3+dfsg-5.1build1 all Just in time compiler for Lua - common files
ii libluajit-5.1-dev:amd64 2.1.0~beta3+dfsg-5.1build1 amd64 Just in time compiler for Lua - development files
ii libtolua-dev 5.2.0-1build1 amd64 Tool to integrate C/C++ code with Lua - development files
ii lua-lpeg:amd64 1.0.2-1 amd64 LPeg library for the Lua language
ii lua-yaml:amd64 6.1-2 amd64 LibYAML binding for Lua
ii lua-yaml-dev:amd64 6.1-2 amd64 docs and test files for lua-yaml
ii lua5.1 5.1.5-8.1build4 amd64 Simple, extensible, embeddable programming language
ii lua5.3 5.3.3-1.1ubuntu2 amd64 Simple, extensible, embeddable programming language
ii luajit 2.1.0~beta3+dfsg-5.1build1 amd64 Just in time compiler for Lua programming language version 5.1
Could you test with luajit on SLES and Ubuntu, please?
The Ubuntu package is already with luajit. I did it not yet try on SLES.
Could you follow the steps above and let me know whether the issue still persists?
Yes. I did build the Recursor on Ubuntu exactly following your steps. The issue still persists.
I mentioned earlier that the issue depends on the IP address used in the dnstapFrameStreamServer function. Some examples:
"10.20.30.40" => failed
"127.0.0.1" => OK
"1.2.3.4" => failed
"192.168.178.1" => OK
"194.25.2.129" => failed
I have no idea what the trigger is here. Would you mind to try also with some different IP addresses?
And by the way, not only the Recursor doesn't reply to DNS queries in the error case, it does also not respond to socket connections any more:
# ../rec_control --socket-dir=. reload-lua-config
Fatal: Timeout waiting for control channel data
From the Recursor Log:
Jul 27 08:56:56 Reloaded Lua configuration file './lua.conf', requested via control channel
Jul 27 08:56:56 msg="Exception while dealing with control socket request" error="Unable to send return code over control channel: Broken pipe" subsystem="control" level="0" prio="Error" tid="0" ts="1658905016.435" exception="PDNSException"
The problem is not "persistent". It does automatically recover after some time.
I did an experiment on the IP address "194.25.2.129". If this server drops the SYN packets, the Recursor stops working after reload-lua-config
. But if this sever responds with reset packets ("Flags [R.]"), then the Recursor does survive.
I can reproduce it. Thank you!
Got the feeling this is pretty much the limit of what libfstrm can get you in combination with the recursors design :-(
(Back from vacation, I did not look at the code yet...) I'm assuming the blocking calls in the cleanup of the old config are the underlaying problem. Possible solutions that come to mind: an asynchronous cleanup or not creating a new setup (and not destroying the old) if the old and new config are the same. I suppose the latter is more simple to implement and would fix the most common cases.
After a discussion with @omoerbeek, the fix for this is two-fold:
- Avoid reloading dnstap-related configuration if it hasn't changed
- If the configuration did indeed change, destroy the old configuration in a way that does not block the recursor:
- Either through asynchronous destruction of the old configuration object (and related connections)
- Or through forcefully closing said connections (e.g. without waiting for timeouts). Note that this might not be possible with libfstrm
@paddg Could you test this branch, please? Thank you!
Hi @fredmorcos,
@paddg Could you test this branch, please? Thank you!
Done. I did
git clone https://github.com/fredmorcos/pdns.git pdns-git-repo
cd pdns-git-repo
./build-scripts/dist-recursor
Then built a RPM package on OBS with pdns-recursor-0.0.24187.0.master.g9c9e6a1dc3.tar.bz2
I installed the package on a lab node.
# pdns_recursor --version
Aug 12 09:05:03 PowerDNS Recursor 0.0.24187.0.master.g9c9e6a1dc3 (C) 2001-2022 PowerDNS.COM BV
Aug 12 09:05:03 Using 64-bits mode. Built using gcc 7.5.0 on Aug 12 2022 06:13:04 by abuild@lamb11.
Aug 12 09:05:03 PowerDNS comes with ABSOLUTELY NO WARRANTY. This is free software, and you are welcome to redistribute it according to the terms of the GPL version 2.
Aug 12 09:05:03 Features: fcontext libcrypto-ecdsa libcrypto-ed25519 libcrypto-ed448 libcrypto-eddsa lua nod protobuf dnstap-framestream snmp sodium curl
Aug 12 09:05:03 Configured with: " '--host=x86_64-suse-linux-gnu' '--build=x86_64-suse-linux-gnu' '--program-prefix=' '--prefix=/usr' '--exec-prefix=/usr' '--bindir=/usr/bin' '--sbindir=/usr/sbin' '--sysconfdir=/etc' '--datadir=/usr/share' '--includedir=/usr/include' '--libdir=/usr/lib64' '--libexecdir=/usr/lib' '--localstatedir=/var' '--sharedstatedir=/var/lib' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--disable-dependency-tracking' '--enable-nod' 'build_alias=x86_64-suse-linux-gnu' 'host_alias=x86_64-suse-linux-gnu' 'CFLAGS=-fmessage-length=0 -grecord-gcc-switches -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-protection -g' 'CXXFLAGS=-fmessage-length=0 -grecord-gcc-switches -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-protection -g' 'PKG_CONFIG_PATH=:/usr/lib64/pkgconfig:/usr/share/pkgconfig'"
An re-did the tests. Unfortunately the behavior did not change. The problem is still present.
Hi @paddg,
if I'm not mistaken you tested the master branch of Fred's repo (9c9e6a1dc3), not the branch containing the fix.
if I'm not mistaken you tested the master branch of Fred's repo (9c9e6a1), not the branch containing the fix.
Oh yes! Sorry about that. I'll test again ...
I tested again and it looks good so far. The Recursor does not stop responding after rec_control reload-lua-config
anymore.
But now the journal keeps claiming things which I haven't seen before:
Aug 12 10:57:33 node22 pdns-recursor-2[15528]: msg="A more recent configuration has been found, stopping the existing RPZ update thread" subsystem="rpz" level="0" prio="Info" tid="0" ts="1660294653.147" zone="A.rpz"
Aug 12 10:57:33 node22 pdns-recursor-2[15528]: msg="A more recent configuration has been found, stopping the existing RPZ update thread" subsystem="rpz" level="0" prio="Info" tid="0" ts="1660294653.147" zone="B.rpz"
Aug 12 10:57:33 node22 pdns-recursor-2[15528]: msg="A more recent configuration has been found, stopping the existing RPZ update thread" subsystem="rpz" level="0" prio="Info" tid="0" ts="1660294653.147" zone="C.rpz"
Aug 12 10:57:33 node22 pdns-recursor-2[15528]: msg="A more recent configuration has been found, stopping the existing RPZ update thread" subsystem="rpz" level="0" prio="Info" tid="0" ts="1660294653.148" zone="D.rpz"
Aug 12 10:57:59 node22 pdns-recursor-2[15528]: msg="A more recent configuration has been found, stopping the existing RPZ update thread" subsystem="rpz" level="0" prio="Info" tid="0" ts="1660294679.699" zone="B.rpz"
Aug 12 10:57:59 node22 pdns-recursor-2[15528]: msg="A more recent configuration has been found, stopping the existing RPZ update thread" subsystem="rpz" level="0" prio="Info" tid="0" ts="1660294679.699" zone="D.rpz"
Aug 12 10:57:59 node22 pdns-recursor-2[15528]: msg="A more recent configuration has been found, stopping the existing RPZ update thread" subsystem="rpz" level="0" prio="Info" tid="0" ts="1660294679.699" zone="C.rpz"
Aug 12 10:57:59 node22 pdns-recursor-2[15528]: msg="A more recent configuration has been found, stopping the existing RPZ update thread" subsystem="rpz" level="0" prio="Info" tid="0" ts="1660294679.699" zone="A.rpz"
Aug 12 10:58:16 node22 pdns-recursor-2[15528]: msg="A more recent configuration has been found, stopping the existing RPZ update thread" subsystem="rpz" level="0" prio="Info" tid="0" ts="1660294696.311" zone="B.rpz"
Aug 12 10:58:16 node22 pdns-recursor-2[15528]: msg="A more recent configuration has been found, stopping the existing RPZ update thread" subsystem="rpz" level="0" prio="Info" tid="0" ts="1660294696.312" zone="D.rpz"
Aug 12 10:58:16 node22 pdns-recursor-2[15528]: msg="A more recent configuration has been found, stopping the existing RPZ update thread" subsystem="rpz" level="0" prio="Info" tid="0" ts="1660294696.312" zone="C.rpz"
Aug 12 10:58:16 node22 pdns-recursor-2[15528]: msg="A more recent configuration has been found, stopping the existing RPZ update thread" subsystem="rpz" level="0" prio="Info" tid="0" ts="1660294696.312" zone="A.rpz"
Aug 12 10:58:32 node22 pdns-recursor-2[15528]: msg="A more recent configuration has been found, stopping the existing RPZ update thread" subsystem="rpz" level="0" prio="Info" tid="0" ts="1660294712.524" zone="E.rpz"
Aug 12 10:59:41 node22 pdns-recursor-2[15528]: msg="A more recent configuration has been found, stopping the existing RPZ update thread" subsystem="rpz" level="0" prio="Info" tid="0" ts="1660294781.139" zone="B.rpz"
Aug 12 10:59:41 node22 pdns-recursor-2[15528]: msg="A more recent configuration has been found, stopping the existing RPZ update thread" subsystem="rpz" level="0" prio="Info" tid="0" ts="1660294781.139" zone="C.rpz"
Aug 12 10:59:41 node22 pdns-recursor-2[15528]: msg="A more recent configuration has been found, stopping the existing RPZ update thread" subsystem="rpz" level="0" prio="Info" tid="0" ts="1660294781.139" zone="A.rpz"
Aug 12 10:59:41 node22 pdns-recursor-2[15528]: msg="A more recent configuration has been found, stopping the existing RPZ update thread" subsystem="rpz" level="0" prio="Info" tid="0" ts="1660294781.139" zone="D.rpz"
Aug 12 10:59:46 node22 pdns-recursor-2[15528]: msg="A more recent configuration has been found, stopping the existing RPZ update thread" subsystem="rpz" level="0" prio="Info" tid="0" ts="1660294786.211" zone="D.rpz"
Aug 12 10:59:46 node22 pdns-recursor-2[15528]: msg="A more recent configuration has been found, stopping the existing RPZ update thread" subsystem="rpz" level="0" prio="Info" tid="0" ts="1660294786.211" zone="B.rpz"
Aug 12 10:59:46 node22 pdns-recursor-2[15528]: msg="A more recent configuration has been found, stopping the existing RPZ update thread" subsystem="rpz" level="0" prio="Info" tid="0" ts="1660294786.211" zone="C.rpz"
Aug 12 10:59:46 node22 pdns-recursor-2[15528]: msg="A more recent configuration has been found, stopping the existing RPZ update thread" subsystem="rpz" level="0" prio="Info" tid="0" ts="1660294786.211" zone="A.rpz"
A more recent configuration has been found, stopping the existing RPZ update thread" subsystem="rpz" level="0" prio="Info" tid="0" ts="1660294653.147" zone="A.rpz"
Any idea what's causing this messages?
Does it keep logging these messages or does it stop at one point?
A few of these messages or normal and expected: an RPZ update might be in progress while the config is being updated. The reload will create a new RPZ loader thread, and when the old loader thread finishes its download, it sees it is no longer relevant, logs and stops.
I tested this locally and my previous message was not accurate: you get the messages even if no RPZ download was happening. You wil see one for each reload and for each RPZ zone. The message will appear after the refresh period has elapsed, so not immediately after the reload command.
So while it is a bit spammy, the log entries are harmless and an indication things are working as intended. I might change the log level from Info
to Debug
though, as an admin can easily be confused by these messages.
You are right, it's only the result of the reload-lua-config
.