https-dns-proxy: dmesg page fault
Maintainer: @stangri
| Name | Value |
|---|---|
| Model | TP-Link Archer C7 v2 |
| Architecture | Qualcomm Atheros QCA9558 ver 1 rev 0 |
| Target Platform | ath79/generic |
| Firmware Version | OpenWrt 22.03.0 r19685-512e76967f / LuCI openwrt-22.03 branch git-22.245.77528-487e58a |
| Kernel Version | 5.10.138 |
Description:
I added https-dns-proxy to the imagebuilder image here. At boot, the kernel log shows:
[ 78.258834] do_page_fault(): sending SIGSEGV to https-dns-proxy for invalid read access from 63222510
[ 78.268317] epc = 77e72df4 in libc.so[77df4000+a9000]
[ 78.273488] ra = 77e74b04 in libc.so[77df4000+a9000]
...
[ 89.156259] do_page_fault(): sending SIGSEGV to https-dns-proxy for invalid read access from 6322251a
[ 89.165861] epc = 77dc2dc0 in libc.so[77d44000+a9000]
[ 89.171129] ra = 77dc4b04 in libc.so[77d44000+a9000]
...
[ 94.439188] do_page_fault(): sending SIGSEGV to https-dns-proxy for invalid read access from 63222520
[ 94.448917] epc = 77e2cdf4 in libc.so[77dae000+a9000]
[ 94.454103] ra = 77e2eb04 in libc.so[77dae000+a9000]
[ 99.763575] do_page_fault(): sending SIGSEGV to https-dns-proxy for invalid read access from 63222525
[ 99.773390] epc = 77e66dc0 in libc.so[77de8000+a9000]
[ 99.778544] ra = 77e68b04 in libc.so[77de8000+a9000]
[ 104.985969] do_page_fault(): sending SIGSEGV to https-dns-proxy for invalid read access from 6322284c
[ 104.995583] epc = 77e4adf4 in libc.so[77dcc000+a9000]
[ 105.000758] ra = 77e4cb04 in libc.so[77dcc000+a9000]
[ 129.174846] do_page_fault(): sending SIGSEGV to https-dns-proxy for invalid read access from 63222865
[ 129.184505] epc = 77e78dc0 in libc.so[77dfa000+a9000]
[ 129.189687] ra = 77e7ab04 in libc.so[77dfa000+a9000]
[ 134.335382] do_page_fault(): sending SIGSEGV to https-dns-proxy for invalid read access from 6322286a
[ 134.344791] epc = 77e4adc0 in libc.so[77dcc000+a9000]
[ 134.350131] ra = 77e4cb04 in libc.so[77dcc000+a9000]
[ 139.404103] do_page_fault(): sending SIGSEGV to https-dns-proxy for invalid read access from 6322286f
[ 139.413861] epc = 77e28dc0 in libc.so[77daa000+a9000]
[ 139.419098] ra = 77e2ab04 in libc.so[77daa000+a9000]
[ 144.494286] do_page_fault(): sending SIGSEGV to https-dns-proxy for invalid read access from 63222874
[ 144.503690] epc = 77de2df4 in libc.so[77d64000+a9000]
[ 144.508995] ra = 77de4b04 in libc.so[77d64000+a9000]
[ 149.580321] do_page_fault(): sending SIGSEGV to https-dns-proxy for invalid read access from 63222879
[ 149.589720] epc = 77eb2dc0 in libc.so[77e34000+a9000]
[ 149.595016] ra = 77eb4b04 in libc.so[77e34000+a9000]
[ 154.672733] do_page_fault(): sending SIGSEGV to https-dns-proxy for invalid read access from 6322287e
[ 154.682129] epc = 77eb4dc0 in libc.so[77e36000+a9000]
[ 154.687367] ra = 77eb6b04 in libc.so[77e36000+a9000]
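Because the libc.so mapping base differs on every respawn, the raw epc and ra values are hard to compare across crashes. Below is a small Python sketch, using three of the records above, that subtracts the mapping base to get library-relative offsets. The helper name `libc_offsets` is made up for this illustration.

```python
import re

# Three of the crash records from the kernel log above.
DMESG = """\
[ 78.268317] epc = 77e72df4 in libc.so[77df4000+a9000]
[ 78.273488] ra = 77e74b04 in libc.so[77df4000+a9000]
[ 89.165861] epc = 77dc2dc0 in libc.so[77d44000+a9000]
[ 89.171129] ra = 77dc4b04 in libc.so[77d44000+a9000]
[ 94.448917] epc = 77e2cdf4 in libc.so[77dae000+a9000]
[ 94.454103] ra = 77e2eb04 in libc.so[77dae000+a9000]
"""

# register name, absolute address, mapping base (all hex)
LINE = re.compile(r"(epc|ra) = ([0-9a-f]+) in libc\.so\[([0-9a-f]+)\+[0-9a-f]+\]")

def libc_offsets(log):
    """Return {register: set of offsets relative to the libc.so mapping base}."""
    offsets = {"epc": set(), "ra": set()}
    for reg, addr, base in LINE.findall(log):
        offsets[reg].add(int(addr, 16) - int(base, 16))
    return offsets

offsets = libc_offsets(DMESG)
for reg, values in offsets.items():
    print(reg, sorted(hex(v) for v in values))
```

For these records the return address always lands on the same libc offset (0x80b04) and the faulting instruction on one of two offsets (0x7edc0 or 0x7edf4), which suggests the same libc code path each time. With a libc.so that carries symbols for this exact build, those offsets could be mapped to function names.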
Can you please post the output of opkg list-installed.
Here, sorry about that.
---trim---
> Here, sorry about that.
I've had a wild theory that maybe the image was somehow built with some dependencies missing, but everything seems to be present.
I don't have the technical expertise to troubleshoot it further. I'd like to escalate this upstream to seek guidance; it could be beneficial to be on the latest version of libcurl for that.
I've created a PR for the 7.85.0 version of curl in master. I'll cherry-pick it to the release branches within a few days unless issues are reported for the updated version in the master branch.
The segmentation fault (22.03) only occurs when logging is enabled and verbosity > 0.
N.B. - setting logging / verbosity for 21.02 does not page fault
> The segmentation fault (22.03) only occurs when logging is enabled and verbosity > 0.
> N.B. - setting logging / verbosity for 21.02 does not page fault
Thanks for adding this. Just so I have the complete information for upstream, can you please update curl/libcurl to 7.85.0 and include the output of curl --version?
opkg list-installed curl
libcurl4 - 7.83.1-3
opkg install libcurl
Upgrading libcurl4 on root from 7.83.1-3 to 7.85.0-1...
vi /etc/config/https-dns-proxy
service https-dns-proxy restart
Tue Sep 27 18:00:31 2022 user.notice https-dns-proxy: Starting service ✓✓✓✓
Tue Sep 27 18:00:31 2022 daemon.info dnsmasq[1]: read /tmp/hosts/dhcp.cfg01411c - 0 addresses
Tue Sep 27 18:00:31 2022 kern.info kernel: [52874.728769] do_page_fault(): sending SIGSEGV to https-dns-proxy for invalid read access from 63331e1f
Tue Sep 27 18:00:31 2022 kern.info kernel: [52874.738717] epc = 77e0cdc0 in libc.so[77d8e000+a9000]
Tue Sep 27 18:00:31 2022 kern.info kernel: [52874.743959] ra = 77e0eb04 in libc.so[77d8e000+a9000]
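One detail that may be worth passing upstream: the faulting address in this entry, read as a 32-bit integer, is a plausible Unix timestamp. Here is a minimal Python sketch of the check. The helper name `as_utc` is made up, and the idea that the router's local time is UTC+2 is an assumption that is not stated anywhere in this thread.

```python
from datetime import datetime, timezone

def as_utc(fault_addr_hex):
    """Interpret a faulting address from dmesg as seconds since the Unix epoch."""
    return datetime.fromtimestamp(int(fault_addr_hex, 16), tz=timezone.utc)

# Address from the syslog entry stamped "Tue Sep 27 18:00:31 2022".
crash = as_utc("63331e1f")
print(crash.isoformat())
```

This decodes to 2022-09-27 16:00:31 UTC, which is the same second as the syslog entry if the router's clock is at UTC+2. If that is not a coincidence, the process may be treating a time value as a pointer, for example through a mismatched format specifier in a log call. That would also fit the crash only appearing with logging enabled. This is a guess, not a diagnosis.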
I can't reproduce this issue with setting verbosity to 2.
Verbosity setting altered from '1' to '2' with no change to the observed result (SIGSEGV).
Kindly post your /etc/config/https-dns-proxy configuration.
As seen in the syslog snippet, 4X DOH servers are configured.
FYI - the device is TP-Link Archer C6 v2
So the trail ends here?
Well, it doesn't seem critical, since it works normally after bootup, when the system settles down. Maybe before config is loaded, the internal default settings trigger the crash?
I need a config from anyone experiencing this issue to try to reproduce this.
output of cat /etc/config/https-dns-proxy
config main 'config'
option update_dnsmasq_config '*'
option force_dns '1'
list force_dns_port '53'
list force_dns_port '853'
config https-dns-proxy
option listen_addr '127.0.0.1'
option user 'nobody'
option group 'nogroup'
option bootstrap_dns '74.82.42.42,2001:470:20::2'
option resolver_url 'https://ordns.he.net/dns-query'
option listen_port '5054'
config https-dns-proxy
option bootstrap_dns '116.202.176.26,1.1.1.1'
option resolver_url 'https://doh.libredns.gr/dns-query'
option listen_port '5055'
config https-dns-proxy
option resolver_url 'https://cloudflare-dns.com/dns-query'
option bootstrap_dns '1.1.1.1,1.0.0.1,2606:4700:4700::1111,2606:4700:4700::1001'
Uncomment both 'logfile' and 'verbosity' options of any one of the four sections -
config main 'config'
option update_dnsmasq_config '*'
option force_dns '1'
list force_dns_port '53'
list force_dns_port '853'
# ports listed below are used by some
# of the dnscrypt-proxy v1 resolvers
# list force_dns_port '553'
# list force_dns_port '1443'
# list force_dns_port '4343'
# list force_dns_port '4434'
# list force_dns_port '5443'
# list force_dns_port '8443'
config https-dns-proxy
option bootstrap_dns '2a01:3a0:53:53::'
option resolver_url 'https://unicast.censurfridns.dk/dns-query'
option use_ipv6_resolvers_only 1
# option logfile '/var/log/DoH_unicast.censurfridns.dk.log'
# option verbosity 1
config https-dns-proxy
option bootstrap_dns '2a07:a8c0::84:5e57,2a07:a8c1::84:5e57'
option resolver_url 'https://dns.nextdns.io/845e57'
option use_ipv6_resolvers_only 1
# option logfile '/var/log/DoH_dns.nextdns.io.log'
# option verbosity 1
config https-dns-proxy
option bootstrap_dns '2001:67c:28a4::'
option resolver_url 'https://anycast.uncensoreddns.org/dns-query'
option use_ipv6_resolvers_only 1
# option logfile '/var/log/DoH_anycast.censurfridns.dk.log'
# option verbosity 1
config https-dns-proxy
option bootstrap_dns '2a07:a8c0::84:5e57,2a07:a8c1::84:5e57'
option resolver_url 'https://secure.avastdns.com/dns-query'
option use_ipv6_resolvers_only 1
# option logfile '/var/log/DoH_secure.avastdns.com.log'
# option verbosity 1
@systemcrash
> output of cat /etc/config/https-dns-proxy
> config main 'config'
> option update_dnsmasq_config '*'
> option force_dns '1'
> list force_dns_port '53'
> list force_dns_port '853'
> config https-dns-proxy
> option listen_addr '127.0.0.1'
> option user 'nobody'
> option group 'nogroup'
> option bootstrap_dns '74.82.42.42,2001:470:20::2'
> option resolver_url 'https://ordns.he.net/dns-query'
> option listen_port '5054'
> config https-dns-proxy
> option bootstrap_dns '116.202.176.26,1.1.1.1'
> option resolver_url 'https://doh.libredns.gr/dns-query'
> option listen_port '5055'
> config https-dns-proxy
> option resolver_url 'https://cloudflare-dns.com/dns-query'
> option bootstrap_dns '1.1.1.1,1.0.0.1,2606:4700:4700::1111,2606:4700:4700::1001'
Review the arguments of the spawned processes. Without the 'use_ipv6_resolvers_only' option set, only the IPv4 addresses are passed; with the option set, only the IPv6 addresses are passed.
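To make that behaviour concrete, here is a rough Python sketch of this kind of address-family filtering. It is not the package's init script; `select_bootstrap` is a hypothetical helper written for illustration only.

```python
import ipaddress

def select_bootstrap(bootstrap_dns, ipv6_only=False):
    """Keep only the bootstrap addresses of one family, preserving order."""
    wanted = 6 if ipv6_only else 4
    addrs = [a.strip() for a in bootstrap_dns.split(",") if a.strip()]
    return ",".join(a for a in addrs if ipaddress.ip_address(a).version == wanted)

# bootstrap_dns value from the first config posted above.
mixed = "74.82.42.42,2001:470:20::2"
print(select_bootstrap(mixed))                  # IPv4 entries only
print(select_bootstrap(mixed, ipv6_only=True))  # IPv6 entries only
```

With a mixed list, one family is always dropped, so a section that lists both IPv4 and IPv6 bootstrap servers never hands both to the daemon.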
> option logfile '/var/log/DoH_unicast.censurfridns.dk.log'
I believe /var/log is owned by root and other users don't have write permission there, so nobody/nogroup can't write to it. Could that be causing the crash?
No. Each process is spawned as 'root' and switches its credentials during initialisation, as dnsmasq does.
root@pWPW10-2:~# ls -l /var/log/
-rw-r----- 1 root root 251521 Oct 5 10:33 DoH_anycast.censurfridns.dk_IPV6.log
-rw-r----- 1 root root 36170 Oct 5 10:33 DoH_dns.digitale-gesellschaft.ch_IPV6.log
-rw-r----- 1 root root 49522 Oct 5 10:33 DoH_dns.nextdns.io_IPV6.log
-rw-r----- 1 root root 251378 Oct 5 10:33 DoH_secure.avastdns.com_IPV6.log
-rw-r----- 1 root root 249875 Oct 5 10:33 DoH_unicast.censurfridns.dk_IPV6.log
4614 nobody 4180 S /usr/sbin/https-dns-proxy -r https://unicast.censurfridns.dk/dns-query -a 127.0.0.1 -p 5053 -b 2a01:3a0:53
4615 nobody 5352 S /usr/sbin/https-dns-proxy -r https://dns.nextdns.io/845e57 -a 127.0.0.1 -p 5054 -b 2a07:a8c0::84:5e57,2a07
4616 nobody 4180 S /usr/sbin/https-dns-proxy -r https://anycast.uncensoreddns.org/dns-query -a 127.0.0.1 -p 5055 -b 2001:67c:
4617 nobody 4676 S /usr/sbin/https-dns-proxy -r https://dns.digitale-gesellschaft.ch/dns-query -a 127.0.0.1 -p 5056 -b 2606:4
4618 nobody 4180 S /usr/sbin/https-dns-proxy -r https://secure.avastdns.com/dns-query -a 127.0.0.1 -p 5057 -b 2001:67c:28a4::
Under 21.02, this config with logging does not SIGSEGV.
# ps | grep -i https-dns
5059 nobody 3304 S /usr/sbin/https-dns-proxy -r https://ordns.he.net/dns-query -a 127.0.0.1 -p 5054 -b 74.82.42.42 -4 -u nobody -g nogroup
5060 nobody 3336 S /usr/sbin/https-dns-proxy -r https://doh.libredns.gr/dns-query -a 127.0.0.1 -p 5055 -b 116.202.176.26,1.1.1.1 -4 -u nobody -g nogroup
> Review the arguments of the spawned processes. Without the 'use_ipv6_resolvers_only' option set, only the IPv4 addresses are passed; with the option set, only the IPv6 addresses are passed.

I see. This config was generated with the LuCI GUI. I would have expected both v4 and v6 to be used from the config without specific intervention. @stangri, can there be a middle-ground option for both v4/v6? (I don't wish to derail this bug report, though.)
> No. Each process is spawned as 'root' and switches its credentials during initialisation, as dnsmasq does.
>
> root@pWPW10-2:~# ls -l /var/log/
> -rw-r----- 1 root root 251521 Oct 5 10:33 DoH_anycast.censurfridns.dk_IPV6.log
> -rw-r----- 1 root root 36170 Oct 5 10:33 DoH_dns.digitale-gesellschaft.ch_IPV6.log
> -rw-r----- 1 root root 49522 Oct 5 10:33 DoH_dns.nextdns.io_IPV6.log
> -rw-r----- 1 root root 251378 Oct 5 10:33 DoH_secure.avastdns.com_IPV6.log
> -rw-r----- 1 root root 249875 Oct 5 10:33 DoH_unicast.censurfridns.dk_IPV6.log
> 4614 nobody 4180 S /usr/sbin/https-dns-proxy -r https://unicast.censurfridns.dk/dns-query -a 127.0.0.1 -p 5053 -b 2a01:3a0:53
> 4615 nobody 5352 S /usr/sbin/https-dns-proxy -r https://dns.nextdns.io/845e57 -a 127.0.0.1 -p 5054 -b 2a07:a8c0::84:5e57,2a07
> 4616 nobody 4180 S /usr/sbin/https-dns-proxy -r https://anycast.uncensoreddns.org/dns-query -a 127.0.0.1 -p 5055 -b 2001:67c:
> 4617 nobody 4676 S /usr/sbin/https-dns-proxy -r https://dns.digitale-gesellschaft.ch/dns-query -a 127.0.0.1 -p 5056 -b 2606:4
> 4618 nobody 4180 S /usr/sbin/https-dns-proxy -r https://secure.avastdns.com/dns-query -a 127.0.0.1 -p 5057 -b 2001:67c:28a4::
>
> Under 21.02, this config with logging does not SIGSEGV.

Sorry, I'm confused. I thought you said the SIGSEGV happens at boot, yet there are non-zero log files. Could you please clarify under which conditions it crashes and under which it doesn't crash and fills up the logs?
> I see. This config was generated with the LuCI GUI. I would have expected both v4 and v6 to be used from the config without specific intervention. @stangri, can there be a middle-ground option for both v4/v6? (I don't wish to derail this bug report, though.)
Yes, please open a separate issue.
@stangri
> Sorry, I'm confused. I thought you said the SIGSEGV happens at boot, yet there are non-zero log files. Could you please clarify under which conditions it crashes and under which it doesn't crash and fills up the logs?

Those are log files from 21.02, posted to illustrate that 'nobody' writes to them successfully.
I see. I'll need to test this myself, I'll update this issue on the progress.
Well, I don't think it's dangerous, although fixing it is necessary. After the system is up and stable, there are no more of those entries.
Maybe something isn't ready during startup.
I'll test when I can, if it comes to that.
I can't reproduce the https-dns-proxy crash with version 2022-08-12 and the following config:
config https-dns-proxy
option bootstrap_dns '149.112.121.10,149.112.122.10,2620:10A:80BB::10,2620:10A:80BC::10'
option resolver_url 'https://private.canadianshield.cira.ca/dns-query'
option verbosity '2'
option logfile '/var/log/DoH_canadianshield.log'
@Ohfalderal can you re-test with the new version and shortened file names for the logfile? If we can narrow down the source of the problem before reporting upstream, that'd be helpful.
@stangri I notice the update is available in the mirror repository that is configured in the AP and will pull and test it this coming weekend.
@stangri Uhh, no -
root@pWPW10-0:~# vi /etc/config/https-dns-proxy
root@pWPW10-0:~# service https-dns-proxy restart
Stopping https-dns-proxy 2022-08-12-1 ✓
Starting https-dns-proxy 2022-08-12-1 ✓✓✓
root@pWPW10-0:~# logread -l20
Sat Oct 15 09:32:55 2022 daemon.info dnsmasq[1]: using only locally-known addresses for local
Sat Oct 15 09:32:55 2022 daemon.info dnsmasq[1]: using only locally-known addresses for invalid
Sat Oct 15 09:32:55 2022 daemon.info dnsmasq[1]: using only locally-known addresses for bind
Sat Oct 15 09:32:55 2022 daemon.info dnsmasq[1]: using only locally-known addresses for mask-h2.icloud.com
Sat Oct 15 09:32:55 2022 daemon.info dnsmasq[1]: using only locally-known addresses for mask.icloud.com
Sat Oct 15 09:32:55 2022 daemon.info dnsmasq[1]: using 1 more local addresses
Sat Oct 15 09:32:55 2022 daemon.info dnsmasq[1]: read /etc/hosts - 4 addresses
Sat Oct 15 09:32:55 2022 daemon.info dnsmasq[1]: read /tmp/hosts/dhcp.cfg01411c - 0 addresses
Sat Oct 15 09:32:55 2022 kern.info kernel: [22419.664161] do_page_fault(): sending SIGSEGV to https-dns-proxy for invalid read access from 634a6227
Sat Oct 15 09:32:55 2022 kern.info kernel: [22419.673903] epc = 77eb0dc0 in libc.so[77e32000+a9000]
Sat Oct 15 09:32:55 2022 kern.info kernel: [22419.679387] ra = 77eb2b04 in libc.so[77e32000+a9000]
Sat Oct 15 09:33:00 2022 kern.info kernel: [22424.768945] do_page_fault(): sending SIGSEGV to https-dns-proxy for invalid read access from 634a622c
Sat Oct 15 09:33:00 2022 kern.info kernel: [22424.778701] epc = 77e7adf4 in libc.so[77dfc000+a9000]
Sat Oct 15 09:33:00 2022 kern.info kernel: [22424.784060] ra = 77e7cb04 in libc.so[77dfc000+a9000]
Sat Oct 15 09:33:05 2022 kern.info kernel: [22429.810433] do_page_fault(): sending SIGSEGV to https-dns-proxy for invalid read access from 634a6231
Sat Oct 15 09:33:05 2022 kern.info kernel: [22429.820151] epc = 77e16dc0 in libc.so[77d98000+a9000]
Sat Oct 15 09:33:05 2022 kern.info kernel: [22429.825650] ra = 77e18b04 in libc.so[77d98000+a9000]
Sat Oct 15 09:33:11 2022 kern.info kernel: [22434.848670] do_page_fault(): sending SIGSEGV to https-dns-proxy for invalid read access from 634a6236
Sat Oct 15 09:33:11 2022 kern.info kernel: [22434.858382] epc = 77eb6dc0 in libc.so[77e38000+a9000]
Sat Oct 15 09:33:11 2022 kern.info kernel: [22434.863713] ra = 77eb8b04 in libc.so[77e38000+a9000]
root@pWPW10-0:~# vi /etc/config/https-dns-proxy
root@pWPW10-0:~# service https-dns-proxy restart
Stopping https-dns-proxy 2022-08-12-1 ✓
Starting https-dns-proxy 2022-08-12-1 ✓✓✓
root@pWPW10-0:~# logread -l20
Sat Oct 15 09:34:13 2022 user.notice https-dns-proxy: Starting service ✓✓✓
Sat Oct 15 09:34:13 2022 daemon.info dnsmasq[1]: started, version 2.86 cachesize 150
Sat Oct 15 09:34:13 2022 daemon.info dnsmasq[1]: DNS service limited to local subnets
Sat Oct 15 09:34:13 2022 daemon.info dnsmasq[1]: compile time options: IPv6 GNU-getopt no-DBus UBus no-i18n no-IDN DHCP no-DHCPv6 no-Lua TFTP no-conntrack no-ipset no-auth no-cryptohash no-DNSSEC no-ID loop-detect inotify dumpfile
Sat Oct 15 09:34:13 2022 daemon.info dnsmasq[1]: UBus support enabled: connected to system bus
Sat Oct 15 09:34:13 2022 daemon.info dnsmasq[1]: using nameserver 127.0.0.1#5055
Sat Oct 15 09:34:13 2022 daemon.info dnsmasq[1]: using nameserver 127.0.0.1#5053
Sat Oct 15 09:34:13 2022 daemon.info dnsmasq[1]: using nameserver 127.0.0.1#5054
Sat Oct 15 09:34:13 2022 daemon.info dnsmasq[1]: using only locally-known addresses for test
Sat Oct 15 09:34:13 2022 daemon.info dnsmasq[1]: using only locally-known addresses for onion
Sat Oct 15 09:34:13 2022 daemon.info dnsmasq[1]: using only locally-known addresses for localhost
Sat Oct 15 09:34:13 2022 daemon.info dnsmasq[1]: using only locally-known addresses for local
Sat Oct 15 09:34:13 2022 daemon.info dnsmasq[1]: using only locally-known addresses for invalid
Sat Oct 15 09:34:13 2022 daemon.info dnsmasq[1]: using only locally-known addresses for bind
Sat Oct 15 09:34:13 2022 daemon.info dnsmasq[1]: using only locally-known addresses for mask-h2.icloud.com
Sat Oct 15 09:34:13 2022 daemon.info dnsmasq[1]: using only locally-known addresses for mask.icloud.com
Sat Oct 15 09:34:13 2022 daemon.info dnsmasq[1]: using 1 more local addresses
Sat Oct 15 09:34:13 2022 daemon.info dnsmasq[1]: read /etc/hosts - 4 addresses
Sat Oct 15 09:34:13 2022 daemon.info dnsmasq[1]: read /tmp/hosts/dhcp.cfg01411c - 0 addresses
Sat Oct 15 09:34:27 2022 user.info : luci: accepted login on / for root from 192.168.179.198
root@pWPW10-0:~#
N.B. - the two 'vi' edits are:
- first: shorten the logfile name, raise verbosity to '2', and uncomment both options
- second: comment out logfile and verbosity again
Problem remains in 22.03.2
| Name | Value |
|---|---|
| Model | TP-Link Archer C7 v2 |
| Architecture | Qualcomm Atheros QCA9558 ver 1 rev 0 |
| Target Platform | ath79/generic |
| Firmware Version | OpenWrt 22.03.2 r19803-9a599fee93 / LuCI openwrt-22.03 branch git-22.288.45147-96ec0cd |
| Kernel Version | 5.10.146 |
[ 86.418247] do_page_fault(): sending SIGSEGV to https-dns-proxy for invalid read access from 634ebf2b
[ 86.427647] epc = 77de2dc0 in libc.so[77d64000+a9000]
[ 86.432809] ra = 77de4b04 in libc.so[77d64000+a9000]
...
[ 88.751136] br-lan: port 3(wlan0) entered blocking state
[ 88.756655] br-lan: port 3(wlan0) entered disabled state
[ 88.762531] device wlan0 entered promiscuous mode
[ 89.926855] IPv6: ADDRCONF(NETDEV_CHANGE): wlan1: link becomes ready
[ 89.933560] br-lan: port 2(wlan1) entered blocking state
[ 89.938957] br-lan: port 2(wlan1) entered forwarding state
[ 91.949680] do_page_fault(): sending SIGSEGV to https-dns-proxy for invalid read access from 634ebf31
[ 91.959038] epc = 77e4cdc0 in libc.so[77dce000+a9000]
[ 91.964227] ra = 77e4eb04 in libc.so[77dce000+a9000]
[ 97.198637] do_page_fault(): sending SIGSEGV to https-dns-proxy for invalid read access from 634ebf36
[ 97.208262] epc = 77eb4dc0 in libc.so[77e36000+a9000]
[ 97.213451] ra = 77eb6b04 in libc.so[77e36000+a9000]
[ 102.401394] do_page_fault(): sending SIGSEGV to https-dns-proxy for invalid read access from 634ebfa4
[ 102.410796] epc = 77dcedf4 in libc.so[77d50000+a9000]
[ 102.415934] ra = 77dd0b04 in libc.so[77d50000+a9000]
[ 107.647482] do_page_fault(): sending SIGSEGV to https-dns-proxy for invalid read access from 634ebfa9
[ 107.656919] epc = 77dc6dc0 in libc.so[77d48000+a9000]
[ 107.662082] ra = 77dc8b04 in libc.so[77d48000+a9000]
[ 112.748180] do_page_fault(): sending SIGSEGV to https-dns-proxy for invalid read access from 634ebfae
[ 112.757920] epc = 77e0edc0 in libc.so[77d90000+a9000]
[ 112.763103] ra = 77e10b04 in libc.so[77d90000+a9000]
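The spacing of these entries is fairly regular. Below is a small Python sketch that pulls the kernel timestamps of the SIGSEGV lines above and computes the gap between consecutive crashes. The helper name `crash_gaps` is made up for this illustration.

```python
import re

# The SIGSEGV lines from the 22.03.2 kernel log above.
LOG = """\
[ 86.418247] do_page_fault(): sending SIGSEGV to https-dns-proxy for invalid read access from 634ebf2b
[ 91.949680] do_page_fault(): sending SIGSEGV to https-dns-proxy for invalid read access from 634ebf31
[ 97.198637] do_page_fault(): sending SIGSEGV to https-dns-proxy for invalid read access from 634ebf36
[ 102.401394] do_page_fault(): sending SIGSEGV to https-dns-proxy for invalid read access from 634ebfa4
[ 107.647482] do_page_fault(): sending SIGSEGV to https-dns-proxy for invalid read access from 634ebfa9
[ 112.748180] do_page_fault(): sending SIGSEGV to https-dns-proxy for invalid read access from 634ebfae
"""

# Kernel timestamp (seconds since boot) at the start of each do_page_fault line.
STAMP = re.compile(r"^\[\s*(\d+\.\d+)\] do_page_fault\(\)", re.MULTILINE)

def crash_gaps(log):
    """Return the time in seconds between consecutive SIGSEGV entries."""
    times = [float(t) for t in STAMP.findall(log)]
    return [round(later - earlier, 3) for earlier, later in zip(times, times[1:])]

gaps = crash_gaps(LOG)
print(gaps)
```

Each gap is a little over five seconds. That would be consistent with the service manager respawning the daemon after a fixed delay and the new instance crashing again almost immediately. Whether a five-second respawn delay is actually configured for this service is an assumption I have not checked.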
Here is the build log for the firmware:
---trim---
Sorry for neglecting this. Can you still reproduce with 2022-10-15-10?
No worries. You've a lot on your plate.
It still crashes, and now not only at boot but also later during uptime.
It may be a mips_24kc-exclusive problem; I can't (for the life of me) reproduce it on x86_64. I don't even know who I can ask to have a look at it, as I don't think the upstream devs use the same platform, and I don't know enough about libc to troubleshoot.