Sporadic incorrect CNAME resolution from Pi-hole v6, despite correct upstream responses

Open phinor opened this issue 6 months ago • 4 comments
Versions

  • Pi-hole: 6.0.6
  • Web: 6.1
  • FTL: 6.1

Platform

  • OS and version: Ubuntu 22.04.5 LTS
  • Platform: VM

Expected behavior

CNAME values resolve to correct IP.

Actual behavior / bug

After upgrading to Pi-hole v6, I've observed sporadic DNS resolution issues where incorrect CNAMEs are returned. The same requests, when made directly to the configured upstream DNS provider, return correct results. My upstream is unbound on port 5335, but I have also experienced this with providers such as 8.8.8.8 and 1.1.1.1. This suggests the issue lies with Pi-hole itself.

The issue is intermittent: some requests succeed, others fail or return outdated/misleading CNAMEs. Flushing Pi-hole’s cache resolves the issue.

The domains in question are hosted on Cloudflare DNS. The incorrect resolution is always for the same domain.

e.g. adam.treverton.co.za -> treverton.school.adam.co.za -> school15.adam.co.za. The erroneous domain that is returned is always adam.co.za.

Steps to reproduce

Visit a website. The website may work. If the incorrect IP is returned, the browser reports an SSL cipher error. My estimation is that this is because the browser is now talking to a different server with a different cipher suite installed.

Using unbound directly:

pihole@pihole:~$ dig adam.treverton.co.za @127.0.0.1 -p 5335

; <<>> DiG 9.18.30-0ubuntu0.22.04.2-Ubuntu <<>> adam.treverton.co.za @127.0.0.1 -p 5335
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 58635
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;adam.treverton.co.za.          IN      A

;; ANSWER SECTION:
adam.treverton.co.za.   289     IN      CNAME   treverton.school.adam.co.za.
treverton.school.adam.co.za. 289 IN     CNAME   school15.adam.co.za.
school15.adam.co.za.    289     IN      A       160.119.248.143

;; Query time: 0 msec
;; SERVER: 127.0.0.1#5335(127.0.0.1) (UDP)
;; WHEN: Mon May 05 13:49:02 UTC 2025
;; MSG SIZE  rcvd: 124

vs. using Pi-hole:

pihole@pihole:~$ dig adam.treverton.co.za @172.30.0.99

; <<>> DiG 9.18.30-0ubuntu0.22.04.2-Ubuntu <<>> adam.treverton.co.za @172.30.0.99
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 17135
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;adam.treverton.co.za.          IN      A

;; ANSWER SECTION:
adam.treverton.co.za.   230     IN      CNAME   adam.co.za.
adam.co.za.             52      IN      A       104.21.92.50
adam.co.za.             52      IN      A       172.67.186.139

;; Query time: 168 msec
;; SERVER: 172.30.0.99#53(172.30.0.99) (UDP)
;; WHEN: Mon May 05 13:50:01 UTC 2025
;; MSG SIZE  rcvd: 105

After restarting Pi-hole's DNS resolver:

pihole@pihole:~$ dig adam.treverton.co.za @172.30.0.99

; <<>> DiG 9.18.30-0ubuntu0.22.04.2-Ubuntu <<>> adam.treverton.co.za @172.30.0.99
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 2699
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;adam.treverton.co.za.          IN      A

;; ANSWER SECTION:
adam.treverton.co.za.   105     IN      CNAME   treverton.school.adam.co.za.
treverton.school.adam.co.za. 105 IN     CNAME   school15.adam.co.za.
school15.adam.co.za.    105     IN      A       160.119.248.143

;; Query time: 20 msec
;; SERVER: 172.30.0.99#53(172.30.0.99) (UDP)
;; WHEN: Mon May 05 13:52:06 UTC 2025
;; MSG SIZE  rcvd: 124
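The outputs above differ in the answer section, and the TTLs change between runs. A minimal sketch for comparing the two resolvers without reading the full dig output by eye, assuming a POSIX shell with awk and sort; the function name `cname_chain` is hypothetical and not part of Pi-hole or dig:

```shell
#!/bin/sh
# Reduce `dig +noall +answer` output (read on stdin) to "name type value"
# lines, dropping TTLs and sorting so that answers compare equal regardless
# of record order or cache age.
cname_chain() {
    awk '$3 == "IN" && ($4 == "CNAME" || $4 == "A" || $4 == "AAAA") { print $1, $4, $5 }' \
        | LC_ALL=C sort
}

# Possible usage with the addresses from this report (not executed here):
#   upstream=$(dig +noall +answer adam.treverton.co.za @127.0.0.1 -p 5335 | cname_chain)
#   pihole=$(dig +noall +answer adam.treverton.co.za @172.30.0.99 | cname_chain)
#   [ "$upstream" = "$pihole" ] && echo "match" || echo "MISMATCH"
```

Running the comparison in a loop would make it easier to catch the moment the cached answer diverges from the upstream one.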

Debug Token

  • URL: https://tricorder.pi-hole.net/K5rMHOLR/

phinor avatar May 05 '25 13:05 phinor

Your debug log shows a list of errors/warnings which might or might not be related to your issue:

WARNING: you should run this program as super-user.

Did you run pihole -d with sudo powers?


2025-05-05 13:45:01.494 UTC [59219/F706] WARNING: Connection error (127.0.0.1#5335): TCP connection failed while receiving payload length from upstream (Connection prematurely closed by remote server)
2025-05-05 13:45:19.078 UTC [59229/F706] WARNING: Connection error (127.0.0.1#5335): TCP connection failed while receiving payload length from upstream (Connection prematurely closed by remote server)
2025-05-05 13:46:11.933 UTC [59235/F706] WARNING: Connection error (127.0.0.1#5335): TCP connection failed while receiving payload length from upstream (Connection prematurely closed by remote server)
2025-05-05 13:47:43.794 UTC [59254/F706] WARNING: Connection error (127.0.0.1#5335): TCP connection failed while receiving payload length from upstream (Connection prematurely closed by remote server)

It seems your unbound <> Pi-hole connection is not working reliably.


Please check your /var/log/pihole/pihole.log for the relevant section when the wrong CNAME is received. Does it indicate the answer is served from cache? Or is it forwarded to the upstream server?

yubiuser avatar May 05 '25 14:05 yubiuser

I have re-run it with sudo: https://tricorder.pi-hole.net/RkTXBfaQ/

Pi-hole and unbound are on the same machine (127.0.0.1). I can't think how to make the connection any more reliable.

The response to the dig request above made at 13:50:01 appears to be served from cache.

May  5 13:49:51 dnsmasq[67314]: query[A] adam.treverton.co.za from 172.30.0.195
May  5 13:49:51 dnsmasq[67314]: cached adam.treverton.co.za is <CNAME>
May  5 13:49:51 dnsmasq[67315]: query[HTTPS] adam.treverton.co.za from 172.30.0.195
May  5 13:49:51 dnsmasq[67315]: cached adam.treverton.co.za is <CNAME>
May  5 13:49:51 dnsmasq[67315]: forwarded adam.treverton.co.za to 127.0.0.1#5335
May  5 13:49:51 dnsmasq[67315]: reply adam.treverton.co.za is <CNAME>
May  5 13:50:01 dnsmasq[706]: query[A] adam.treverton.co.za from 172.30.0.99
May  5 13:50:01 dnsmasq[706]: cached adam.treverton.co.za is <CNAME>
May  5 13:51:48 dnsmasq[706]: query[A] adam.treverton.co.za from 172.30.0.99
May  5 13:51:48 dnsmasq[706]: cached adam.treverton.co.za is <CNAME>
May  5 13:51:50 dnsmasq[706]: query[A] adam.treverton.co.za from 172.30.0.99
May  5 13:51:50 dnsmasq[706]: cached adam.treverton.co.za is <CNAME>
May  5 13:51:51 dnsmasq[67368]: query[A] adam.treverton.co.za from 172.30.0.195
May  5 13:51:51 dnsmasq[67368]: cached adam.treverton.co.za is <CNAME>
May  5 13:51:51 dnsmasq[67369]: query[HTTPS] adam.treverton.co.za from 172.30.0.195
May  5 13:51:51 dnsmasq[67369]: cached adam.treverton.co.za is <CNAME>
May  5 13:51:51 dnsmasq[67369]: forwarded adam.treverton.co.za to 127.0.0.1#5335
May  5 13:51:51 dnsmasq[67369]: reply adam.treverton.co.za is <CNAME>
May  5 13:52:06 dnsmasq[706]: query[A] adam.treverton.co.za from 172.30.0.99
May  5 13:52:06 dnsmasq[706]: forwarded adam.treverton.co.za to 127.0.0.1#5335
May  5 13:52:06 dnsmasq[706]: reply adam.treverton.co.za is <CNAME>
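To answer the "cache or upstream?" question for a longer log, the excerpt above can be condensed to one line per event. A minimal sketch, assuming the dnsmasq line layout shown above (month, day, time, `dnsmasq[pid]:`, action, name); the function name `log_actions` is hypothetical:

```shell
#!/bin/sh
# Read pihole.log lines on stdin and print "<time> <pid> <action>" for every
# line that concerns the given domain. The action is the dnsmasq keyword:
# query[TYPE], cached, forwarded or reply.
log_actions() {
    awk -v d="$1" '
        $6 != d { next }              # field 6 is the queried name
        {
            pid = $4                  # e.g. "dnsmasq[706]:"
            gsub(/[^0-9]/, "", pid)   # keep only the digits of the PID
            print $3, pid, $5
        }'
}

# Possible usage (log path as mentioned earlier in this thread):
#   log_actions adam.treverton.co.za < /var/log/pihole/pihole.log
```

A `cached` line directly after a `query[A]` line with no `forwarded` in between indicates that the answer came from Pi-hole's cache.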

phinor avatar May 05 '25 15:05 phinor

Looking at the error log output, I note that the same error also occurs with other providers:

2025-05-05 05:25:35.303 UTC [27278/F706] WARNING: Connection error (149.112.112.112#53): TCP connection failed while receiving payload length from upstream (Connection prematurely closed by remote server)
2025-05-05 05:26:36.473 UTC [27403/F706] WARNING: Connection error (149.112.112.112#53): TCP connection failed while receiving payload length from upstream (Connection prematurely closed by remote server)
2025-05-05 05:43:48.267 UTC [29502/F706] WARNING: Connection error (149.112.112.112#53): TCP connection failed while receiving payload length from upstream (Connection prematurely closed by remote server)
2025-05-05 06:00:10.479 UTC [31060/F706] WARNING: Connection error (9.9.9.9#53): TCP connection failed while receiving payload length from upstream (Connection prematurely closed by remote server)
2025-05-05 07:09:48.511 UTC [34642/F706] WARNING: Connection error (9.9.9.9#53): TCP connection failed while receiving payload length from upstream (Connection prematurely closed by remote server)

phinor avatar May 05 '25 16:05 phinor

TCP connection failed while receiving payload length from upstream

Have a look here: https://discourse.pi-hole.net/t/connection-error-127-0-0-1-5335-tcp-connection-failed-while-receiving-payload-length-from-upstream-connection-prematurely-closed-by-remote-server/76148

The solution seems to be to increase unbound's incoming-num-tcp.
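For reference, incoming-num-tcp is set in the server: section of unbound's configuration. A minimal sketch follows; the file path is an assumption (the location commonly used when unbound is set up alongside Pi-hole) and the value is purely illustrative. Neither comes from this thread. Unbound's documented default is 10.

```
# /etc/unbound/unbound.conf.d/pi-hole.conf  (assumed path)
server:
    # Number of incoming TCP buffers to allocate per thread.
    # 64 is only an example value; unbound's default is 10.
    incoming-num-tcp: 64
```

unbound needs to be restarted for the change to take effect.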


The response to the dig request above made at 13:50:01 appears to be served from cache.

You are correct. Somehow the wrong information got into Pi-hole's cache. What happens if you flush the cache (or restart Pi-hole) and run the dig against Pi-hole twice in a row? Does the second query (served from cache) also reply with the correct answer?

yubiuser avatar May 05 '25 17:05 yubiuser

This issue is stale because it has been open 30 days with no activity. Please comment or update this issue or it will be closed in 5 days.

github-actions[bot] avatar Jun 05 '25 08:06 github-actions[bot]

This isn't related to unbound. Please see more information here where the issue is demonstrated with Google servers.

https://discourse.pi-hole.net/t/cnames-do-not-resolve-correctly/80379

It seems that if Pi-hole receives/reports a NODATA response, it in turn responds with an incorrect value from its cache. I don't know enough about DNS queries to understand whether NODATA is a valid response indicating that there is no data or simply a lack of response. I imagine the latter, since NXDOMAIN would probably be the more sensible response if the server did not have data, perhaps because of propagation or similar.
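For reference, RFC 2308 describes NODATA as a valid reply: the status is NOERROR, but the answer section holds no record of the requested type (it may still hold CNAMEs). NXDOMAIN, by contrast, means the name itself does not exist. A minimal sketch that tells the two apart from full dig output, assuming the dig layout shown earlier in this thread; the function name `classify_reply` is hypothetical:

```shell
#!/bin/sh
# Classify full `dig` output read on stdin as NXDOMAIN, NODATA or ANSWER.
# NODATA here means: status NOERROR and no answer record whose type equals
# the type asked for in the question section.
classify_reply() {
    awk '
        /->>HEADER<<-/ {
            status = $0
            sub(/.*status: /, "", status)   # drop everything before the status
            sub(/,.*/, "", status)          # drop everything after it
        }
        /^;[^; ]/ && $2 == "IN" { qtype = $3 }         # question: ";name. IN TYPE"
        /^;; ANSWER SECTION:/   { in_answer = 1; next }
        /^$/                    { in_answer = 0 }      # blank line ends a section
        in_answer && $3 == "IN" && $4 == qtype { matches++ }
        END {
            if (status == "NXDOMAIN") print "NXDOMAIN"
            else if (status == "NOERROR" && matches == 0) print "NODATA"
            else if (status == "NOERROR") print "ANSWER"
            else print status
        }'
}

# Possible usage (not executed here):
#   dig treverton.school.adam.co.za @172.30.0.99 | classify_reply
```

Checking each name in the CNAME chain this way may show which intermediate name produces the NODATA reply.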

phinor avatar Jun 05 '25 08:06 phinor

The issue seems to be that once NODATA is received for an intermediate CNAME, things go wrong. Could be related to https://github.com/pi-hole/FTL/pull/1425

yubiuser avatar Jun 05 '25 12:06 yubiuser