talos
talos copied to clipboard
1.7.1 with hostdns and forwardKubeDNSToHost doesn't resolve anything
Bug Report
Description
This is on a cluster that has been upgraded (1.6.x->1.7.x), not fresh
after applying the following patch:
machine:
features:
hostDNS:
enabled: true
forwardKubeDNSToHost: true
nothing seems to resolve dns, either in the cluster or externally
getupstream:
talosctl -n 192.168.5.10,192.168.5.11,192.168.5.12,192.168.5.15 get dnsupstream
NODE NAMESPACE TYPE ID VERSION HEALTHY ADDRESS
192.168.5.10 network DNSUpstream 192.168.5.1 1 true 192.168.5.1:53
192.168.5.11 network DNSUpstream 192.168.5.1 1 true 192.168.5.1:53
192.168.5.12 network DNSUpstream 192.168.5.1 1 true 192.168.5.1:53
192.168.5.15 network DNSUpstream 192.168.5.1 1 true 192.168.5.1:53
resolv.conf
talosctl -n 192.168.5.10 read /system/resolved/resolv.conf
nameserver 10.96.0.9
talosctl -n 192.168.5.10 read /etc/resolv.conf
nameserver 127.0.0.53
resolvers
talosctl -n 192.168.5.10 get resolvers
NODE NAMESPACE TYPE ID VERSION RESOLVERS
192.168.5.10 network ResolverStatus resolvers 2 ["192.168.5.1"]
CoreDNS was restarted twice after applying the patch.
Logs
[ERROR] plugin/errors: 2 radarr.media.svc. AAAA: read udp 10.244.2.33:48571->10.96.0.9:53: i/o timeout
[INFO] 10.244.0.142:37010 - 44799 "AAAA IN radarr.media.svc. udp 34 false 512" - - 0 2.001171487s
[ERROR] plugin/errors: 2 radarr.media.svc. AAAA: read udp 10.244.0.228:41133->10.96.0.9:53: i/o timeout
[INFO] 10.244.0.142:37010 - 44353 "A IN radarr.media.svc. udp 34 false 512" - - 0 2.001187098s
[ERROR] plugin/errors: 2 radarr.media.svc. A: read udp 10.244.0.228:49164->10.96.0.9:53: i/o timeout
[INFO] 10.244.0.30:55153 - 65462 "AAAA IN sonarr.media.svc. udp 34 false 512" - - 0 2.001133409s
[INFO] 10.244.0.30:55153 - 65275 "A IN sonarr.media.svc. udp 34 false 512" - - 0 2.001014136s
[ERROR] plugin/errors: 2 sonarr.media.svc. A: read udp 10.244.2.33:38186->10.96.0.9:53: i/o timeout
[ERROR] plugin/errors: 2 sonarr.media.svc. AAAA: read udp 10.244.2.33:57661->10.96.0.9:53: i/o timeout
[INFO] 10.244.3.16:57244 - 50161 "AAAA IN api.allegion.yonomi.cloud. udp 43 false 512" - - 0 2.001230715s
[ERROR] plugin/errors: 2 api.allegion.yonomi.cloud. AAAA: read udp 10.244.0.228:33430->10.96.0.9:53: i/o timeout
[INFO] 10.244.3.16:57244 - 49553 "A IN api.allegion.yonomi.cloud. udp 43 false 512" - - 0 2.001237302s
[ERROR] plugin/errors: 2 api.allegion.yonomi.cloud. A: read udp 10.244.0.228:47070->10.96.0.9:53: i/o timeout
[INFO] 10.244.1.242:47829 - 1031 "AAAA IN api.doppler.com. udp 44 false 1232" - - 0 2.001031405s
[INFO] 10.244.1.242:50138 - 44842 "A IN api.doppler.com. udp 44 false 1232" - - 0 2.001066446s
[ERROR] plugin/errors: 2 api.doppler.com. AAAA: read udp 10.244.0.228:52401->10.96.0.9:53: i/o timeout
[ERROR] plugin/errors: 2 api.doppler.com. A: read udp 10.244.0.228:48637->10.96.0.9:53: i/o timeout
Environment
- Talos version: 1.7.1
- Kubernetes version: 1.30.0
- Platform: dell 5060/7060 nodes
Reverting the patch (false/false) fixes dns again.
FWIW, here's my coredns configmap:
.:53 {
errors
health {
lameduck 5s
}
ready
log . {
class error
}
prometheus :9153
kubernetes cluster.local in-addr.arpa ip6.arpa {
pods insecure
fallthrough in-addr.arpa ip6.arpa
}
forward . /etc/resolv.conf
cache 30
loop
reload
loadbalance
}
coredns graphs go through the roof as well
Greetings! Can you provide talosctl -n 192.168.5.10,192.168.5.11,192.168.5.12,192.168.5.15 logs dns-resolve-cache output?
Greetings! Can you provide
talosctl -n 192.168.5.10,192.168.5.11,192.168.5.12,192.168.5.15 logs dns-resolve-cacheoutput?
sure! with
machine:
features:
hostDNS:
enabled: true
resolveMemberNames: true
forwardKubeDNSToHost: false
i get ~13k lines, here's the last few:
192.168.5.12: 2024-05-05T19:15:00.325Z DEBUG dns request {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 27405\n;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1\n\n;; OPT PSEUDOSECTION:\n; EDNS: version 0; flags:; udp: 1232\n\n;; QUESTION SECTION:\n;k8s.lab.domain.io.\tIN\t AAAA\n"}
192.168.5.12: 2024-05-05T19:15:00.326Z DEBUG dns response {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 27405\n;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;k8s.lab.domain.io.\tIN\t AAAA\n"}
192.168.5.12: 2024-05-05T19:15:20.325Z DEBUG dns request {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 30173\n;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1\n\n;; OPT PSEUDOSECTION:\n; EDNS: version 0; flags:; udp: 1232\n\n;; QUESTION SECTION:\n;k8s.lab.domain.io.\tIN\t AAAA\n"}
192.168.5.12: 2024-05-05T19:15:20.326Z DEBUG dns response {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 30173\n;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;k8s.lab.domain.io.\tIN\t AAAA\n"}
192.168.5.12: 2024-05-05T19:15:40.325Z DEBUG dns request {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 26124\n;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1\n\n;; OPT PSEUDOSECTION:\n; EDNS: version 0; flags:; udp: 1232\n\n;; QUESTION SECTION:\n;k8s.lab.domain.io.\tIN\t AAAA\n"}
192.168.5.12: 2024-05-05T19:15:40.326Z DEBUG dns response {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 26124\n;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;k8s.lab.domain.io.\tIN\t AAAA\n"}
192.168.5.12: 2024-05-05T19:16:00.324Z DEBUG dns request {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 44019\n;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1\n\n;; OPT PSEUDOSECTION:\n; EDNS: version 0; flags:; udp: 1232\n\n;; QUESTION SECTION:\n;k8s.lab.domain.io.\tIN\t AAAA\n"}
192.168.5.12: 2024-05-05T19:16:00.326Z DEBUG dns response {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 44019\n;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;k8s.lab.domain.io.\tIN\t AAAA\n"}
192.168.5.12: 2024-05-05T19:16:20.325Z DEBUG dns request {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 26814\n;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1\n\n;; OPT PSEUDOSECTION:\n; EDNS: version 0; flags:; udp: 1232\n\n;; QUESTION SECTION:\n;k8s.lab.domain.io.\tIN\t AAAA\n"}
192.168.5.12: 2024-05-05T19:16:20.326Z DEBUG dns response {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 26814\n;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;k8s.lab.domain.io.\tIN\t AAAA\n"}
192.168.5.12: 2024-05-05T19:16:40.324Z DEBUG dns request {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 44389\n;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1\n\n;; OPT PSEUDOSECTION:\n; EDNS: version 0; flags:; udp: 1232\n\n;; QUESTION SECTION:\n;k8s.lab.domain.io.\tIN\t AAAA\n"}
192.168.5.12: 2024-05-05T19:16:40.326Z DEBUG dns response {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 44389\n;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;k8s.lab.domain.io.\tIN\t AAAA\n"}
192.168.5.12: 2024-05-05T19:17:00.324Z DEBUG dns request {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 59770\n;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1\n\n;; OPT PSEUDOSECTION:\n; EDNS: version 0; flags:; udp: 1232\n\n;; QUESTION SECTION:\n;k8s.lab.domain.io.\tIN\t AAAA\n"}
192.168.5.12: 2024-05-05T19:17:00.326Z DEBUG dns response {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 59770\n;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;k8s.lab.domain.io.\tIN\t AAAA\n"}
192.168.5.12: 2024-05-05T19:17:20.324Z DEBUG dns request {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 20152\n;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1\n\n;; OPT PSEUDOSECTION:\n; EDNS: version 0; flags:; udp: 1232\n\n;; QUESTION SECTION:\n;k8s.lab.domain.io.\tIN\t AAAA\n"}
192.168.5.12: 2024-05-05T19:17:20.326Z DEBUG dns response {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 20152\n;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;k8s.lab.domain.io.\tIN\t AAAA\n"}
192.168.5.12: 2024-05-05T19:17:40.325Z DEBUG dns request {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 43480\n;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1\n\n;; OPT PSEUDOSECTION:\n; EDNS: version 0; flags:; udp: 1232\n\n;; QUESTION SECTION:\n;k8s.lab.domain.io.\tIN\t AAAA\n"}
192.168.5.12: 2024-05-05T19:17:40.326Z DEBUG dns response {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 43480\n;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;k8s.lab.domain.io.\tIN\t AAAA\n"}
with
machine:
features:
hostDNS:
enabled: true
resolveMemberNames: true
forwardKubeDNSToHost: true
I get
192.168.5.10: 2024-05-05T19:20:52.012Z DEBUG dns request {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 13650\n;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;sonarr.media.svc.\tIN\t A\n"}
192.168.5.10: 2024-05-05T19:20:52.012Z DEBUG dns request {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 45522\n;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;sonarr.media.svc.\tIN\t AAAA\n"}
192.168.5.10: 2024-05-05T19:20:52.013Z DEBUG dns response {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NXDOMAIN, id: 13650\n;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;sonarr.media.svc.\tIN\t A\n\n;; AUTHORITY SECTION:\n.\t1800\tIN\tSOA\ta.root-servers.net. nstld.verisign-grs.com. 2024050501 1800 900 604800 86400\n"}
192.168.5.10: 2024-05-05T19:20:52.013Z DEBUG dns response {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NXDOMAIN, id: 45522\n;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;sonarr.media.svc.\tIN\t AAAA\n\n;; AUTHORITY SECTION:\n.\t1800\tIN\tSOA\ta.root-servers.net. nstld.verisign-grs.com. 2024050501 1800 900 604800 86400\n"}
192.168.5.10: 2024-05-05T19:21:01.928Z DEBUG dns request {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 61250\n;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1\n\n;; OPT PSEUDOSECTION:\n; EDNS: version 0; flags:; udp: 1232\n\n;; QUESTION SECTION:\n;k8s.lab.domain.io.\tIN\t AAAA\n"}
192.168.5.10: 2024-05-05T19:21:01.928Z DEBUG dns response {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 61250\n;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;k8s.lab.domain.io.\tIN\t AAAA\n"}
192.168.5.10: 2024-05-05T19:21:02.682Z DEBUG dns request {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 61066\n;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;radarr.domain.io.\tIN\t AAAA\n"}
192.168.5.10: 2024-05-05T19:21:02.682Z DEBUG dns request {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 37810\n;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;radarr.domain.io.\tIN\t A\n"}
192.168.5.10: 2024-05-05T19:21:02.683Z DEBUG dns request {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 59884\n;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;sonarr.domain.io.\tIN\t AAAA\n"}
192.168.5.10: 2024-05-05T19:21:02.683Z DEBUG dns request {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 35319\n;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;sonarr.domain.io.\tIN\t A\n"}
192.168.5.10: 2024-05-05T19:21:02.683Z DEBUG dns response {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 61066\n;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;radarr.domain.io.\tIN\t AAAA\n\n;; AUTHORITY SECTION:\ndomain.io.\t1710\tIN\tSOA\trose.ns.cloudflare.com. dns.cloudflare.com. 2340201800 10000 2400 604800 1800\n"}
192.168.5.10: 2024-05-05T19:21:02.683Z DEBUG dns response {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 37810\n;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;radarr.domain.io.\tIN\t A\n\n;; ANSWER SECTION:\nradarr.domain.io.\t270\tIN\tA\t10.10.5.30\n"}
192.168.5.10: 2024-05-05T19:21:02.686Z DEBUG dns response {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 35319\n;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;sonarr.domain.io.\tIN\t A\n\n;; ANSWER SECTION:\nsonarr.domain.io.\t5\tIN\tA\t10.10.5.30\n"}
192.168.5.10: 2024-05-05T19:21:02.686Z DEBUG dns response {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 59884\n;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;sonarr.domain.io.\tIN\t AAAA\n\n;; AUTHORITY SECTION:\ndomain.io.\t1710\tIN\tSOA\trose.ns.cloudflare.com. dns.cloudflare.com. 2340201800 10000 2400 604800 1800\n"}
192.168.5.10: 2024-05-05T19:21:12.216Z DEBUG dns request {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 37590\n;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;.\tIN\t NS\n"}
192.168.5.10: 2024-05-05T19:21:12.217Z DEBUG dns response {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 37590\n;; flags: qr rd ra; QUERY: 1, ANSWER: 13, AUTHORITY: 0, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;.\tIN\t NS\n\n;; ANSWER SECTION:\n.\t3600\tIN\tNS\ta.root-servers.net.\n.\t3600\tIN\tNS\tb.root-servers.net.\n.\t3600\tIN\tNS\tc.root-servers.net.\n.\t3600\tIN\tNS\td.root-servers.net.\n.\t3600\tIN\tNS\te.root-servers.net.\n.\t3600\tIN\tNS\tf.root-servers.net.\n.\t3600\tIN\tNS\tg.root-servers.net.\n.\t3600\tIN\tNS\th.root-servers.net.\n.\t3600\tIN\tNS\ti.root-servers.net.\n.\t3600\tIN\tNS\tj.root-servers.net.\n.\t3600\tIN\tNS\tk.root-servers.net.\n.\t3600\tIN\tNS\tl.root-servers.net.\n.\t3600\tIN\tNS\tm.root-servers.net.\n"}
192.168.5.10: 2024-05-05T19:21:19.651Z DEBUG dns request {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 20589\n;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;s3.domain.io.\tIN\t A\n"}
192.168.5.10: 2024-05-05T19:21:19.651Z DEBUG dns request {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 26931\n;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;s3.domain.io.\tIN\t AAAA\n"}
192.168.5.10: 2024-05-05T19:21:19.652Z DEBUG dns response {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 20589\n;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;s3.domain.io.\tIN\t A\n\n;; ANSWER SECTION:\ns3.domain.io.\t296\tIN\tA\t104.21.30.117\ns3.domain.io.\t296\tIN\tA\t172.67.172.226\n"}
192.168.5.10: 2024-05-05T19:21:19.652Z DEBUG dns response {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 26931\n;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;s3.domain.io.\tIN\t AAAA\n\n;; ANSWER SECTION:\ns3.domain.io.\t296\tIN\tAAAA\t2606:4700:3035::6815:1e75\ns3.domain.io.\t296\tIN\tAAAA\t2606:4700:3037::ac43:ace2\n"}
192.168.5.10: 2024-05-05T19:21:21.928Z DEBUG dns request {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 37418\n;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1\n\n;; OPT PSEUDOSECTION:\n; EDNS: version 0; flags:; udp: 1232\n\n;; QUESTION SECTION:\n;k8s.lab.domain.io.\tIN\t AAAA\n"}
192.168.5.10: 2024-05-05T19:21:21.928Z DEBUG dns response {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 37418\n;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;k8s.lab.domain.io.\tIN\t AAAA\n"}
192.168.5.10: 2024-05-05T19:21:30.483Z DEBUG dns request {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 41265\n;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;plex.tv.\tIN\t A\n"}
192.168.5.10: 2024-05-05T19:21:30.483Z DEBUG dns request {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 3690\n;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;plex.tv.\tIN\t AAAA\n"}
192.168.5.10: 2024-05-05T19:21:30.484Z DEBUG dns response {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 41265\n;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;plex.tv.\tIN\t A\n\n;; ANSWER SECTION:\nplex.tv.\t30\tIN\tA\t34.243.94.189\nplex.tv.\t30\tIN\tA\t34.241.88.179\n"}
192.168.5.10: 2024-05-05T19:21:30.484Z DEBUG dns response {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 3690\n;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;plex.tv.\tIN\t AAAA\n\n;; AUTHORITY SECTION:\nplex.tv.\t207\tIN\tSOA\tjeremy.ns.cloudflare.com. dns.cloudflare.com. 2340420772 10000 2400 604800 1800\n"}
192.168.5.10: 2024-05-05T19:21:41.927Z DEBUG dns request {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 38917\n;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1\n\n;; OPT PSEUDOSECTION:\n; EDNS: version 0; flags:; udp: 1232\n\n;; QUESTION SECTION:\n;k8s.lab.domain.io.\tIN\t AAAA\n"}
192.168.5.10: 2024-05-05T19:21:41.928Z DEBUG dns response {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 38917\n;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;k8s.lab.domain.io.\tIN\t AAAA\n"}
192.168.5.10: 2024-05-05T19:21:45.777Z DEBUG dns request {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 29216\n;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;sonarr.domain.io.\tIN\t A\n"}
192.168.5.10: 2024-05-05T19:21:45.778Z DEBUG dns response {"component": "dns-resolve-cache", "data": ";; opcode: QUERY, status: NOERROR, id: 29216\n;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0\n\n;; QUESTION SECTION:\n;sonarr.domain.io.\tIN\t A\n\n;; ANSWER SECTION:\nsonarr.domain.io.\t257\tIN\tA\t10.10.5.30\n"}
As soon as the patch is applied and coredns restarted, I start immediately seeing issues, for example in my homeassistant logs:
(SyncWorker_14) [custom_components.radarr_upcoming_media.sensor] Host radarr.domain.io is not available
2024-05-05 12:21:37.684 WARNING (SyncWorker_3) [custom_components.sonarr_upcoming_media.sensor] Host sonarr.domain.io is not available
2024-05-05 12:22:07.685 WARNING (SyncWorker_4) [custom_components.radarr_upcoming_media.sensor] Host radarr.domain.io is not available
2024-05-05 12:22:07.687 WARNING (SyncWorker_50) [custom_components.sonarr_upcoming_media.sensor] Host sonarr.domain.io is not available
2024-05-05 12:22:37.690 WARNING (SyncWorker_46) [custom_components.radarr_upcoming_media.sensor] Host radarr.domain.io is not available
2024-05-05 12:22:37.693 WARNING (SyncWorker_15) [custom_components.sonarr_upcoming_media.sensor] Host sonarr.domain.io is not available
and from the coredns deployment logs itself:
ERROR] plugin/errors: 2 ps.pndsn.com. AAAA: read udp 10.244.3.183:42921->10.96.0.9:53: i/o timeout
[INFO] 10.244.0.142:37176 - 46031 "A IN radarr.media.svc. udp 34 false 512" - - 0 2.000961528s
[INFO] 10.244.0.142:37176 - 46771 "AAAA IN radarr.media.svc. udp 34 false 512" - - 0 2.000981288s
[ERROR] plugin/errors: 2 radarr.media.svc. AAAA: read udp 10.244.0.58:39689->10.96.0.9:53: i/o timeout
[ERROR] plugin/errors: 2 radarr.media.svc. A: read udp 10.244.0.58:56654->10.96.0.9:53: i/o timeout
[INFO] 10.244.0.142:34045 - 18857 "AAAA IN sonarr.media.svc. udp 34 false 512" - - 0 2.0010946020000002s
[ERROR] plugin/errors: 2 sonarr.media.svc. AAAA: read udp 10.244.3.183:57222->10.96.0.9:53: i/o timeout
[INFO] 10.244.0.142:34045 - 18443 "A IN sonarr.media.svc. udp 34 false 512" - - 0 2.001069037s
[ERROR] plugin/errors: 2 sonarr.media.svc. A: read udp 10.244.3.183:60187->10.96.0.9:53: i/o timeout
[INFO] 10.244.0.171:33221 - 59865 "A IN s3.domain.io. udp 33 false 512" - - 0 2.001200777s
[INFO] 10.244.0.171:33221 - 17887 "AAAA IN s3.domain.io. udp 33 false 512" - - 0 2.001220341s
[ERROR] plugin/errors: 2 s3.domain.io. A: read udp 10.244.3.183:42636->10.96.0.9:53: i/o timeout
[ERROR] plugin/errors: 2 s3.domain.io. AAAA: read udp 10.244.3.183:51402->10.96.0.9:53: i/o timeout
[INFO] 10.244.3.9:44995 - 38535 "AAAA IN ps.pndsn.com. udp 30 false 512" - - 0 2.00101046s
[ERROR] plugin/errors: 2 ps.pndsn.com. AAAA: read udp 10.244.3.183:46826->10.96.0.9:53: i/o timeout
[INFO] 10.244.3.9:44995 - 38373 "A IN ps.pndsn.com. udp 30 false 512" - - 0 2.001172459s
[ERROR] plugin/errors: 2 ps.pndsn.com. A: read udp 10.244.3.183:49263->10.96.0.9:53: i/o timeout
[INFO] 10.244.0.213:33246 - 976 "A IN github.com. udp 39 false 1232" - - 0 2.00063978s
[ERROR] plugin/errors: 2 github.com. A: read udp 10.244.3.183:57869->10.96.0.9:53: i/o timeout
[INFO] 10.244.0.213:41944 - 26300 "AAAA IN github.com. udp 39 false 1232" - - 0 2.001602828s
[ERROR] plugin/errors: 2 github.com. AAAA: read udp 10.244.0.58:52988->10.96.0.9:53: i/o timeout
[INFO] 10.244.3.9:44995 - 38535 "AAAA IN ps.pndsn.com. udp 30 false 512" - - 0 2.000211697s
[ERROR] plugin/errors: 2 ps.pndsn.com. AAAA: read udp 10.244.3.183:48895->10.96.0.9:53: i/o timeout
[INFO] 10.244.3.9:44995 - 38373 "A IN ps.pndsn.com. udp 30 false 512" - - 0 2.000241596s
[ERROR] plugin/errors: 2 ps.pndsn.com. A: read udp 10.244.3.183:41034->10.96.0.9:53: i/o timeout
changing forwardKubeDNSToHost: true back to false brings things back to normal. I can post my machine config if that helps but don't have anything too crazy there. upon restarting the coredns deployment, the logs are clean again:
.:53
[INFO] plugin/reload: Running configuration SHA512 = f43368fe881b6cd37b121f37ba0b71c065df5bfc99b5c5c05d7f95bf82289d7ab7e78d5b98c1f02172d8004a8a8f34027cef04e86c780d40a7c5d1301559f5b3
CoreDNS-1.11.1
linux/amd64, go1.20.7, ae2bbc2
.:53
[INFO] plugin/reload: Running configuration SHA512 = f43368fe881b6cd37b121f37ba0b71c065df5bfc99b5c5c05d7f95bf82289d7ab7e78d5b98c1f02172d8004a8a8f34027cef04e86c780d40a7c5d1301559f5b3
CoreDNS-1.11.1
linux/amd64, go1.20.7, ae2bbc2
Same issue for me, but unfortunately disabling hostDNS features doesn't resolve the issue.
I am using my own DNS servers, however using public DNS servers didn't help.
It worked fine using version 1.6.7, failed to work from 1.7.0, keeps failing in 1.7.1.
It worked fine using version 1.6.7, failed to work from 1.7.0, keeps failing in 1.7.1.
Let's not mix different issues in one ticket please.
@evanrich what is the CNI you're using?
@evanrich what is the CNI you're using?
Cilium v1.15.4
I'm seeing the same problem on Talos 1.7.1 (also upgraded from earlier versions), Kubernetes 1.29.1, Cilium 1.15.4.
I am using DHCP-discovered public DNS servers run by Hetzner.
Hubble (Cilium packet inspection) reports that the UDP requests from CoreDNS to the Talos DNS service IP (10.96.0.9 in my case) are delivered, but the response packets from 10.96.0.9 to CoreDNS pod are dropped with the reason TTL Exceeded.
I have the same error. I'm using talos v1.7.1 and cilium v1.14.7
I'm seeing the same problem on Talos 1.7.1 (also upgraded from earlier versions), Kubernetes 1.29.1, Cilium 1.15.4.
I am using DHCP-discovered public DNS servers run by Hetzner.
Hubble (Cilium packet inspection) reports that the UDP requests from CoreDNS to the Talos DNS service IP (10.96.0.9 in my case) are delivered, but the response packets from 10.96.0.9 to CoreDNS pod are dropped with the reason
TTL Exceeded.
Check if you are using bpf.masquerade if yes and you did not specify CIDRs manually, then with common private CIDRs you will get above error.
Try to set bpf.masquerade option to false and check if that works.
I'm seeing the same problem on Talos 1.7.1 (also upgraded from earlier versions), Kubernetes 1.29.1, Cilium 1.15.4. I am using DHCP-discovered public DNS servers run by Hetzner. Hubble (Cilium packet inspection) reports that the UDP requests from CoreDNS to the Talos DNS service IP (10.96.0.9 in my case) are delivered, but the response packets from 10.96.0.9 to CoreDNS pod are dropped with the reason
TTL Exceeded.Check if you are using bpf.masquerade if yes and you did not specify CIDRs manually, then with common private CIDRs you will get above error.
Try to set bpf.masquerade option to false and check if that works.
Sounds very plausible. However, bpf masquerade is disabled for my use case, but I can see that iptables masquerade for ipv4 is enabled. I would assume disabling this would have the same effect?
Edit: I disabled all masquerading:
$ kubectl -n kube-system exec ds/cilium -- cilium-dbg status | grep Masquerading
Masquerading: Disabled
~~But I'm still seeing the exact same issue.~~ I am now seeing the issue with the public IP address of the DNS Server instead.
It seems to me that masquerading is a very likely culprit, but I'm not sure how exactly yet. Will keep digging.
The fix is coming, thanks for reporting it, it's indeed the TTL. It's only related to fowardKubeDNSToHost option which is not enabled by default in Talos 1.7 (only enabled for Docker-based clusters).
Reopened until there is 1.7 backport.
Closed per #8758
@DmitriyMV not sure if this is related, but after upgrading 1.7.1->1.7.2, while better, I now see other errors:
.:53
[INFO] plugin/reload: Running configuration SHA512 = f43368fe881b6cd37b121f37ba0b71c065df5bfc99b5c5c05d7f95bf82289d7ab7e78d5b98c1f02172d8004a8a8f34027cef04e86c780d40a7c5d1301559f5b3
CoreDNS-1.11.1
linux/amd64, go1.20.7, ae2bbc2
.:53
[INFO] plugin/reload: Running configuration SHA512 = f43368fe881b6cd37b121f37ba0b71c065df5bfc99b5c5c05d7f95bf82289d7ab7e78d5b98c1f02172d8004a8a8f34027cef04e86c780d40a7c5d1301559f5b3
CoreDNS-1.11.1
linux/amd64, go1.20.7, ae2bbc2
[INFO] 10.244.0.100:34471 - 62223 "AAAA IN registry.npmjs.org. udp 36 false 512" - - 0 5.00010545s
[ERROR] plugin/errors: 2 registry.npmjs.org. AAAA: dns: buffer size too small
[INFO] 10.244.0.23:42181 - 51958 "AAAA IN api.ring.com. udp 30 false 512" - - 0 5.000088228s
[ERROR] plugin/errors: 2 api.ring.com. AAAA: dns: overflowing header size
[INFO] 10.244.0.23:45773 - 55470 "AAAA IN api.ring.com. udp 30 false 512" - - 0 5.000083312s
[ERROR] plugin/errors: 2 api.ring.com. AAAA: dns: overflowing header size
[INFO] 10.244.0.23:45773 - 55229 "A IN api.ring.com. udp 30 false 512" - - 0 5.000134008s
[ERROR] plugin/errors: 2 api.ring.com. A: dns: overflowing header size
[INFO] 10.244.0.23:45773 - 55229 "A IN api.ring.com. udp 30 false 512" - - 0 5.000123943s
[ERROR] plugin/errors: 2 api.ring.com. A: dns: overflowing header size
[INFO] 10.244.0.23:45773 - 55470 "AAAA IN api.ring.com. udp 30 false 512" - - 0 5.000255962s
[ERROR] plugin/errors: 2 api.ring.com. AAAA: dns: overflowing header size
[INFO] 10.244.0.23:45773 - 55470 "AAAA IN api.ring.com. udp 30 false 512" - - 0 5.000144994s
[ERROR] plugin/errors: 2 api.ring.com. AAAA: dns: overflowing header size
[INFO] 10.244.0.23:40964 - 39446 "AAAA IN api.ring.com. udp 30 false 512" - - 0 5.000402892s
[INFO] 10.244.0.23:40964 - 39215 "A IN api.ring.com. udp 30 false 512" - - 0 5.000479894s
[ERROR] plugin/errors: 2 api.ring.com. A: dns: overflowing header size
[ERROR] plugin/errors: 2 api.ring.com. AAAA: dns: overflowing header size
[INFO] 10.244.0.23:40964 - 39446 "AAAA IN api.ring.com. udp 30 false 512" - - 0 5.000031736s
[ERROR] plugin/errors: 2 api.ring.com. AAAA: dns: overflowing header size
[INFO] 10.244.0.23:40964 - 39215 "A IN api.ring.com. udp 30 false 512" - - 0 5.000118852s
[ERROR] plugin/errors: 2 api.ring.com. A: dns: overflowing header size
[INFO] 10.244.0.23:40964 - 39446 "AAAA IN api.ring.com. udp 30 false 512" - - 0 5.000074868s
[ERROR] plugin/errors: 2 api.ring.com. AAAA: dns: overflowing header size
[INFO] 10.244.0.23:40964 - 39215 "A IN api.ring.com. udp 30 false 512" - - 0 5.000111692s
[ERROR] plugin/errors: 2 api.ring.com. A: dns: overflowing header size
this is based off the following config:
machine:
features:
hostDNS:
enabled: true
resolveMemberNames: true
forwardKubeDNSToHost: true
The only thing i flipped from 1.7.1. to 1.7.2 was the forward to host.
1.7.3 fixes the errors above