kubespray
all nodelocaldns are failing in OCI deployment
Environment:
- Cloud provider or hardware configuration: OCI instances in separate VCNs
- Node OS: Ubuntu 20.04
- Kubespray version: 2.18
- Network plugin used: calico
The issue: The kubespray deployment itself completed successfully, but the cluster is not healthy: the nodelocaldns pods are failing on all nodes and I can't figure out why. The same configuration deployed in another, less network-restricted environment (no firewall, no security rules) works fine, so my guess was that I had missed some ports for the node-local DNS cache service; however, DNS port 53 is now open and I still see the error.
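For reference, a quick way to check from one of the nodes whether the node-local listener and the upstream cluster DNS answer at all; this is only a sketch assuming kubespray's defaults (169.254.25.10 as the nodelocaldns IP, 10.233.0.3 as the coredns service IP), adjust if you changed them:

# Query the node-local DNS cache directly on its link-local address
dig +short @169.254.25.10 kubernetes.default.svc.cluster.local

# Query the cluster CoreDNS service for comparison
dig +short @10.233.0.3 kubernetes.default.svc.cluster.local

# Confirm the NOTRACK/ACCEPT rules that nodelocaldns adds are actually present
sudo iptables -t raw -S | grep 169.254.25.10
sudo iptables -S INPUT | grep 169.254.25.10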
The error:
2022/07/13 22:17:52 [INFO] Starting node-cache image: 1.17.1
2022/07/13 22:17:52 [INFO] Using Corefile /etc/coredns/Corefile
2022/07/13 22:17:52 [ERROR] Failed to read node-cache coreFile /etc/coredns/Corefile.base - open /etc/coredns/Corefile.base: no such file or directory
2022/07/13 22:17:52 [INFO] Skipping kube-dns configmap sync as no directory was specified
2022/07/13 22:17:52 [INFO] Added interface - nodelocaldns
cluster.local.:53 on 169.254.25.10
in-addr.arpa.:53 on 169.254.25.10
ip6.arpa.:53 on 169.254.25.10
.:53 on 169.254.25.10
[INFO] plugin/reload: Running configuration MD5 = adf97d6b4504ff12113ebb35f0c6413e
CoreDNS-1.7.0
linux/arm64, go1.13.15,
[INFO] Added back nodelocaldns rule - {raw PREROUTING [-p tcp -d 169.254.25.10 --dport 53 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {raw PREROUTING [-p udp -d 169.254.25.10 --dport 53 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {filter INPUT [-p tcp -d 169.254.25.10 --dport 53 -j ACCEPT]}
[INFO] Added back nodelocaldns rule - {filter INPUT [-p udp -d 169.254.25.10 --dport 53 -j ACCEPT]}
[INFO] Added back nodelocaldns rule - {raw OUTPUT [-p tcp -s 169.254.25.10 --sport 53 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {raw OUTPUT [-p udp -s 169.254.25.10 --sport 53 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {filter OUTPUT [-p tcp -s 169.254.25.10 --sport 53 -j ACCEPT]}
[INFO] Added back nodelocaldns rule - {filter OUTPUT [-p udp -s 169.254.25.10 --sport 53 -j ACCEPT]}
[INFO] Added back nodelocaldns rule - {raw OUTPUT [-p tcp -d 169.254.25.10 --dport 53 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {raw OUTPUT [-p udp -d 169.254.25.10 --dport 53 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {raw OUTPUT [-p tcp -d 169.254.25.10 --dport 8080 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {raw OUTPUT [-p tcp -s 169.254.25.10 --sport 8080 -j NOTRACK]}
[INFO] SIGTERM: Shutting down servers then terminating
[INFO] Tearing down
[WARNING] Exiting iptables/interface check goroutine
This seems to be related to these node-local DNS cache issues:
https://github.com/kubernetes/dns/issues/453
https://github.com/kubernetes/dns/issues/394
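To separate the Corefile.base log noise from an actual health failure, here is a rough check of the DaemonSet and the health endpoint; the label k8s-app=nodelocaldns and the /health path on port 8080 are assumptions based on kubespray's manifest and the port-8080 iptables rules in the log above:

# DaemonSet and per-node pod status of the node-local DNS cache
kubectl -n kube-system get ds nodelocaldns
kubectl -n kube-system get pods -l k8s-app=nodelocaldns -o wide

# Why the pods are restarting / failing readiness
kubectl -n kube-system describe pod -l k8s-app=nodelocaldns | grep -A5 "Last State"

# From a node: the health endpoint behind the port-8080 rules in the log
curl -sS http://169.254.25.10:8080/health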
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.