
All nodelocaldns pods are failing in OCI deployment

jakoberpf opened this issue 2 years ago • 1 comment

Environment:

  • Cloud provider or hardware configuration: OCI instances in separate VCNs
  • Node OS: Ubuntu 20.04
  • Kubespray version: 2.18
  • Network plugin used: calico

The issue: First of all, the Kubespray deployment itself was successful, but the cluster is not healthy. The nodelocaldns pods are all failing, and I can't figure out why. I have the same configuration deployed in another, less network-restricted environment (no firewall, no security rules) and there it works fine. So my guess was that I had missed some ports to open for the node-local DNS cache service, but DNS port 53 is open now and I still see the error.
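A minimal connectivity check I would run from one of the nodes, sketched below, assuming the nodelocaldns bind address 169.254.25.10 from the log further down; the upstream DNS service IP and the query name are placeholders and need to be adjusted to the actual cluster. It sends a bare A query over UDP to each address to see whether port 53 actually answers through the firewall and security rules:

# Connectivity check sketch (not part of the original report).
# 169.254.25.10 is the nodelocaldns bind address from the log below;
# UPSTREAM_DNS is an assumed kube-dns/CoreDNS service IP and must be adjusted.
import socket
import struct

UPSTREAM_DNS = "10.233.0.3"  # hypothetical cluster DNS service IP
QNAME = "kubernetes.default.svc.cluster.local"

def build_query(name):
    # Header: ID, flags=0x0100 (recursion desired), QDCOUNT=1
    header = struct.pack(">HHHHHH", 0x1234, 0x0100, 1, 0, 0, 0)
    # Question: length-prefixed labels, QTYPE=A (1), QCLASS=IN (1)
    labels = b"".join(bytes([len(p)]) + p.encode() for p in name.split("."))
    return header + labels + b"\x00" + struct.pack(">HH", 1, 1)

for ip in ("169.254.25.10", UPSTREAM_DNS):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(3)
    try:
        sock.sendto(build_query(QNAME), (ip, 53))
        data, _ = sock.recvfrom(512)
        answers = struct.unpack(">H", data[6:8])[0]  # ANCOUNT field of the reply
        print(ip, "replied with", answers, "answer record(s)")
    except OSError as exc:
        print(ip, "did not answer on 53/udp:", exc)
    finally:
        sock.close()

If the local address answers but the upstream one does not, the security rules between the VCNs are the more likely culprit than nodelocaldns itself.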

The error:

2022/07/13 22:17:52 [INFO] Starting node-cache image: 1.17.1
2022/07/13 22:17:52 [INFO] Using Corefile /etc/coredns/Corefile
2022/07/13 22:17:52 [ERROR] Failed to read node-cache coreFile /etc/coredns/Corefile.base - open /etc/coredns/Corefile.base: no such file or directory
2022/07/13 22:17:52 [INFO] Skipping kube-dns configmap sync as no directory was specified
2022/07/13 22:17:52 [INFO] Added interface - nodelocaldns
cluster.local.:53 on 169.254.25.10
in-addr.arpa.:53 on 169.254.25.10
ip6.arpa.:53 on 169.254.25.10
.:53 on 169.254.25.10
[INFO] plugin/reload: Running configuration MD5 = adf97d6b4504ff12113ebb35f0c6413e
CoreDNS-1.7.0
linux/arm64, go1.13.15,
[INFO] Added back nodelocaldns rule - {raw PREROUTING [-p tcp -d 169.254.25.10 --dport 53 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {raw PREROUTING [-p udp -d 169.254.25.10 --dport 53 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {filter INPUT [-p tcp -d 169.254.25.10 --dport 53 -j ACCEPT]}
[INFO] Added back nodelocaldns rule - {filter INPUT [-p udp -d 169.254.25.10 --dport 53 -j ACCEPT]}
[INFO] Added back nodelocaldns rule - {raw OUTPUT [-p tcp -s 169.254.25.10 --sport 53 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {raw OUTPUT [-p udp -s 169.254.25.10 --sport 53 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {filter OUTPUT [-p tcp -s 169.254.25.10 --sport 53 -j ACCEPT]}
[INFO] Added back nodelocaldns rule - {filter OUTPUT [-p udp -s 169.254.25.10 --sport 53 -j ACCEPT]}
[INFO] Added back nodelocaldns rule - {raw OUTPUT [-p tcp -d 169.254.25.10 --dport 53 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {raw OUTPUT [-p udp -d 169.254.25.10 --dport 53 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {raw OUTPUT [-p tcp -d 169.254.25.10 --dport 8080 -j NOTRACK]}
[INFO] Added back nodelocaldns rule - {raw OUTPUT [-p tcp -s 169.254.25.10 --sport 8080 -j NOTRACK]}
[INFO] SIGTERM: Shutting down servers then terminating
[INFO] Tearing down
[WARNING] Exiting iptables/interface check goroutine

jakoberpf • Jul 14 '22 06:07

Seems that this is related to these node-local DNS cache issues:

https://github.com/kubernetes/dns/issues/453 https://github.com/kubernetes/dns/issues/394
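If the SIGTERM at the end of the log above comes from a failing liveness probe, probing the health endpoint from the affected node might narrow it down. A minimal sketch, assuming the health server listens on 169.254.25.10:8080 (the port appearing in the NOTRACK rules in the log) and serves the CoreDNS health plugin's default /health path:

# Health probe sketch (assumptions: health server on 169.254.25.10:8080,
# default CoreDNS health plugin path /health).
from urllib.request import urlopen

try:
    with urlopen("http://169.254.25.10:8080/health", timeout=3) as resp:
        print("health endpoint:", resp.status, resp.read().decode().strip())
except OSError as exc:
    print("health endpoint unreachable:", exc)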

jakoberpf • Jul 15 '22 10:07

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot • Oct 13 '22 11:10

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot • Nov 12 '22 12:11

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-triage-robot • Dec 12 '22 12:12

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot • Dec 12 '22 12:12