I/O timeout Error after installation
Hello all,
I have a fresh install of a kubeadm cluster:
kubeadm version: &version.Info{Major:"1", Minor:"29", GitVersion:"v1.29.3", GitCommit:"6813625b7cd706db5bc7388921be03071e1a492d", GitTreeState:"clean", BuildDate:"2024-03-15T00:06:16Z", GoVersion:"go1.21.8", Compiler:"gc", Platform:"linux/amd64"}
I'm trying to take control of coredns with this Helm chart; here are the actions performed:
kubectl -n kube-system annotate --overwrite ConfigMap coredns meta.helm.sh/release-name=coredns;
kubectl -n kube-system annotate --overwrite ConfigMap coredns meta.helm.sh/release-namespace=kube-system;
kubectl -n kube-system label --overwrite ConfigMap coredns app.kubernetes.io/managed-by=Helm;
kubectl -n kube-system annotate --overwrite Deployment coredns meta.helm.sh/release-name=coredns;
kubectl -n kube-system annotate --overwrite Deployment coredns meta.helm.sh/release-namespace=kube-system;
kubectl -n kube-system label --overwrite Deployment coredns app.kubernetes.io/managed-by=Helm;
kubectl -n kube-system annotate --overwrite Service kube-dns meta.helm.sh/release-name=coredns;
kubectl -n kube-system annotate --overwrite Service kube-dns meta.helm.sh/release-namespace=kube-system;
kubectl -n kube-system label --overwrite Service kube-dns app.kubernetes.io/managed-by=Helm;
kubectl -n kube-system annotate --overwrite ServiceAccount coredns meta.helm.sh/release-name=coredns;
kubectl -n kube-system annotate --overwrite ServiceAccount coredns meta.helm.sh/release-namespace=kube-system;
kubectl -n kube-system label --overwrite ServiceAccount coredns app.kubernetes.io/managed-by=Helm;
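A quick sanity check that the adoption metadata landed, using the resource names above (a small verification sketch, not output from the issue):

kubectl -n kube-system get deploy coredns -o jsonpath='{.metadata.annotations}'
kubectl -n kube-system get svc kube-dns --show-labels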
Here are values:
autoscaler:
  enabled: true
  includeUnschedulableNodes: true
  min: 2
  resources:
    limits:
      memory: 20Mi
extraVolumes:
- name: config-custom
  configMap:
    name: coredns-custom
extraVolumeMounts:
- name: config-custom
  mountPath: /etc/coredns/custom
k8sAppLabelOverride: kube-dns
service:
  name: kube-dns
serviceAccount:
  name: coredns
servers:
- zones:
  - zone: .
  port: 53
  plugins:
  - name: cache
    parameters: 30
  - name: errors
  - name: forward
    parameters: . /etc/resolv.conf
    configBlock: |-
      max_concurrent 1000
  - name: loadbalance
  - name: log
  - name: loop
  - name: prometheus
    parameters: 0.0.0.0:9153
  - name: ready
  - name: reload
  - name: health
    configBlock: |-
      lameduck 5s
  - name: kubernetes
    parameters: cluster.internal in-addr.arpa ip6.arpa
    configBlock: |-
      fallthrough in-addr.arpa ip6.arpa
      pods insecure
      ttl 30
  - name: import
    parameters: custom/*.override
extraConfig:
  import:
    parameters: custom/*.server
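For reference, the install itself would look something like this; the repo alias and the values.yaml filename are assumptions on my part, but https://coredns.github.io/helm is the chart's published repo:

helm repo add coredns https://coredns.github.io/helm
helm repo update
helm upgrade --install coredns coredns/coredns --namespace kube-system -f values.yaml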
After the install, I got this error:
Get "http://source-controller.kube-system.svc.cluster.internal./helmchart/helm-data/helm-data-tigera-operator/tigera-operator-v3.27.2.tgz": dial tcp: lookup source-controller.kube-system.svc.cluster.internal. on 10.0.0.10:53: read udp 10.85.0.4:45294->10.0.0.10:53: i/o timeout
kubectl run -it --rm busybox --image=busybox -- /bin/sh
/ # nslookup kube-dns.kube-system.svc.cluster.internal
;; connection timed out; no servers could be reached
/ # nslookup google.com
;; connection timed out; no servers could be reached
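One way to narrow this down is to query CoreDNS by IP, bypassing the pod's resolv.conf entirely (a sketch; 10.0.0.10 is the ClusterIP from the error above, cluster.internal is the domain configured in the kubernetes plugin, and busybox nslookup takes the server as its second argument):

kubectl run -it --rm dnstest --image=busybox -- nslookup kubernetes.default.svc.cluster.internal 10.0.0.10

If that also times out, the problem is on the path to the service, not in the pod's DNS configuration.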
Does anyone have an idea how to resolve this?
What endpoints does the kube-dns service have?
Thank you for your quick response. Here are the two YAMLs, before installation -> after installation:
---
# Before:
apiVersion: v1
kind: Service
metadata:
  annotations:
    meta.helm.sh/release-name: coredns
    meta.helm.sh/release-namespace: kube-system
    prometheus.io/port: "9153"
    prometheus.io/scrape: "true"
  creationTimestamp: "2024-03-21T10:31:05Z"
  labels:
    app.kubernetes.io/managed-by: Helm
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    kubernetes.io/name: CoreDNS
  name: kube-dns
  namespace: kube-system
  resourceVersion: "684"
  uid: d9cf0902-5028-49ea-b4d4-7f6093c2f0d1
spec:
  clusterIP: 10.0.0.10
  clusterIPs:
  - 10.0.0.10
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: dns
    port: 53
    protocol: UDP
    targetPort: 53
  - name: dns-tcp
    port: 53
    protocol: TCP
    targetPort: 53
  - name: metrics
    port: 9153
    protocol: TCP
    targetPort: 9153
  selector:
    k8s-app: kube-dns
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}
---
# After:
apiVersion: v1
kind: Service
metadata:
  annotations:
    meta.helm.sh/release-name: coredns
    meta.helm.sh/release-namespace: kube-system
    prometheus.io/port: "9153"
    prometheus.io/scrape: "true"
  creationTimestamp: "2024-03-21T10:31:05Z"
  labels:
    app.kubernetes.io/instance: coredns
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: coredns
    helm.sh/chart: coredns-1.29.0
    helm.toolkit.fluxcd.io/name: coredns
    helm.toolkit.fluxcd.io/namespace: helm-data
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    kubernetes.io/name: CoreDNS
  name: kube-dns
  namespace: kube-system
  resourceVersion: "3204"
  uid: d9cf0902-5028-49ea-b4d4-7f6093c2f0d1
spec:
  clusterIP: 10.0.0.10
  clusterIPs:
  - 10.0.0.10
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: udp-53
    port: 53
    protocol: UDP
    targetPort: 53
  - name: dns-tcp
    port: 53
    protocol: TCP
    targetPort: 53
  - name: metrics
    port: 9153
    protocol: TCP
    targetPort: 9153
  selector:
    app.kubernetes.io/instance: coredns
    app.kubernetes.io/name: coredns
    k8s-app: kube-dns
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}
---
I was referring to the endpoints, not the service spec.
Try something like: kubectl get ep -n kube-system kube-dns and see if the addresses match the coredns pod IPs.
Oh sorry, it seems to match:
---
# Endpoint
- addresses:
  - ip: 10.85.0.11
    nodeName: main
    targetRef:
      kind: Pod
      name: coredns-5fc4d8d869-vs428
      namespace: kube-system
  - ip: 10.85.0.8
    nodeName: worker-001
    targetRef:
      kind: Pod
      name: coredns-5fc4d8d869-cdtzr
      namespace: kube-system
  ports:
  - name: dns-tcp
    port: 53
    protocol: TCP
  - name: metrics
    port: 9153
    protocol: TCP
  - name: udp-53
    port: 53
    protocol: UDP
---
metadata:
  name: coredns-5fc4d8d869-vs428
podIP: 10.85.0.11
podIPs:
- ip: 10.85.0.11
---
metadata:
  name: coredns-5fc4d8d869-cdtzr
podIP: 10.85.0.8
podIPs:
- ip: 10.85.0.8
---
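To separate a broken service path from broken CoreDNS pods, the pod IPs above can be queried directly (a sketch using the IPs from the endpoints output; busybox nslookup takes the server as a second argument):

kubectl run -it --rm dnstest --image=busybox -- nslookup google.com 10.85.0.11
kubectl run -it --rm dnstest --image=busybox -- nslookup google.com 10.85.0.8

If the pods answer but the ClusterIP 10.0.0.10 does not, the suspect is the kube-proxy path; on a node, iptables-save | grep 10.0.0.10 (or ipvsadm -Ln in IPVS mode) shows whether the service rules were programmed.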
Edit 1: I have the impression that the error does not come from coredns but from the tigera-calico installation. I'm still looking into it; can we pause this issue? I'll close it if that turns out to be the case.
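A couple of checks that can confirm or rule out Calico, assuming it was installed via the Tigera operator (the resource names below are the operator's defaults, not taken from this issue):

kubectl get tigerastatus
kubectl -n calico-system get pods -o wide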
I have something similar on AWS: I carried out the installation, and sometimes I start to experience timeouts. Did you manage to find out anything?
[ERROR] plugin/errors: 2 metadata.google.internal. AAAA: read udp 10.72.35.115:45212->10.72.32.2:53: i/o timeout
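For what it's worth, 10.72.32.2 in that log is the upstream that the forward plugin picked up from /etc/resolv.conf, so probing it directly from the node can show whether the timeout is in CoreDNS or in the upstream itself (a sketch, assuming dig is available on the node; the name and record type are taken from the log line above):

dig @10.72.32.2 metadata.google.internal AAAA +time=2 +tries=1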