k3s
Allow manual override or scaling of CoreDNS replica count
Is your feature request related to a problem? Please describe.
When deploying a multi-node cluster, CoreDNS is hard-coded to 1 replica. This can introduce concerns around HA, scalability, and cluster DNS outages during deployment upgrades.

Describe the solution you'd like
Allow this to be mitigated by deploying or scaling CoreDNS with a minimum of 2 replicas.

Describe alternatives you've considered
Manually deploying CoreDNS or coredns-autoscaler with min: 2
Defining topologySpreadConstraints and pod disruption budget on the coredns manifest would be a good start: https://github.com/k3s-io/k3s/blob/58315fe135101f1a06bf439687c2be9da692648f/manifests/coredns.yaml
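For reference, a minimal sketch of the PodDisruptionBudget part (illustrative values, not the packaged manifest; it only makes sense once coredns runs with 2+ replicas):

```yaml
# Sketch: keep at least one coredns pod available during voluntary disruptions.
# Only useful once coredns has been scaled to 2 or more replicas.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: coredns
  namespace: kube-system
spec:
  minAvailable: 1
  selector:
    matchLabels:
      k8s-app: kube-dns
```

And a topologySpreadConstraints fragment that would be merged into the coredns Deployment's spec.template.spec so replicas land on different nodes:

```yaml
# Fragment for spec.template.spec of the coredns Deployment (sketch):
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: kubernetes.io/hostname
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchLabels:
        k8s-app: kube-dns
```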
Thanks @marcbachmann, potentially another option is the preventSinglePointFailure with the autoscaler. This does add another deployment however.
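For anyone going that route, a rough sketch of the linear-mode parameters the cluster-proportional-autoscaler reads from its ConfigMap (the ConfigMap name here is only an example and must match the autoscaler's --configmap flag; values are illustrative):

```yaml
# Sketch: min 2 plus preventSinglePointFailure keeps at least two coredns
# replicas whenever the cluster has more than one schedulable node
# (spreading across nodes still depends on scheduling constraints).
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns-autoscaler
  namespace: kube-system
data:
  linear: |
    {
      "coresPerReplica": 256,
      "nodesPerReplica": 16,
      "min": 2,
      "preventSinglePointFailure": true
    }
```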
Currently evaluating k3s to replace our kubeadm setup. So far I have mostly found problems regarding high availability. This issue is the most prominent to me because when the node with the DNS pod goes down, the whole cluster basically becomes useless for our workloads.
I couldn't find any configuration option for this so far. Has anyone worked on an HA setup for k3s that takes into account a controller node going down? Is this even of interest to the project? I am wondering because this issue has been open for a while, and it seems necessary for a multi-controller setup to me. Are patches welcome for this?
K3s doesn't offer much flexibility in the coredns configuration at the moment, since we ship a flat manifest and not a HelmChart that can be customized. I think the most common thing for folks to do is copy the packaged CoreDNS manifest, make their changes, and then restart k3s with --disable=coredns so that only their modified configuration is used.
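If you use a config file rather than CLI flags, the disable entry would look something like this (a sketch, assuming the usual /etc/rancher/k3s/config.yaml location on each server):

```yaml
# /etc/rancher/k3s/config.yaml (sketch) -- equivalent to --disable=coredns
disable:
  - coredns
```

The modified copy of coredns.yaml can then be dropped into /var/lib/rancher/k3s/server/manifests/ under a different file name so the deploy controller applies it in place of the packaged one.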
@zimmski one option is to leave defaults in place, and add a manifest file containing the coredns-autoscaler deployment to manage the scaling of coredns automatically.
- https://kubernetes.io/docs/tasks/administer-cluster/dns-horizontal-autoscaling/#enabling-dns-horizontal-autoscaling
As an example, I've used this before with k3s.
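For anyone looking for a starting point, here is a trimmed sketch of such an add-on as it might be dropped into /var/lib/rancher/k3s/server/manifests/. The image tag is only an example, and the ServiceAccount/ClusterRole RBAC from the linked docs is omitted for brevity:

```yaml
# Sketch: cluster-proportional-autoscaler scaling the packaged coredns Deployment.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coredns-autoscaler
  namespace: kube-system
  labels:
    k8s-app: coredns-autoscaler
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: coredns-autoscaler
  template:
    metadata:
      labels:
        k8s-app: coredns-autoscaler
    spec:
      serviceAccountName: coredns-autoscaler   # RBAC omitted here; see the linked docs
      containers:
        - name: autoscaler
          image: registry.k8s.io/cpa/cluster-proportional-autoscaler:1.8.8   # tag is an example
          command:
            - /cluster-proportional-autoscaler
            - --namespace=kube-system
            - --configmap=coredns-autoscaler
            - --target=Deployment/coredns
            - --default-params={"linear":{"coresPerReplica":256,"nodesPerReplica":16,"min":2,"preventSinglePointFailure":true}}
            - --logtostderr=true
            - --v=2
```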
Thanks for your suggestions. An autoscaler is, I think, the best option. What I am curious about and want to understand is why k3s does have these HA changes in the first place. Is this a matter of missing contributions, or should the default of the project be to use as few resources as possible?
> why k3s does have these HA changes in the first place
I'm not sure what you're asking here.
> should the default of the project be to use as few resources as possible
Yes, as the project readme says, the goal of K3s is to be:

> Lightweight Kubernetes. Production ready, easy to install, half the memory, all in a binary less than 100 MB.
I've had lookup failures in my 3-node cluster which can be resolved by changing replicas to 3. So why not use a DaemonSet for coredns? Or am I missing something?
Running coredns on every node would incur unnecessary overhead on a distro that is focused on resource-constrained nodes. If you are experiencing DNS failures when the coredns pod is not running on the same node as your workload, you are most likely experiencing drops in CNI traffic between nodes. This is commonly caused by blocked vxlan ports, or issues with tx checksum offload corrupting packets. I would recommend fixing that, rather than just scheduling more replicas.
Makes sense. I checked all interfaces on all nodes and cni0 has no drops or errors, but I've seen output discards on flannel.1 on one node every full hour. Maybe this bug? https://github.com/flannel-io/flannel/issues/1009
Could you please help me here?
@brightdroid please open another issue to track your problem; let's keep this one focused on the initial ask of being able to set the replica count.
As of https://github.com/k3s-io/k3s/pull/6552 none of our packaged manifests should specify a replica count, so you should be able to scale it without having it reset when the servers restart. I see that the coredns replica count was commented out a while back, so I suspect that this has actually been resolved for a while.
Validated on release-1.25 branch with commit 457e5e7379821db3feed65548fb7678345a73828
and master branch with commit b5d39df9294627cbfa3081acb92e2be54f02b0d6
Environment Details
Infrastructure
- [x] Cloud (AWS)
- [ ] Hosted
Node(s) CPU architecture, OS, and Version:
Ubuntu 22.04 LTS
Cluster Configuration:
3 servers, 1 agent
Config.yaml:
N/A
Additional files:
N/A
Testing Steps
- Install k3s and join all nodes
- Scale and edit `local-path-provisioner` and `coredns`
- Reboot the nodes or restart the k3s service on all the nodes

Replication Results:
- `local-path-provisioner` updated upon restart to only have 1 replica
- `coredns` kept its replicas, but maintained revision history so there would be multiple replicasets
```
$ k get deploy,rs -n kube-system -o wide
NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR
deployment.apps/coredns 2/2 2 2 163m coredns rancher/mirrored-coredns-coredns:1.9.4 k8s-app=kube-dns
deployment.apps/local-path-provisioner 1/1 1 1 163m local-path-provisioner rancher/local-path-provisioner:v0.0.23 app=local-path-provisioner
deployment.apps/metrics-server 1/1 1 1 163m metrics-server rancher/mirrored-metrics-server:v0.6.1 k8s-app=metrics-server
deployment.apps/traefik 1/1 1 1 162m traefik rancher/mirrored-library-traefik:2.9.4 app.kubernetes.io/instance=traefik-kube-system,app.kubernetes.io/name=traefik
NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR
replicaset.apps/coredns-57557ff85b 0 0 0 44m coredns rancher/mirrored-coredns-coredns:1.9.4 k8s-app=kube-dns,pod-template-hash=57557ff85b
replicaset.apps/coredns-597584b69b 0 0 0 163m coredns rancher/mirrored-coredns-coredns:1.9.4 k8s-app=kube-dns,pod-template-hash=597584b69b
replicaset.apps/coredns-9996b5795 2 2 2 39m coredns rancher/mirrored-coredns-coredns:1.9.4 k8s-app=kube-dns,pod-template-hash=9996b5795
replicaset.apps/local-path-provisioner-79f67d76f8 1 1 1 163m local-path-provisioner rancher/local-path-provisioner:v0.0.23 app=local-path-provisioner,pod-template-hash=79f67d76f8
replicaset.apps/metrics-server-5c8978b444 1 1 1 163m metrics-server rancher/mirrored-metrics-server:v0.6.1 k8s-app=metrics-server,pod-template-hash=5c8978b444
replicaset.apps/traefik-bb69b68cd 1 1 1 162m traefik rancher/mirrored-library-traefik:2.9.4 app.kubernetes.io/instance=traefik-kube-system,app.kubernetes.io/name=traefik,pod-template-hash=bb69b68cd
```
Validation Results:
- `local-path-provisioner` maintained the data from the scale and edit (2 replicas in my case)
- `coredns` kept its replicas AND there were no empty replicasets present:
```
$ k get deploy,rs -n kube-system -o wide
NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR
deployment.apps/coredns 2/2 2 2 162m coredns rancher/mirrored-coredns-coredns:1.9.4 k8s-app=kube-dns
deployment.apps/local-path-provisioner 2/2 2 2 162m local-path-provisioner rancher/local-path-provisioner:v0.0.23 app=local-path-provisioner
deployment.apps/metrics-server 1/1 1 1 162m metrics-server rancher/mirrored-metrics-server:v0.6.2 k8s-app=metrics-server
deployment.apps/traefik 1/1 1 1 161m traefik rancher/mirrored-library-traefik:2.9.4 app.kubernetes.io/instance=traefik-kube-system,app.kubernetes.io/name=traefik
NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR
replicaset.apps/coredns-9996b5795 2 2 2 26m coredns rancher/mirrored-coredns-coredns:1.9.4 k8s-app=kube-dns,pod-template-hash=9996b5795
replicaset.apps/local-path-provisioner-79f67d76f8 2 2 2 162m local-path-provisioner rancher/local-path-provisioner:v0.0.23 app=local-path-provisioner,pod-template-hash=79f67d76f8
replicaset.apps/metrics-server-5f9f776df5 1 1 1 162m metrics-server rancher/mirrored-metrics-server:v0.6.2 k8s-app=metrics-server,pod-template-hash=5f9f776df5
replicaset.apps/traefik-66c46d954f 1 1 1 161m traefik rancher/mirrored-library-traefik:2.9.4 app.kubernetes.io/instance=traefik-kube-system,app.kubernetes.io/name=traefik,pod-template-hash=66c46d954f
```