k0sctl
# CoreDNS is not being deployed with dynamicConfig disabled
Before creating an issue, make sure you've checked the following:
- [X] You are running the latest released version of k0s
- [X] Make sure you've searched for existing issues, both open and closed
- [X] Make sure you've searched for PRs too, a fix might've been merged already
- [X] You're looking at docs for the released version; "main" branch docs are usually ahead of released versions.
## Platform

```text
Linux 5.15.142-flatcar #1 SMP Mon Dec 11 21:37:48 -00 2023 x86_64 GNU/Linux
NAME="Flatcar Container Linux by Kinvolk"
ID=flatcar
ID_LIKE=coreos
VERSION=3602.2.3
VERSION_ID=3602.2.3
BUILD_ID=2023-12-11-2204
SYSEXT_LEVEL=1.0
PRETTY_NAME="Flatcar Container Linux by Kinvolk 3602.2.3 (Oklo)"
ANSI_COLOR="38;5;75"
HOME_URL="https://flatcar.org/"
BUG_REPORT_URL="https://issues.flatcar.org"
FLATCAR_BOARD="amd64-usr"
CPE_NAME="cpe:2.3:o:flatcar-linux:flatcar_linux:3602.2.3:*:*:*:*:*:*:*"
```
## Version

- k0s: v1.28.4 and v1.27.8
- k0sctl: v0.16.0 (commit 7e8c272)
## Sysinfo
`k0s sysinfo`
```text
Machine ID: "555fbefc839e690070cea6790c165890ed90f324fd3d148c6003df4bc94402fd" (from machine) (pass)
Total memory: 3.8 GiB (pass)
Disk space available for /var/lib/k0s: 42.0 GiB (pass)
Name resolution: localhost: [::1 127.0.0.1] (pass)
Operating system: Linux (pass)
Linux kernel release: 5.15.142-flatcar (pass)
Max. file descriptors per process: current: 524288 / max: 524288 (pass)
AppArmor: unavailable (pass)
Executable in PATH: modprobe: /usr/sbin/modprobe (pass)
Executable in PATH: mount: /usr/bin/mount (pass)
Executable in PATH: umount: /usr/bin/umount (pass)
/proc file system: mounted (0x9fa0) (pass)
Control Groups: version 2 (pass)
cgroup controller "cpu": available (pass)
cgroup controller "cpuacct": available (via cpu in version 2) (pass)
cgroup controller "cpuset": available (pass)
cgroup controller "memory": available (pass)
cgroup controller "devices": available (assumed) (pass)
cgroup controller "freezer": available (assumed) (pass)
cgroup controller "pids": available (pass)
cgroup controller "hugetlb": available (pass)
cgroup controller "blkio": available (via io in version 2) (pass)
CONFIG_CGROUPS: Control Group support: built-in (pass)
CONFIG_CGROUP_FREEZER: Freezer cgroup subsystem: built-in (pass)
CONFIG_CGROUP_PIDS: PIDs cgroup subsystem: built-in (pass)
CONFIG_CGROUP_DEVICE: Device controller for cgroups: built-in (pass)
CONFIG_CPUSETS: Cpuset support: built-in (pass)
CONFIG_CGROUP_CPUACCT: Simple CPU accounting cgroup subsystem: built-in (pass)
CONFIG_MEMCG: Memory Resource Controller for Control Groups: built-in (pass)
CONFIG_CGROUP_HUGETLB: HugeTLB Resource Controller for Control Groups: built-in (pass)
CONFIG_CGROUP_SCHED: Group CPU scheduler: built-in (pass)
CONFIG_FAIR_GROUP_SCHED: Group scheduling for SCHED_OTHER: built-in (pass)
CONFIG_CFS_BANDWIDTH: CPU bandwidth provisioning for FAIR_GROUP_SCHED: built-in (pass)
CONFIG_BLK_CGROUP: Block IO controller: built-in (pass)
CONFIG_NAMESPACES: Namespaces support: built-in (pass)
CONFIG_UTS_NS: UTS namespace: built-in (pass)
CONFIG_IPC_NS: IPC namespace: built-in (pass)
CONFIG_PID_NS: PID namespace: built-in (pass)
CONFIG_NET_NS: Network namespace: built-in (pass)
CONFIG_NET: Networking support: built-in (pass)
CONFIG_INET: TCP/IP networking: built-in (pass)
CONFIG_IPV6: The IPv6 protocol: built-in (pass)
CONFIG_NETFILTER: Network packet filtering framework (Netfilter): built-in (pass)
CONFIG_NETFILTER_ADVANCED: Advanced netfilter configuration: built-in (pass)
CONFIG_NF_CONNTRACK: Netfilter connection tracking support: module (pass)
CONFIG_NETFILTER_XTABLES: Netfilter Xtables support: built-in (pass)
CONFIG_NETFILTER_XT_TARGET_REDIRECT: REDIRECT target support: module (pass)
CONFIG_NETFILTER_XT_MATCH_COMMENT: "comment" match support: module (pass)
CONFIG_NETFILTER_XT_MARK: nfmark target and match support: module (pass)
CONFIG_NETFILTER_XT_SET: set target and match support: module (pass)
CONFIG_NETFILTER_XT_TARGET_MASQUERADE: MASQUERADE target support: module (pass)
CONFIG_NETFILTER_XT_NAT: "SNAT and DNAT" targets support: module (pass)
CONFIG_NETFILTER_XT_MATCH_ADDRTYPE: "addrtype" address type match support: module (pass)
CONFIG_NETFILTER_XT_MATCH_CONNTRACK: "conntrack" connection tracking match support: module (pass)
CONFIG_NETFILTER_XT_MATCH_MULTIPORT: "multiport" Multiple port match support: module (pass)
CONFIG_NETFILTER_XT_MATCH_RECENT: "recent" match support: module (pass)
CONFIG_NETFILTER_XT_MATCH_STATISTIC: "statistic" match support: module (pass)
CONFIG_NETFILTER_NETLINK: module (pass)
CONFIG_NF_NAT: module (pass)
CONFIG_IP_SET: IP set support: module (pass)
CONFIG_IP_SET_HASH_IP: hash:ip set support: module (pass)
CONFIG_IP_SET_HASH_NET: hash:net set support: module (pass)
CONFIG_IP_VS: IP virtual server support: module (pass)
CONFIG_IP_VS_NFCT: Netfilter connection tracking: built-in (pass)
CONFIG_IP_VS_SH: Source hashing scheduling: module (pass)
CONFIG_IP_VS_RR: Round-robin scheduling: module (pass)
CONFIG_IP_VS_WRR: Weighted round-robin scheduling: module (pass)
CONFIG_NF_CONNTRACK_IPV4: IPv4 connetion tracking support (required for NAT): unknown (warning)
CONFIG_NF_REJECT_IPV4: IPv4 packet rejection: module (pass)
CONFIG_NF_NAT_IPV4: IPv4 NAT: unknown (warning)
CONFIG_IP_NF_IPTABLES: IP tables support: built-in (pass)
CONFIG_IP_NF_FILTER: Packet filtering: module (pass)
CONFIG_IP_NF_TARGET_REJECT: REJECT target support: module (pass)
CONFIG_IP_NF_NAT: iptables NAT support: module (pass)
CONFIG_IP_NF_MANGLE: Packet mangling: module (pass)
CONFIG_NF_DEFRAG_IPV4: module (pass)
CONFIG_NF_CONNTRACK_IPV6: IPv6 connetion tracking support (required for NAT): unknown (warning)
CONFIG_NF_NAT_IPV6: IPv6 NAT: unknown (warning)
CONFIG_IP6_NF_IPTABLES: IP6 tables support: module (pass)
CONFIG_IP6_NF_FILTER: Packet filtering: module (pass)
CONFIG_IP6_NF_MANGLE: Packet mangling: module (pass)
CONFIG_IP6_NF_NAT: ip6tables NAT support: module (pass)
CONFIG_NF_DEFRAG_IPV6: module (pass)
CONFIG_BRIDGE: 802.1d Ethernet Bridging: module (pass)
CONFIG_LLC: module (pass)
CONFIG_STP: module (pass)
CONFIG_EXT4_FS: The Extended 4 (ext4) filesystem: module (pass)
CONFIG_PROC_FS: /proc file system support: built-in (pass)
```
## What happened?
When installing a fresh cluster, I noticed that the CoreDNS pods are not being deployed. I SSH'd into each controller node and verified that the manifest exists under `/var/lib/k0s/manifests/coredns/`:
```console
$ ls -lhA /var/lib/k0s/manifests/coredns/
total 8.0K
-rw-r--r--. 1 root root 4.4K Dec 23 12:47 coredns.yaml
```
If I apply this manifest manually, CoreDNS spins up fine. The issue coincided with my disabling `dynamicConfig`, which I did because it failed to handle Helm deployments properly (updates to the list of deployments didn't take effect). After resetting the cluster and re-enabling `dynamicConfig`, the CoreDNS pods spin up fine.

I verified this behaviour on both 1.28.4 and 1.27.8. I also waited 12 hours to rule out a merely delayed deployment, but CoreDNS still did not deploy.
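For reference, the manual apply described above can be sketched roughly like this; the `k0s kubectl` invocation and the `k8s-app=kube-dns` label selector are assumptions based on k0s and upstream CoreDNS defaults, not taken from this report:

```shell
# Run on a controller node: apply the generated CoreDNS manifest using
# k0s's embedded kubectl (exact invocation is a sketch).
sudo k0s kubectl apply -f /var/lib/k0s/manifests/coredns/coredns.yaml

# Check that the CoreDNS pods come up. The label selector assumes the
# upstream CoreDNS deployment convention (k8s-app=kube-dns).
sudo k0s kubectl -n kube-system get pods -l k8s-app=kube-dns
```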
Pods with `dynamicConfig: false`:

```text
NAMESPACE NAME READY STATUS RESTARTS AGE
k0s-system k0s-pushgateway-6c5d8c54cf-khmbv 1/1 Running 0 2m34s
kube-system calico-kube-controllers-84c6cd5b85-mjllc 1/1 Running 0 2m34s
kube-system calico-node-2z2nk 1/1 Running 0 2m27s
kube-system calico-node-4q2t8 1/1 Running 0 2m22s
kube-system calico-node-7mhgb 1/1 Running 0 2m22s
kube-system calico-node-hxwcq 1/1 Running 0 2m27s
kube-system konnectivity-agent-7tdwc 1/1 Running 0 2m22s
kube-system konnectivity-agent-kxkkn 1/1 Running 0 2m27s
kube-system konnectivity-agent-p8p9b 1/1 Running 0 2m27s
kube-system konnectivity-agent-xdbf2 1/1 Running 0 2m22s
kube-system kube-proxy-49756 1/1 Running 0 2m27s
kube-system kube-proxy-k2sdt 1/1 Running 0 2m27s
kube-system kube-proxy-kzpt5 1/1 Running 0 2m22s
kube-system kube-proxy-vrzxd 1/1 Running 0 2m22s
kube-system metrics-server-7556957bb7-88w9c 1/1 Running 0 2m34s
kube-system nllb-sehar01-dev01-w01 1/1 Running 0 76s
kube-system nllb-sehar01-dev01-w02 1/1 Running 0 79s
kube-system nllb-sehar01-dev01-w03 1/1 Running 0 72s
kube-system nllb-sehar01-dev01-w04 1/1 Running 0 79s
metallb metallb-controller-5f9bb77dcd-kzp9j 1/1 Running 0 2m34s
metallb metallb-speaker-46lhm 4/4 Running 0 2m15s
metallb metallb-speaker-dv2g6 4/4 Running 0 2m9s
metallb metallb-speaker-fp5nz 4/4 Running 0 2m22s
metallb metallb-speaker-xk89s 4/4 Running 0 2m20s
```
Pods with `dynamicConfig: true`:

```text
NAMESPACE NAME READY STATUS RESTARTS AGE
k0s-system k0s-pushgateway-6c5d8c54cf-qrtfd 1/1 Running 0 2m28s
kube-system calico-kube-controllers-84c6cd5b85-7tn46 1/1 Running 0 2m28s
kube-system calico-node-7xmqb 1/1 Running 0 2m10s
kube-system calico-node-9qwgt 1/1 Running 0 2m10s
kube-system calico-node-r2lcc 1/1 Running 0 2m10s
kube-system calico-node-vn2s4 1/1 Running 0 2m5s
kube-system coredns-85df575cdb-66wcw 1/1 Running 0 2m28s
kube-system coredns-85df575cdb-d2vh2 1/1 Running 0 2m2s
kube-system konnectivity-agent-dldcd 1/1 Running 0 2m10s
kube-system konnectivity-agent-qlm44 1/1 Running 0 2m10s
kube-system konnectivity-agent-sgxrb 1/1 Running 0 2m5s
kube-system konnectivity-agent-wt8mf 1/1 Running 0 2m10s
kube-system kube-proxy-8cz6f 1/1 Running 0 2m10s
kube-system kube-proxy-f2qqj 1/1 Running 0 2m5s
kube-system kube-proxy-hcnlj 1/1 Running 0 2m10s
kube-system kube-proxy-kphxv 1/1 Running 0 2m10s
kube-system metrics-server-7556957bb7-4pxwv 1/1 Running 0 2m20s
kube-system nllb-sehar01-dev01-w01 1/1 Running 0 48s
kube-system nllb-sehar01-dev01-w02 1/1 Running 0 58s
kube-system nllb-sehar01-dev01-w03 1/1 Running 0 57s
kube-system nllb-sehar01-dev01-w04 1/1 Running 0 40s
metallb metallb-controller-5f9bb77dcd-8vfzk 1/1 Running 0 2m27s
metallb metallb-speaker-9r2zt 4/4 Running 0 2m5s
metallb metallb-speaker-f78ft 4/4 Running 0 2m3s
metallb metallb-speaker-qdlp9 4/4 Running 0 2m1s
metallb metallb-speaker-sfbjp 4/4 Running 0 117s
```
## Steps to reproduce

- Install a k0s cluster with `dynamicConfig: false`
## Expected behavior

CoreDNS should spin up.
## Actual behavior

CoreDNS does not deploy to the cluster.
## Additional context

k0sctl.yaml:
```yaml
apiVersion: k0sctl.k0sproject.io/v1beta1
kind: Cluster
metadata:
  name: my-cluster-name
spec:
  hosts:
    - ssh:
        address: 172.30.2.2
        user: user
        port: 22
        keyPath: path
      role: controller
      privateInterface: ens192
      installFlags:
        - --enable-metrics-scraper
    - ssh:
        address: 172.30.2.3
        user: user
        port: 22
        keyPath: path
      role: controller
      privateInterface: ens192
      installFlags:
        - --enable-metrics-scraper
    - ssh:
        address: 172.30.2.4
        user: user
        port: 22
        keyPath: path
      role: controller
      privateInterface: ens192
      installFlags:
        - --enable-metrics-scraper
    - ssh:
        address: 172.30.2.130
        user: user
        port: 22
        keyPath: path
      role: worker
      privateInterface: ens192
    - ssh:
        address: 172.30.2.131
        user: user
        port: 22
        keyPath: path
      role: worker
      privateInterface: ens192
    - ssh:
        address: 172.30.2.132
        user: user
        port: 22
        keyPath: path
      role: worker
      privateInterface: ens192
    - ssh:
        address: 172.30.2.133
        user: user
        port: 22
        keyPath: path
      role: worker
      privateInterface: ens192
  k0s:
    version: 1.28.4+k0s.0
    dynamicConfig: false
    config:
      spec:
        extensions:
          helm:
            repositories:
              - name: metallb
                url: https://metallb.github.io/metallb
            charts:
              - name: metallb
                chartname: metallb/metallb
                namespace: metallb
                order: 0
                values: |
                  speaker:
                    logLevel: warn
        network:
          nodeLocalLoadBalancing:
            enabled: true
          provider: calico
          calico:
            envVars:
              FELIX_FEATUREDETECTOVERRIDE: ChecksumOffloadBroken=true
```
> Upon resetting and enabling dynamicConfig again, they spin up fine.
Did you reset the cluster before restarting it with `dynamicConfig: false`? Did you check the leading controller's logs for errors when applying the CoreDNS stack? You should be able to re-trigger the apply process without a controller restart simply by touching the coredns.yaml file.
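The re-trigger suggested above can be sketched as follows; the manifest path comes from this issue, while the `k0scontroller` unit name assumes a default `k0s install controller` setup:

```shell
# On the leading controller: k0s's stack applier watches the manifest
# directory, so bumping the file's mtime makes it reprocess the stack
# without restarting the controller.
sudo touch /var/lib/k0s/manifests/coredns/coredns.yaml

# Then follow the controller logs for apply errors (unit name is an
# assumption based on a default systemd install):
sudo journalctl -u k0scontroller -f | grep -i coredns
```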
> it failed to handle helm deployments properly - updates to the list of deployments didn't take effect
Would you mind filing a separate issue about that?
@twz123 I did reset the cluster with each attempt. All the data I collected at the time is in the issue. I can find time to attempt to reproduce the issue within a week or so.
As for the other issue (related to helm) - since I switched off dynamicConfig, I haven't collected sufficient data to make a good error report.
> I can find time to attempt to reproduce the issue within a week or so.
Cool!
> As for the other issue (related to helm) - since I switched off dynamicConfig, I haven't collected sufficient data to make a good error report.
Alright. Feel free to file another issue whenever it occurs again.