Airgap deployment doesn't work
Before creating an issue, make sure you've checked the following:
- [x] You are running the latest released version of k0s
- [x] Make sure you've searched for existing issues, both open and closed
- [x] Make sure you've searched for PRs too, a fix might've been merged already
- [x] You're looking at docs for the released version, "main" branch docs are usually ahead of released versions.
Platform
```
Linux 5.4.0-144-generic #161-Ubuntu SMP Fri Feb 3 14:49:04 UTC 2023 x86_64 GNU/Linux
NAME="Ubuntu"
VERSION="20.04.5 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu-Server 20.04.5 v2.0 LTS (Cubic 2023-01-10 09:05)"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
```
Version
v1.30.4+k0s.0
Sysinfo
```
$ k0s sysinfo
Total memory: 62.5 GiB (pass)
Disk space available for /var/lib/k0s: 1.5 TiB (pass)
Name resolution: localhost: [127.0.0.1] (pass)
Operating system: Linux (pass)
Linux kernel release: 5.4.0-144-generic (pass)
Max. file descriptors per process: current: 1048576 / max: 1048576 (pass)
AppArmor: active (pass)
Executable in PATH: modprobe: /usr/sbin/modprobe (pass)
Executable in PATH: mount: /usr/bin/mount (pass)
Executable in PATH: umount: /usr/bin/umount (pass)
/proc file system: mounted (0x9fa0) (pass)
Control Groups: version 1 (pass)
cgroup controller "cpu": available (pass)
cgroup controller "cpuacct": available (pass)
cgroup controller "cpuset": available (pass)
cgroup controller "memory": available (pass)
cgroup controller "devices": available (pass)
cgroup controller "freezer": available (pass)
cgroup controller "pids": available (pass)
cgroup controller "hugetlb": available (pass)
cgroup controller "blkio": available (pass)
CONFIG_CGROUPS: Control Group support: built-in (pass)
CONFIG_CGROUP_FREEZER: Freezer cgroup subsystem: built-in (pass)
CONFIG_CGROUP_PIDS: PIDs cgroup subsystem: built-in (pass)
CONFIG_CGROUP_DEVICE: Device controller for cgroups: built-in (pass)
CONFIG_CPUSETS: Cpuset support: built-in (pass)
CONFIG_CGROUP_CPUACCT: Simple CPU accounting cgroup subsystem: built-in (pass)
CONFIG_MEMCG: Memory Resource Controller for Control Groups: built-in (pass)
CONFIG_CGROUP_HUGETLB: HugeTLB Resource Controller for Control Groups: built-in (pass)
CONFIG_CGROUP_SCHED: Group CPU scheduler: built-in (pass)
CONFIG_FAIR_GROUP_SCHED: Group scheduling for SCHED_OTHER: built-in (pass)
CONFIG_CFS_BANDWIDTH: CPU bandwidth provisioning for FAIR_GROUP_SCHED: built-in (pass)
CONFIG_BLK_CGROUP: Block IO controller: built-in (pass)
CONFIG_NAMESPACES: Namespaces support: built-in (pass)
CONFIG_UTS_NS: UTS namespace: built-in (pass)
CONFIG_IPC_NS: IPC namespace: built-in (pass)
CONFIG_PID_NS: PID namespace: built-in (pass)
CONFIG_NET_NS: Network namespace: built-in (pass)
CONFIG_NET: Networking support: built-in (pass)
CONFIG_INET: TCP/IP networking: built-in (pass)
CONFIG_IPV6: The IPv6 protocol: built-in (pass)
CONFIG_NETFILTER: Network packet filtering framework (Netfilter): built-in (pass)
CONFIG_NETFILTER_ADVANCED: Advanced netfilter configuration: built-in (pass)
CONFIG_NF_CONNTRACK: Netfilter connection tracking support: module (pass)
CONFIG_NETFILTER_XTABLES: Netfilter Xtables support: module (pass)
CONFIG_NETFILTER_XT_TARGET_REDIRECT: REDIRECT target support: module (pass)
CONFIG_NETFILTER_XT_MATCH_COMMENT: "comment" match support: module (pass)
CONFIG_NETFILTER_XT_MARK: nfmark target and match support: module (pass)
CONFIG_NETFILTER_XT_SET: set target and match support: module (pass)
CONFIG_NETFILTER_XT_TARGET_MASQUERADE: MASQUERADE target support: module (pass)
CONFIG_NETFILTER_XT_NAT: "SNAT and DNAT" targets support: module (pass)
CONFIG_NETFILTER_XT_MATCH_ADDRTYPE: "addrtype" address type match support: module (pass)
CONFIG_NETFILTER_XT_MATCH_CONNTRACK: "conntrack" connection tracking match support: module (pass)
CONFIG_NETFILTER_XT_MATCH_MULTIPORT: "multiport" Multiple port match support: module (pass)
CONFIG_NETFILTER_XT_MATCH_RECENT: "recent" match support: module (pass)
CONFIG_NETFILTER_XT_MATCH_STATISTIC: "statistic" match support: module (pass)
CONFIG_NETFILTER_NETLINK: module (pass)
CONFIG_NF_NAT: module (pass)
CONFIG_IP_SET: IP set support: module (pass)
CONFIG_IP_SET_HASH_IP: hash:ip set support: module (pass)
CONFIG_IP_SET_HASH_NET: hash:net set support: module (pass)
CONFIG_IP_VS: IP virtual server support: module (pass)
CONFIG_IP_VS_NFCT: Netfilter connection tracking: built-in (pass)
CONFIG_IP_VS_SH: Source hashing scheduling: module (pass)
CONFIG_IP_VS_RR: Round-robin scheduling: module (pass)
CONFIG_IP_VS_WRR: Weighted round-robin scheduling: module (pass)
CONFIG_NF_CONNTRACK_IPV4: IPv4 connetion tracking support (required for NAT): unknown (warning)
CONFIG_NF_REJECT_IPV4: IPv4 packet rejection: module (pass)
CONFIG_NF_NAT_IPV4: IPv4 NAT: unknown (warning)
CONFIG_IP_NF_IPTABLES: IP tables support: module (pass)
CONFIG_IP_NF_FILTER: Packet filtering: module (pass)
CONFIG_IP_NF_TARGET_REJECT: REJECT target support: module (pass)
CONFIG_IP_NF_NAT: iptables NAT support: module (pass)
CONFIG_IP_NF_MANGLE: Packet mangling: module (pass)
CONFIG_NF_DEFRAG_IPV4: module (pass)
CONFIG_NF_CONNTRACK_IPV6: IPv6 connetion tracking support (required for NAT): unknown (warning)
CONFIG_NF_NAT_IPV6: IPv6 NAT: unknown (warning)
CONFIG_IP6_NF_IPTABLES: IP6 tables support: module (pass)
CONFIG_IP6_NF_FILTER: Packet filtering: module (pass)
CONFIG_IP6_NF_MANGLE: Packet mangling: module (pass)
CONFIG_IP6_NF_NAT: ip6tables NAT support: module (pass)
CONFIG_NF_DEFRAG_IPV6: module (pass)
CONFIG_BRIDGE: 802.1d Ethernet Bridging: module (pass)
CONFIG_LLC: module (pass)
CONFIG_STP: module (pass)
CONFIG_EXT4_FS: The Extended 4 (ext4) filesystem: built-in (pass)
CONFIG_PROC_FS: /proc file system support: built-in (pass)
```
What happened?
On a new installation of v1.30.4+k0s.0 using the airgap bundle, I get an error at cluster startup:
```
$ k0s kubectl get pods -Aw
NAMESPACE     NAME                                       READY   STATUS              RESTARTS   AGE
kube-system   calico-kube-controllers-688fb7db9f-nfjkm   0/1     Pending             0          36m
kube-system   calico-node-p9v5d                          0/1     Init:0/1            0          36m
kube-system   coredns-74f779ff84-g8vsp                   0/1     Pending             0          36m
kube-system   kube-proxy-69nlm                           0/1     ContainerCreating   0          36m
kube-system   metrics-server-5cc4f44b94-nwb7f            0/1     Pending             0          36m

$ k0s kubectl -n kube-system describe pod calico-node-p9v5d
...
Events:
  Type     Reason                  Age                    From               Message
  ----     ------                  ----                   ----               -------
  Normal   Scheduled               35m                    default-scheduler  Successfully assigned kube-system/calico-node-p9v5d to platform-airgap-ci-01
  Warning  DNSConfigForming        15m (x29 over 35m)     kubelet            Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 8.8.8.8 8.8.4.4 192.168.90.251
  Warning  FailedCreatePodSandBox  5m13s (x24 over 33m)   kubelet            Failed to create pod sandbox: rpc error: code = DeadlineExceeded desc = failed to get sandbox image "registry.k8s.io/pause:3.8": failed to pull image "registry.k8s.io/pause:3.8": failed to pull and unpack image "registry.k8s.io/pause:3.8": failed to resolve reference "registry.k8s.io/pause:3.8": failed to do request: Head "https://registry.k8s.io/v2/pause/manifests/3.8": dial tcp 34.96.108.209:443: i/o timeout
  Warning  FailedCreatePodSandBox  11s (x24 over 35m)     kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to get sandbox image "registry.k8s.io/pause:3.8": failed to pull image "registry.k8s.io/pause:3.8": failed to pull and unpack image "registry.k8s.io/pause:3.8": failed to resolve reference "registry.k8s.io/pause:3.8": failed to do request: Head "https://registry.k8s.io/v2/pause/manifests/3.8": dial tcp 34.96.108.209:443: i/o timeout
```
The container image `registry.k8s.io/pause:3.8` is not present in the airgap bundle from the release:
```
$ cat index.json | jq
{
  "schemaVersion": 2,
  "manifests": [
    ...
    {
      "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
      "digest": "sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097",
      "size": 2405,
      "annotations": {
        "io.containerd.image.name": "registry.k8s.io/pause:3.9",
        "org.opencontainers.image.ref.name": "3.9"
      }
    },
    ...
  ]
}
```
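The image names shipped in the bundle can be listed directly from its `index.json`. This is a runnable sketch: it simulates the bundle's `index.json` (using the single `pause:3.9` manifest entry shown above) in `/tmp`; in practice you would extract it from the tarball with `tar -xOf k0s-airgap-bundle.tar index.json`:

```shell
# Simulate the bundle's index.json (in practice, extract it from the tarball:
#   tar -xOf /var/lib/k0s/images/k0s-airgap-bundle.tar index.json)
cat > /tmp/index.json <<'EOF'
{
  "schemaVersion": 2,
  "manifests": [
    {
      "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
      "digest": "sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097",
      "size": 2405,
      "annotations": { "io.containerd.image.name": "registry.k8s.io/pause:3.9" }
    }
  ]
}
EOF
# List every image name recorded in the bundle's annotations
jq -r '.manifests[].annotations["io.containerd.image.name"]' /tmp/index.json
# prints: registry.k8s.io/pause:3.9
```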
Steps to reproduce
- Perform an airgap installation of v1.30.4+k0s.0 in single-node mode, with the airgap bundle available
Expected behavior
Container `registry.k8s.io/pause:3.8` should be present in `airgap-images-list.txt` of version v1.30.4+k0s.0 on releases.
Actual behavior
Container `registry.k8s.io/pause:3.8` is not present in `airgap-images-list.txt` of version v1.30.4+k0s.0 on releases.
Screenshots and logs
No response
Additional context
No response
> I made an update from v1.23.17+k0s.0 to v1.30.4+k0s.0
Wow, that's brave. Updates are only supported from minor version to minor version, basically along the lines of what the Kubernetes Version Skew Policy mandates.
> Container `registry.k8s.io/pause:3.8` should be present in `airgap-images-list.txt` of version v1.30.4+k0s.0 on releases.
I think you may be suffering from hardcoded image versions in the k0s configuration. Can you check if you've explicitly specified the images in the k0s config? Unless you're using a private registry or something, it's usually best not to specify them at all. For context, older k0s releases used to include the images in the generated default configuration, but stopped doing so because of exactly the problem you're having here.
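A quick way to check for pinned images is to grep the `images` section of the k0s config. This is a hedged sketch: the config path and the pinned `3.8` below are hypothetical examples of the problem (the real config usually lives at `/etc/k0s/k0s.yaml` or is embedded in the k0sctl spec), simulated in `/tmp` so the commands run anywhere:

```shell
# Hypothetical k0s config with a hardcoded pause image version (the problem case).
# /tmp/k0s.yaml stands in for the real config file.
cat > /tmp/k0s.yaml <<'EOF'
spec:
  images:
    pause:
      image: registry.k8s.io/pause
      version: "3.8"
EOF
# If this prints a pinned version, it overrides the release's default images
grep -A3 'images:' /tmp/k0s.yaml
```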
@twz123
> Wow, that's brave. Updates are only supported from minor version to minor version, basically along the lines of what the Kubernetes Version Skew Policy mandates.
The error happened on a new installation, not an upgrade; I didn't go through with the upgrade after all. Sorry for my inaccuracy (I've edited the comment).
> I think you may be suffering from hardcoded image versions in the k0s configuration. Can you check if you've explicitly specified the images in the k0s config? Unless you're using a private registry or something, it's usually best not to specify them at all. For context, older k0s releases used to include the images in the generated default configuration, but [stopped doing so](https://github.com/k0sproject/k0s/issues/2587) because of exactly the problem you're having here.
I don't explicitly specify the images, so I'm using the k0s defaults.
K0s configures the pause image to be used by containerd according to the k0s configuration (which currently defaults to version 3.9). So if containerd is not picking that up, there has to be something off with the containerd configuration. Can you check your containerd configuration files? Can you also check the contents of `/run/k0s/containerd-cri.toml`?
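The knob that controls this in containerd's CRI plugin is `sandbox_image`. The following sketch simulates a generated CRI config under `/tmp` (the real file is `/run/k0s/containerd-cri.toml`; the exact file contents k0s renders are an assumption here, only the `sandbox_image` key is the real containerd option to look for):

```shell
# Simulated CRI config; the real one is rendered by k0s at /run/k0s/containerd-cri.toml.
# sandbox_image is containerd's CRI-plugin setting for the pause image.
cat > /tmp/containerd-cri.toml <<'EOF'
version = 2
[plugins."io.containerd.grpc.v1.cri"]
  sandbox_image = "registry.k8s.io/pause:3.9"
EOF
# Check which pause image containerd would actually use
grep 'sandbox_image' /tmp/containerd-cri.toml
```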
@twz123 For containerd config:
```
$ cat /run/k0s/containerd-cri.toml
cat: /run/k0s/containerd-cri.toml: No such file or directory

$ cat /etc/k0s/containerd.toml
# This is the configuration for k0s managed containerD.
# For reference see https://github.com/containerd/containerd/blob/main/docs/man/containerd-config.toml.5.md
version = 2
[plugins]
  [plugins."io.containerd.grpc.v1.cri"]
    [plugins."io.containerd.grpc.v1.cri".containerd]
      [plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
        [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
          runtime_type = "io.containerd.runc.v2"
```
This is our k0sctl configuration, in case you need more information:
```yaml
apiVersion: k0sctl.k0sproject.io/v1beta1
kind: Cluster
metadata:
  name: custom-cluster
spec:
  hosts:
  - localhost:
      enabled: true
    role: single
    uploadBinary: true
    k0sBinaryPath: /opt/custom/core/bin/k0s
    installFlags:
    - --profile=custom-k0s-profile
    files:
    - name: k0s-bundle
      src: /opt/custom/core/images/k0s-airgap-bundle.tar
      dstDir: /var/lib/k0s/images/
      dst: ""
      perm: "0755"
      dirPerm: null
      user: ""
      group: ""
    - name: containerd-config
      src: /opt/custom/core/containerd.toml
      dstDir: /etc/k0s/
      dst: ""
      perm: "0755"
      dirPerm: null
      user: ""
      group: ""
  k0s:
    version: v1.30.4+k0s.0
    config:
      spec:
        api:
          extraArgs:
            feature-gates: HPAScaleToZero=true
        controllerManager:
          extraArgs:
            horizontal-pod-autoscaler-tolerance: "0.001"
        images:
          default_pull_policy: Never
        network:
          provider: calico
        telemetry:
          enabled: false
        workerProfiles:
        - name: custom-k0s-profile
          values:
            imageGCHighThresholdPercent: 100
            imageMinimumGCAge: 876000h
            maxPods: 200
        - name: custom-k0s-profile-with-cpu-optimization
          values:
            cpuManagerPolicy: static
            cpuManagerPolicyOptions:
              full-pcpus-only: "true"
            imageGCHighThresholdPercent: 100
            imageMinimumGCAge: 876000h
            maxPods: 200
            systemReserved:
              cpu: "4"
              memory: 1Gi
```
And the worker configuration:
```
$ cat /var/lib/k0s/worker-profile.yaml
data:
  apiServerAddresses: '["192.168.30.215:6443"]'
  konnectivity: '{"agentPort":8132}'
  kubeletConfiguration: '{"kind":"KubeletConfiguration","apiVersion":"kubelet.config.k8s.io/v1beta1","syncFrequency":"0s","fileCheckFrequency":"0s","httpCheckFrequency":"0s","tlsCipherSuites":["TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256","TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384","TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256","TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256","TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384","TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256"],"tlsMinVersion":"VersionTLS12","rotateCertificates":true,"serverTLSBootstrap":true,"authentication":{"x509":{},"webhook":{"cacheTTL":"0s"},"anonymous":{}},"authorization":{"webhook":{"cacheAuthorizedTTL":"0s","cacheUnauthorizedTTL":"0s"}},"eventRecordQPS":0,"clusterDomain":"cluster.local","clusterDNS":["10.96.0.10"],"streamingConnectionIdleTimeout":"0s","nodeStatusUpdateFrequency":"0s","nodeStatusReportFrequency":"0s","imageMinimumGCAge":"876000h0m0s","imageMaximumGCAge":"0s","imageGCHighThresholdPercent":100,"volumeStatsAggPeriod":"0s","cpuManagerReconcilePeriod":"0s","runtimeRequestTimeout":"0s","maxPods":200,"evictionPressureTransitionPeriod":"0s","failSwapOn":false,"memorySwap":{},"logging":{"flushFrequency":0,"verbosity":0,"options":{"text":{"infoBufferSize":"0"},"json":{"infoBufferSize":"0"}}},"shutdownGracePeriod":"0s","shutdownGracePeriodCriticalPods":"0s","containerRuntimeEndpoint":""}'
  nodeLocalLoadBalancing: '{"type":"EnvoyProxy","envoyProxy":{"image":{"image":"quay.io/k0sproject/envoy-distroless","version":"v1.30.4"},"imagePullPolicy":"Never","apiServerBindPort":7443,"konnectivityServerBindPort":7132}}'
  pauseImage: '{"image":"registry.k8s.io/pause","version":"3.9"}'
name: custom-k0s-profile
```
Can you delete the `/etc/k0s/containerd.toml` file and try again? It's an old one. The current version should have a header like this. I suppose it didn't get updated because you skipped several minor versions during the upgrade, and hence the code that did the migration to the newer versions wasn't executed.
@twz123 Okay, I think I understand what's wrong: I overwrote the containerd configuration to add credentials for my registry. There is a way to do this directly in k0s; I will try it and let you know if it works. Thank you very much for your time!
@twz123 I can confirm that with the predefined containerd configuration, everything works. Sorry for disturbing you, and thank you very much for your help.
Hi, how did you manage to get your airgap install working with the registry config? I'm running into a similar issue: when I apply my containerd.toml and restart k0s, or just deploy with my modified config, it gets stuck failing to pull any of the images and/or pause.
sorry forgot to @Its-Alex
@killergoalie I fixed it by using `/etc/k0s/containerd.d/` instead of completely erasing the default containerd configuration, as documented:
> As of 1.27.1, k0s allows dynamic configuration of containerd CRI runtimes. This works by k0s creating a special directory in /etc/k0s/containerd.d/ where users can place partial containerd configuration files.
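Such a drop-in can be sketched like this. The registry host and credentials are placeholders, and the directory is simulated under `/tmp` so the commands run anywhere (the real location is `/etc/k0s/containerd.d/`); the `registry.configs."<host>".auth` table is containerd's CRI-plugin mechanism for per-registry credentials:

```shell
# Hypothetical partial containerd config for private registry auth.
# Real directory: /etc/k0s/containerd.d/ (simulated here under /tmp).
mkdir -p /tmp/containerd.d
cat > /tmp/containerd.d/registry-auth.toml <<'EOF'
version = 2
[plugins."io.containerd.grpc.v1.cri".registry.configs."registry.example.com".auth]
  username = "myuser"
  password = "mypassword"
EOF
cat /tmp/containerd.d/registry-auth.toml
```

This way the k0s-managed defaults (including `sandbox_image`) stay intact, and only the registry credentials are layered on top.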
@Its-Alex Thanks. Last question: are you using a wildcard in your registry config, or explicitly calling out every registry? I'm trying to wrap my head around the V1/V2/V3 containerd config formats.
@killergoalie I'm not really sure what your problem is. I use something similar to this article. If I need to configure more than one registry, I use one config file per registry.
@Its-Alex I think my issue is that I'm trying to use the older V1 wildcard, as described here: https://github.com/containerd/cri/blob/release/1.4/docs/registry.md#configure-registry-endpoint
I think I found a path forward. In the end I didn't want to create a config for each mirror, just a blanket one for all of them, hence the old config format worked.
I did have a follow-up on secrets: are you using this for registry auth? https://kubernetes.io/docs/reference/kubectl/generated/kubectl_create/kubectl_create_secret_docker-registry/
@killergoalie Sorry, but I lack a clear understanding of your problem. From what I can gather:

> @Its-Alex I think my issue is that I'm trying to use the older V1 wildcard, as you can find here: https://github.com/containerd/cri/blob/release/1.4/docs/registry.md#configure-registry-endpoint

It seems you're trying to configure a proxy URL for multiple registries, not the registry itself.

> I did have a follow-up question about the secrets. Are you using this for registry authentication? https://kubernetes.io/docs/reference/kubectl/generated/kubectl_create/kubectl_create_secret_docker-registry/

There are many ways to handle registry authentication in Kubernetes, either directly in CRI or in Kubernetes itself (as explained in the documentation you shared).
I don't think this issue is related to your problem; could you use Stack Overflow, or open a separate issue, to resolve it?
@Its-Alex Thanks for your time; agreed, I think these are unrelated. I'll open an issue of my own. Thanks again.