dind not working using ubuntu 22.04.1 LTS image
/kind bug
1. What kops version are you running? The command kops version will display this information.
1.24.1
2. What Kubernetes version are you running? kubectl version will print the version if a cluster is running, or provide the Kubernetes version specified as a kops flag.
1.22.12
3. What cloud provider are you using? AWS
4. What commands did you run? What is the simplest way to reproduce this issue? Added the following image to the Instance group: 099720109477/ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-20220810
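For reference, the AMI change above is made by editing the instance group spec. A minimal sketch, assuming a worker instance group named "nodes" (the name and the omitted fields are illustrative):

```yaml
# kops edit ig nodes --name cluster-name
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  name: nodes
spec:
  image: 099720109477/ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-20220810
```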
5. What happened after the commands executed?
We have configured some pods with dind containers that work with the image:
099720109477/ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20220810
but after changing to the Ubuntu 22.04 image, the logs show the error:
cgroup mountpoint does not exist
6. What did you expect to happen? The same behaviour on both Ubuntu AMIs.
7. Please provide your cluster manifest. Execute
kops get --name my.example.com -o yaml
to display your cluster manifest.
You may want to remove your cluster name and other sensitive information.
apiVersion: kops.k8s.io/v1alpha2
kind: Cluster
metadata:
  creationTimestamp: "2018-09-14T12:20:46Z"
  generation: 79
  name: cluster-name
spec:
  api:
    loadBalancer:
      class: Classic
      type: Internal
  authorization:
    rbac: {}
  certManager:
    enabled: true
    managed: false
  channel: stable
  cloudConfig:
    awsEBSCSIDriver:
      enabled: true
  cloudControllerManager: {}
  cloudLabels:
    label1: label
    env: prod
  cloudProvider: aws
  configBase: s3://bucket name
  containerRuntime: containerd
  containerd:
    registryMirrors:
      docker.io:
      - https://proxy.url.com
      - https://registry-1.docker.io
    version: 1.6.6
  dnsZone: cluster domain
  etcdClusters:
  - etcdMembers:
    - encryptedVolume: true
      instanceGroup: master-eu-central-1a
      name: a
    - encryptedVolume: true
      instanceGroup: master-eu-central-1b
      name: b
    - encryptedVolume: true
      instanceGroup: master-eu-central-1c
      name: c
    manager:
      env:
      - name: ETCD_MANAGER_HOURLY_BACKUPS_RETENTION
        value: 15d
      - name: ETCD_MANAGER_DAILY_BACKUPS_RETENTION
        value: 90d
    name: main
  - etcdMembers:
    - encryptedVolume: true
      instanceGroup: master-eu-central-1a
      name: a
    - encryptedVolume: true
      instanceGroup: master-eu-central-1b
      name: b
    - encryptedVolume: true
      instanceGroup: master-eu-central-1c
      name: c
    manager:
      env:
      - name: ETCD_MANAGER_HOURLY_BACKUPS_RETENTION
        value: 15d
      - name: ETCD_MANAGER_DAILY_BACKUPS_RETENTION
        value: 90d
    name: events
  externalPolicies:
    master:
    - arn:aws:iam::account:policy/efs_csi_volume_policy
    - arn:aws:iam::account:policy:policy/ebs_csi_volume_policy
    - arn:aws:iam::account:policy:policy/additional_master_node_policy
    node:
    - arn:aws:iam::account:policy:policy/efs_csi_volume_policy
    - arn:aws:iam::account:policy:policy/additional_worker_node_policy
    - arn:aws:iam::account:policy:policy/ebs_csi_volume_policy
  iam:
    legacy: false
  kubeAPIServer:
    allowPrivileged: true
  kubeDNS:
    nodeLocalDNS:
      enabled: true
    provider: CoreDNS
  kubelet:
    anonymousAuth: false
    authenticationTokenWebhook: true
    authorizationMode: Webhook
  kubernetesApiAccess:
  - 0.0.0.0/0
  kubernetesVersion: v1.22.12
  masterInternalName: api.xxxxx
  masterPublicName: api.xxxxx
  metricsServer:
    enabled: true
  networkCIDR: 10.xxx
  networkID: vpc-xxx
  networking:
    flannel:
      backend: vxlan
  nonMasqueradeCIDR: 100.xxx
  sshAccess:
  - 10.xxx
  subnets:
  - cidr: 10.xxx
    name: eu-central-1a
    type: Private
    zone: eu-central-1a
  - cidr: 10.xxx
    name: eu-central-1b
    type: Private
    zone: eu-central-1b
  - cidr: 10.xxx
    name: eu-central-1c
    type: Private
    zone: eu-central-1c
  - cidr: 10.xxx
    name: public-eu-central-1a
    type: Utility
    zone: eu-central-1a
  - cidr: 10.xxx
    name: public-eu-central-1b
    type: Utility
    zone: eu-central-1b
  - cidr: 10.xxx
    name: public-eu-central-1c
    type: Utility
    zone: eu-central-1c
  topology:
    dns:
      type: Public
    masters: private
    nodes: private
9. Anything else do we need to know? I wanted to ask if there is some setting that could be enabled in the cluster spec regarding cgroups. I've found this one, but I think it's not related to this issue. I also found the following PRs related to cgroup enablement (in this case the systemd driver should be enabled by default, since we are using k8s 1.22.12): https://github.com/kubernetes/kops/pull/12917 https://github.com/kubernetes/kops/pull/10846
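One cluster-spec knob in this area is the kubelet cgroup driver. A minimal sketch (whether this affects dind containers at all is exactly the open question here):

```yaml
spec:
  kubelet:
    cgroupDriver: systemd
```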
This is the pod template we are using for dind:
  - name: dind
    env:
    - name: DOCKER_TLS_CERTDIR
      value: ''
    image: docker:stable-dind
    command: [ 'dockerd-entrypoint.sh', '--registry-mirror', 'https://url' ]
    securityContext:
      privileged: true
    volumeMounts:
    - name: dind-storage
      mountPath: /var/lib/docker
What I suspect is that the cgroup mountpoint is somehow missing from the Ubuntu 22.04 base AMI, but since kops supports the image, I just wanted to ask whether I've missed some configuration. As mentioned before, keeping the same k8s version and the latest kops version 1.24.1 but reverting back to the Ubuntu 20.04 AMI, the builds started working again. Thanks for the help!
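For what it's worth, the likely difference between the two AMIs is the cgroup hierarchy: Ubuntu 22.04 boots with the unified cgroup v2 hierarchy, while older dockerd builds such as the one in docker:stable-dind expect the legacy v1 mounts, which matches the "cgroup mountpoint does not exist" error. A small hypothetical helper (the function name is mine) to tell the two apart on a node:

```python
import os


def cgroup_version(root: str = "/sys/fs/cgroup") -> str:
    """Report which cgroup hierarchy is mounted at the given root."""
    # On cgroup v2 (the unified hierarchy) the mount root exposes a
    # cgroup.controllers file; on cgroup v1 it is a tmpfs containing
    # per-controller sub-mounts (cpu, memory, ...) instead.
    if os.path.exists(os.path.join(root, "cgroup.controllers")):
        return "v2"
    if os.path.isdir(root):
        return "v1"
    return "unknown"
```

Running this inside the dind container (or on the node over SSH) on both AMIs would confirm whether the base image's cgroup layout is really what changed.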
We know of several issues with Ubuntu 22.04. So while you can use it as a base image, I wouldn't call it supported. We'll track this issue though.
Thanks for the confirmation @olemarkus. I've reverted back to Ubuntu 20.04 and everything is working fine. The ticket can serve as tracking as you said.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/lifecycle rotten
/remove-lifecycle rotten
/lifecycle stale
/lifecycle rotten
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.