Occasional "exec format error"
Image I'm using:
AMI: bottlerocket-aws-k8s-1.22-x86_64-v1.9.0-159e4ced
EKS:
Server Version: version.Info{Major:"1", Minor:"22+", GitVersion:"v1.22.10-eks-84b4fe6", GitCommit:"cc6a1b4915a99f49f5510ef0667f94b9ca832a8a", GitTreeState:"clean", BuildDate:"2022-06-09T18:24:04Z", GoVersion:"go1.16.15", Compiler:"gc", Platform:"linux/amd64"}
Occasionally, containers refuse to start. For example:
➜ ~ k -n argocd logs argocd-server-67f9f4788c-jx6nv
exec /usr/local/bin/argocd-server: exec format error
➜ ~ k -n argocd get pod -o wide | grep argocd-server
argocd-server-67f9f4788c-jx6nv 0/1 CrashLoopBackOff 28 (2m25s ago) 120m 10.98.70.121 ip-10-98-70-152.eu-west-1.compute.internal <none> <none>
argocd-server-67f9f4788c-nnxxc 1/1 Running 0 17h 10.98.60.136 ip-10-98-60-34.eu-west-1.compute.internal <none> <none>
➜ ~ k -n argocd get pod -l app.kubernetes.io/name=argocd-server -o yaml | grep image
image: quay.io/argoproj/argocd:v2.4.7
imagePullPolicy: Always
image: quay.io/argoproj/argocd:v2.4.7
imageID: quay.io/argoproj/argocd@sha256:f887f854ab22f7f29f915aae2b765f2948d1555d61e9ce3ca9e659f8df22ab2b
image: quay.io/argoproj/argocd:v2.4.7
imagePullPolicy: Always
image: quay.io/argoproj/argocd:v2.4.7
imageID: quay.io/argoproj/argocd@sha256:f887f854ab22f7f29f915aae2b765f2948d1555d61e9ce3ca9e659f8df22ab2b
➜ ~ k describe node ip-10-98-70-152.eu-west-1.compute.internal ip-10-98-60-34.eu-west-1.compute.internal
Name: ip-10-98-70-152.eu-west-1.compute.internal
Roles: <none>
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/instance-type=m5a.large
beta.kubernetes.io/os=linux
bottlerocket.aws/updater-interface-version=2.0.0
cni=cilium
eks.amazonaws.com/capacityType=SPOT
eks.amazonaws.com/nodegroup=sta-0c008cd98efe613ce-20220812085801371900000017
eks.amazonaws.com/nodegroup-image=ami-05ef4fee9a548939d
eks.amazonaws.com/sourceLaunchTemplateId=lt-0e03ae3eb8ac34efd
eks.amazonaws.com/sourceLaunchTemplateVersion=2
failure-domain.beta.kubernetes.io/region=eu-west-1
failure-domain.beta.kubernetes.io/zone=eu-west-1c
k8s.io/cloud-provider-aws=2a1dc32348e49d8c3d205b11025253c8
kubernetes.io/arch=amd64
kubernetes.io/hostname=ip-10-98-70-152.eu-west-1.compute.internal
kubernetes.io/os=linux
node.kubernetes.io/instance-type=m5a.large
topology.ebs.csi.aws.com/zone=eu-west-1c
topology.kubernetes.io/region=eu-west-1
topology.kubernetes.io/zone=eu-west-1c
Annotations: csi.volume.kubernetes.io/nodeid: {"ebs.csi.aws.com":"i-0059589916b9e311c"}
io.cilium.network.ipv4-cilium-host: 10.98.70.90
io.cilium.network.ipv4-health-ip: 10.98.70.88
io.cilium.network.ipv4-pod-cidr: 10.152.0.0/16
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Mon, 15 Aug 2022 21:06:22 +0200
Taints: <none>
Unschedulable: false
Lease:
HolderIdentity: ip-10-98-70-152.eu-west-1.compute.internal
AcquireTime: <unset>
RenewTime: Tue, 16 Aug 2022 14:06:34 +0200
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
ReadonlyFilesystem False Tue, 16 Aug 2022 14:03:31 +0200 Mon, 15 Aug 2022 21:11:35 +0200 FilesystemIsNotReadOnly Filesystem is not read-only
KernelDeadlock False Tue, 16 Aug 2022 14:03:31 +0200 Mon, 15 Aug 2022 21:11:35 +0200 KernelHasNoDeadlock kernel has no deadlock
NetworkUnavailable False Mon, 15 Aug 2022 21:07:00 +0200 Mon, 15 Aug 2022 21:07:00 +0200 CiliumIsUp Cilium is running on this node
MemoryPressure False Tue, 16 Aug 2022 14:02:10 +0200 Mon, 15 Aug 2022 21:06:21 +0200 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Tue, 16 Aug 2022 14:02:10 +0200 Mon, 15 Aug 2022 21:06:21 +0200 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Tue, 16 Aug 2022 14:02:10 +0200 Mon, 15 Aug 2022 21:06:21 +0200 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Tue, 16 Aug 2022 14:02:10 +0200 Tue, 16 Aug 2022 11:32:04 +0200 KubeletReady kubelet is posting ready status
Addresses:
InternalIP: 10.98.70.152
Hostname: ip-10-98-70-152.eu-west-1.compute.internal
InternalDNS: ip-10-98-70-152.eu-west-1.compute.internal
Capacity:
attachable-volumes-aws-ebs: 25
cpu: 2
ephemeral-storage: 41261776Ki
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 7877168Ki
pods: 29
Allocatable:
attachable-volumes-aws-ebs: 25
cpu: 1930m
ephemeral-storage: 36953110875
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 7186992Ki
pods: 29
System Info:
Machine ID: ec294af3fb7450c1a90fe38fb1e8d2e9
System UUID: ec294af3-fb74-50c1-a90f-e38fb1e8d2e9
Boot ID: f9433ce1-8279-403d-9006-30f9e7607973
Kernel Version: 5.10.130
OS Image: Bottlerocket OS 1.9.0 (aws-k8s-1.22)
Operating System: linux
Architecture: amd64
Container Runtime Version: containerd://1.6.6+bottlerocket
Kubelet Version: v1.22.10-eks-7dc61e8
Kube-Proxy Version: v1.22.10-eks-7dc61e8
ProviderID: aws:///eu-west-1c/i-0059589916b9e311c
Non-terminated Pods: (24 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
argocd argocd-redis-ha-haproxy-5b46945689-f4j5s 10m (0%) 0 (0%) 100Mi (1%) 256Mi (3%) 14h
argocd argocd-redis-ha-server-1 20m (1%) 0 (0%) 96Mi (1%) 320Mi (4%) 16h
argocd argocd-server-67f9f4788c-jx6nv 10m (0%) 0 (0%) 64Mi (0%) 256Mi (3%) 121m
aws-ebs-csi-driver ebs-csi-node-fm96d 30m (1%) 0 (0%) 36M (0%) 387M (5%) 16h
backoffice-cm backoffice-cm-deployment-64b7d7f579-56qvg 500m (25%) 0 (0%) 1Gi (14%) 2Gi (29%) 5h6m
brupop-bottlerocket-aws brupop-agent-f45ps 10m (0%) 0 (0%) 34M (0%) 45M (0%) 16h
brupop-bottlerocket-aws brupop-apiserver-54f6795bd-4256b 10m (0%) 0 (0%) 34M (0%) 45M (0%) 14h
brupop-bottlerocket-aws brupop-controller-deployment-8c778d976-w64db 10m (0%) 0 (0%) 34M (0%) 45M (0%) 14h
cert-manager cert-manager-cainjector-6f7597f6f4-hgwgv 10m (0%) 0 (0%) 76M (1%) 1565M (21%) 14h
cilium cilium-csl7q 100m (5%) 0 (0%) 100Mi (1%) 0 (0%) 154m
csi-snapshots csi-snapshotter-0 30m (1%) 0 (0%) 72M (0%) 294M (3%) 14h
goldilocks goldilocks-controller-57d695bdf5-tzcj2 22m (1%) 321m (16%) 161Mi (2%) 1503Mi (21%) 14h
goldilocks goldilocks-dashboard-6b5b46dbc8-99z8t 10m (0%) 101m (5%) 34Mi (0%) 218Mi (3%) 14h
goldilocks vpa-recommender-6b97cd4fb4-vx8bm 10m (0%) 101m (5%) 49Mi (0%) 458Mi (6%) 14h
ingress-nginx-internal ingress-nginx-controller-7764f4fddd-2q56k 10m (0%) 0 (0%) 108M (1%) 1104M (15%) 14h
keda keda-metrics-apiserver-6c487b448d-p2mnk 100m (5%) 0 (0%) 100Mi (1%) 1000Mi (14%) 14h
kube-system coredns-5947f47f5f-94t4s 100m (5%) 0 (0%) 70Mi (0%) 170Mi (2%) 14h
kube-system kube-proxy-7xmrt 100m (5%) 0 (0%) 0 (0%) 0 (0%) 17h
kube-system metrics-server-848988b695-wm9m5 10m (0%) 0 (0%) 36M (0%) 834M (11%) 16h
node-problem-detector node-problem-detector-5556g 10m (0%) 0 (0%) 34M (0%) 1122M (15%) 16h
polaris polaris-dashboard-c696fb87-j9g7s 100m (5%) 0 (0%) 128Mi (1%) 512Mi (7%) 5h6m
postgres-operator ext-postgres-operator-dbdb98546-56xg5 10m (0%) 0 (0%) 34M (0%) 246M (3%) 5h6m
prometheus prometheus-prometheus-node-exporter-g9wmh 10m (0%) 0 (0%) 16Mi (0%) 32Mi (0%) 16h
promtail promtail-q79pc 100m (5%) 0 (0%) 64Mi (0%) 512Mi (7%) 16h
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 1332m (69%) 523m (27%)
memory 2601443456 (35%) 13325876160 (181%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
attachable-volumes-aws-ebs 0 0
Events: <none>
Name: ip-10-98-60-34.eu-west-1.compute.internal
Roles: <none>
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/instance-type=m5.large
beta.kubernetes.io/os=linux
bottlerocket.aws/updater-interface-version=2.0.0
cni=cilium
eks.amazonaws.com/capacityType=SPOT
eks.amazonaws.com/nodegroup=sta-0566348ba06653b8a-20220812085801373500000019
eks.amazonaws.com/nodegroup-image=ami-05ef4fee9a548939d
eks.amazonaws.com/sourceLaunchTemplateId=lt-08fd57448fa5945b5
eks.amazonaws.com/sourceLaunchTemplateVersion=2
failure-domain.beta.kubernetes.io/region=eu-west-1
failure-domain.beta.kubernetes.io/zone=eu-west-1b
k8s.io/cloud-provider-aws=2a1dc32348e49d8c3d205b11025253c8
kubernetes.io/arch=amd64
kubernetes.io/hostname=ip-10-98-60-34.eu-west-1.compute.internal
kubernetes.io/os=linux
node.kubernetes.io/instance-type=m5.large
topology.ebs.csi.aws.com/zone=eu-west-1b
topology.kubernetes.io/region=eu-west-1
topology.kubernetes.io/zone=eu-west-1b
Annotations: csi.volume.kubernetes.io/nodeid: {"ebs.csi.aws.com":"i-052270355dc863cb5"}
io.cilium.network.ipv4-cilium-host: 10.98.60.134
io.cilium.network.ipv4-health-ip: 10.98.60.172
io.cilium.network.ipv4-pod-cidr: 10.34.0.0/16
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Mon, 15 Aug 2022 20:59:53 +0200
Taints: <none>
Unschedulable: false
Lease:
HolderIdentity: ip-10-98-60-34.eu-west-1.compute.internal
AcquireTime: <unset>
RenewTime: Tue, 16 Aug 2022 14:06:38 +0200
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
KernelDeadlock False Tue, 16 Aug 2022 14:04:50 +0200 Mon, 15 Aug 2022 21:02:53 +0200 KernelHasNoDeadlock kernel has no deadlock
ReadonlyFilesystem False Tue, 16 Aug 2022 14:04:50 +0200 Mon, 15 Aug 2022 21:02:53 +0200 FilesystemIsNotReadOnly Filesystem is not read-only
NetworkUnavailable False Mon, 15 Aug 2022 21:00:28 +0200 Mon, 15 Aug 2022 21:00:28 +0200 CiliumIsUp Cilium is running on this node
MemoryPressure False Tue, 16 Aug 2022 14:05:00 +0200 Mon, 15 Aug 2022 21:02:47 +0200 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Tue, 16 Aug 2022 14:05:00 +0200 Mon, 15 Aug 2022 21:02:47 +0200 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Tue, 16 Aug 2022 14:05:00 +0200 Mon, 15 Aug 2022 21:02:47 +0200 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Tue, 16 Aug 2022 14:05:00 +0200 Tue, 16 Aug 2022 11:32:24 +0200 KubeletReady kubelet is posting ready status
Addresses:
InternalIP: 10.98.60.34
Hostname: ip-10-98-60-34.eu-west-1.compute.internal
InternalDNS: ip-10-98-60-34.eu-west-1.compute.internal
Capacity:
attachable-volumes-aws-ebs: 25
cpu: 2
ephemeral-storage: 41261776Ki
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 7918136Ki
pods: 29
Allocatable:
attachable-volumes-aws-ebs: 25
cpu: 1930m
ephemeral-storage: 36953110875
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 7227960Ki
pods: 29
System Info:
Machine ID: ec2b47de995c9eb1d1e7be45efe6de37
System UUID: ec2b47de-995c-9eb1-d1e7-be45efe6de37
Boot ID: 66b5b8bd-d9d7-4a44-b4e1-4500dd7f25bd
Kernel Version: 5.10.130
OS Image: Bottlerocket OS 1.9.0 (aws-k8s-1.22)
Operating System: linux
Architecture: amd64
Container Runtime Version: containerd://1.6.6+bottlerocket
Kubelet Version: v1.22.10-eks-7dc61e8
Kube-Proxy Version: v1.22.10-eks-7dc61e8
ProviderID: aws:///eu-west-1b/i-052270355dc863cb5
Non-terminated Pods: (28 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
argocd argocd-applicationset-controller-84c7fbf59c-d7cch 10m (0%) 0 (0%) 50Mi (0%) 100Mi (1%) 17h
argocd argocd-redis-ha-haproxy-5b46945689-4l8px 10m (0%) 0 (0%) 100Mi (1%) 256Mi (3%) 17h
argocd argocd-repo-server-6ff96df8d6-vxc7f 500m (25%) 0 (0%) 128Mi (1%) 1536Mi (21%) 14h
argocd argocd-server-67f9f4788c-nnxxc 10m (0%) 0 (0%) 64Mi (0%) 256Mi (3%) 17h
aws-ebs-csi-driver ebs-csi-controller-7d7dbd98-kg6vj 60m (3%) 0 (0%) 96M (1%) 776M (10%) 17h
aws-ebs-csi-driver ebs-csi-node-zwlll 30m (1%) 0 (0%) 36M (0%) 387M (5%) 17h
backoffice-cm kafka-connect-deployment-c79668bc6-x9fdr 0 (0%) 0 (0%) 0 (0%) 0 (0%) 17h
brupop-bottlerocket-aws brupop-agent-gtdvt 10m (0%) 0 (0%) 34M (0%) 45M (0%) 17h
brupop-bottlerocket-aws brupop-apiserver-54f6795bd-gh7q7 10m (0%) 0 (0%) 34M (0%) 45M (0%) 17h
cert-manager cert-manager-788b7f5656-gnz6f 10m (0%) 0 (0%) 35M (0%) 727M (9%) 17h
cert-manager cert-manager-webhook-8664f454c9-4z5nm 10m (0%) 0 (0%) 34M (0%) 472M (6%) 14h
cilium cilium-45tn7 100m (5%) 0 (0%) 100Mi (1%) 0 (0%) 154m
cilium hubble-ui-5d9b569687-f4rcw 0 (0%) 0 (0%) 0 (0%) 0 (0%) 17h
csi-snapshots snapshot-controller-65b949c64c-v92dq 10m (0%) 0 (0%) 34M (0%) 98M (1%) 14h
ingress-nginx ingress-nginx-controller-6755cfb894-8kvl5 10m (0%) 0 (0%) 108M (1%) 1104M (14%) 17h
keda keda-metrics-apiserver-6c487b448d-wr828 100m (5%) 0 (0%) 100Mi (1%) 1000Mi (14%) 17h
keda keda-operator-8449c9c8b8-mmzx6 100m (5%) 0 (0%) 100Mi (1%) 1000Mi (14%) 17h
kube-downscaler kube-downscaler-86cbf4f595-h28dv 10m (0%) 0 (0%) 49M (0%) 459M (6%) 17h
kube-system cluster-autoscaler-5fb96c46c7-chhjb 10m (0%) 0 (0%) 125M (1%) 1482M (20%) 17h
kube-system coredns-5947f47f5f-ks6zp 100m (5%) 0 (0%) 70Mi (0%) 170Mi (2%) 17h
kube-system descheduler-d78748c74-6j98w 10m (0%) 0 (0%) 34Mi (0%) 1Gi (14%) 17h
kube-system kube-proxy-w789r 100m (5%) 0 (0%) 0 (0%) 0 (0%) 17h
kube-system metrics-server-848988b695-7tmzv 10m (0%) 0 (0%) 36M (0%) 834M (11%) 17h
node-problem-detector node-problem-detector-8jzhx 10m (0%) 0 (0%) 34M (0%) 1122M (15%) 17h
polaris polaris-dashboard-c696fb87-55hsd 100m (5%) 0 (0%) 128Mi (1%) 512Mi (7%) 122m
prometheus prometheus-prometheus-node-exporter-dwjz9 10m (0%) 0 (0%) 16Mi (0%) 32Mi (0%) 17h
prometheus prometheus-prometheus-prometheus-0 310m (16%) 200m (10%) 1138Mi (16%) 4402Mi (62%) 17h
promtail promtail-wxqph 100m (5%) 0 (0%) 64Mi (0%) 512Mi (7%) 17h
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 1750m (90%) 200m (10%)
memory 2848620992 (38%) 18875620800 (255%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
attachable-volumes-aws-ebs 0 0
Events: <none>
This happens randomly to various containers (prometheus, etc.). And no, I'm not trying to run images compiled for a different architecture.
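For completeness, one way to double-check the architecture claim (a sketch, assuming `docker` and `jq` on a workstation) is to compare the platforms the image is published for against the nodes' architecture labels:

```bash
# List the platforms in the image's manifest list (assumes a multi-arch image)
docker manifest inspect quay.io/argoproj/argocd:v2.4.7 \
  | jq '.manifests[]?.platform | {os, architecture}'

# Show each node's reported architecture label
kubectl get nodes -L kubernetes.io/arch
```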
How to reproduce the problem: Honestly, I cannot figure this out; it happens randomly. I thought it might be related to kubearmor or cilium (IPAM mode), but it's not.
NOTE: I left clusters running on the AL2 image for a few days and this never happened.
Any help appreciated.
Hi, thanks for opening this issue.
Is Bottlerocket 1.9.0 the first version you've used? I'm wondering if you've hit this problem in previous Bottlerocket releases. This could help us narrow down the issue.
Otherwise, I'm wondering if you could check `dmesg` or the host's systemd journal to see if there are e.g. SELinux-related access denials or other indicators of an error on the Bottlerocket host. I'd also be curious to have a sense of what settings you've enabled via user data or the Bottlerocket API.
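For reference, a minimal sketch of how to pull that from a Bottlerocket host, assuming the admin container is enabled (`sheltie` is the admin container's helper for entering the host namespaces):

```bash
# From inside the admin container:
sudo sheltie                 # root shell in the host namespaces

# Then, from the host shell: look for SELinux denials and other kernel-level errors
journalctl -k --no-pager | grep -iE 'avc|denied|selinux' | tail -n 50
dmesg | tail -n 100
```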
I'm going to try to reproduce this issue. How did you install Argo, Prometheus, etc? via Helm charts, or some other method?
> Hi, thanks for opening this issue.
> Is Bottlerocket 1.9.0 the first version you've used? I'm wondering if you've hit this problem in previous Bottlerocket releases. This could help us narrow down the issue.
> Otherwise, I'm wondering if you could check `dmesg` or the host's systemd journal to see if there are e.g. SELinux-related access denials or other indicators of an error on the Bottlerocket host. I'd also be curious to have a sense of what settings you've enabled via user data or the Bottlerocket API.
@cbgbt
I didn't see that in 1.8, at least for the month before 1.9 was released. I occasionally used older releases for POCs, just to present this flavour to a wider group of technical people.
No user data provided at all, and no API settings changed - just the vanilla AMI from AWS.
For dmesg, I'll need to reprovision nodes with an SSH key first, then figure out how to run the admin container on them (I've never had a need to run it before). I'll do that next week and come back with more details the next time I see this error.
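For what it's worth, the admin container can also be reached without an SSH key, via the SSM control container. A sketch of the commands as I understand them from the Bottlerocket docs:

```bash
# From an SSM session into the control container:
enable-admin-container                              # helper shipped in the control container
# or, equivalently, through the Bottlerocket API:
apiclient set host-containers.admin.enabled=true

# Then drop into the admin container:
apiclient exec admin bash
```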
> I'm going to try to reproduce this issue. How did you install Argo, Prometheus, etc? via Helm charts, or some other method?
@mchaker
I never run `helm install`; instead I template some charts and apply overlays with kustomize. Anyway, manifests for both Argo and Prometheus are available here
Do you think https://kubearmor.io might be the cause as well?
> I never run `helm install`; instead I template some charts and apply overlays with kustomize. Anyway, manifests for both Argo and Prometheus are available here
I see. Would it be possible for you to try some things out in order to help me narrow down the issue?
- Run `kubectl describe pod FAILING_POD_NAME_HERE` from wherever you administer your kubernetes cluster
- Find the `Container ID` for the failing container in that pod and copy/save that `Container ID` somewhere -- we'll need it soon. I'll call it `LONG_HEX_CONTAINER_ID_HERE` later in these steps.
- Gain access to a Bottlerocket node with that failing container
- (Enable and) gain access to the admin container on that Bottlerocket node (e.g. `apiclient set host-containers.admin.enabled=true` via SSM or from within the control container)
- From within the admin container:
  - Run `cat /proc/self/mountinfo | grep LONG_HEX_CONTAINER_ID_HERE` and make sure you see a line including `/.bottlerocket/rootfs/run/containerd/...`
  - Run `ls -al /.bottlerocket/rootfs/run/containerd/io.containerd.runtime.v2.task/k8s.io/LONG_HEX_CONTAINER_ID_HERE/rootfs/PATH_TO_YOUR_ENTRY_POINT_EXECUTABLE/ENTRY_POINT_EXECUTABLE` (we want to check the entry point executable's size to make sure it isn't zero or truncated)
  - Run `yum install binutils` and accept/continue with installing the `binutils` package
  - Run `readelf -h /.bottlerocket/rootfs/run/containerd/io.containerd.runtime.v2.task/k8s.io/LONG_HEX_CONTAINER_ID_HERE/rootfs/PATH_TO_YOUR_ENTRY_POINT_EXECUTABLE/ENTRY_POINT_EXECUTABLE` and look at the `Class` and `Machine` attributes to make sure they are what you expect for the architecture.
Could you let me know what you find?
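Put together, those checks amount to something like this rough sketch, run from the admin container (the container ID and entry point path below are placeholders to fill in from `kubectl describe pod`):

```bash
# Placeholders: fill in the failing container's ID and its entry point path inside the image
CONTAINER_ID="LONG_HEX_CONTAINER_ID_HERE"
ENTRYPOINT="usr/local/bin/argocd-server"   # example path, no leading slash
ROOTFS="/.bottlerocket/rootfs/run/containerd/io.containerd.runtime.v2.task/k8s.io/${CONTAINER_ID}/rootfs"

grep "${CONTAINER_ID}" /proc/self/mountinfo          # should show the task's rootfs mount
ls -al "${ROOTFS}/${ENTRYPOINT}"                     # a zero-byte file here is the smoking gun
readelf -h "${ROOTFS}/${ENTRYPOINT}" | grep -E 'Class|Machine'   # readelf comes from `yum install binutils`
```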
FYI, I've brought my cluster to the point where I can observe that error. I'm waiting for some failures and will report back ASAP.
So this happened again:
prometheus-prometheus-prometheus-0 2/3 CrashLoopBackOff 8 (4m42s ago) 16m
Container IDs:
init-config-reloader:
Container ID: containerd://8055db641f8ae4ca44bd57ff664c3c53659c695d5a57ca2b15dc2c626b9276d4
--
/etc/prometheus/rules/prometheus-prometheus-prometheus-rulefiles-0 from prometheus-prometheus-prometheus-rulefiles-0 (rw)
/var/run/secrets/eks.amazonaws.com/serviceaccount from aws-iam-token (ro)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-wfw95 (ro)
Containers:
prometheus:
Container ID: containerd://da493ba3c0564b76405fdb9687c2f5ea603964fa659c15b1b6a51a6cb9fdf42b
--
/var/run/secrets/eks.amazonaws.com/serviceaccount from aws-iam-token (ro)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-wfw95 (ro)
config-reloader:
Container ID: containerd://7fc14acdb6ae2faff47a941619820dc73784ec2f0c3535d96e2b49b21387c883
--
/var/run/secrets/eks.amazonaws.com/serviceaccount from aws-iam-token (ro)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-wfw95 (ro)
thanos-sidecar:
Container ID: containerd://80f669405b0033e01518738aba0c4a1429e62f98835c63da7f9cab4b04a389e3
But:
[root@admin]# ls -al /.bottlerocket/rootfs/run/containerd/io.containerd.runtime.v2.task/k8s.io/8055db641f8ae4ca44bd57ff664c3c53659c695d5a57ca2b15dc2c626b9276d4/
ls: cannot access /.bottlerocket/rootfs/run/containerd/io.containerd.runtime.v2.task/k8s.io/8055db641f8ae4ca44bd57ff664c3c53659c695d5a57ca2b15dc2c626b9276d4/: No such file or directory
[root@admin]# ls -al /.bottlerocket/rootfs/run/containerd/io.containerd.runtime.v2.task/k8s.io/da493ba3c0564b76405fdb9687c2f5ea603964fa659c15b1b6a51a6cb9fdf42b/
ls: cannot access /.bottlerocket/rootfs/run/containerd/io.containerd.runtime.v2.task/k8s.io/da493ba3c0564b76405fdb9687c2f5ea603964fa659c15b1b6a51a6cb9fdf42b/: No such file or directory
[root@admin]# ls -al /.bottlerocket/rootfs/run/containerd/io.containerd.runtime.v2.task/k8s.io/7fc14acdb6ae2faff47a941619820dc73784ec2f0c3535d96e2b49b21387c883/
total 40
drwx------. 3 root root 240 Aug 29 15:02 .
drwx--x--x. 42 root root 840 Aug 29 15:18 ..
-rw-r--r--. 1 root root 89 Aug 29 15:02 address
-rw-r--r--. 1 root root 8415 Aug 29 15:02 config.json
-rw-r--r--. 1 root root 4 Aug 29 15:02 init.pid
prwx------. 1 root root 0 Aug 29 15:02 log
-rw-r--r--. 1 root root 0 Aug 29 15:02 log.json
-rw-------. 1 root root 47 Aug 29 15:02 options.json
drwxr-xr-x. 1 root root 4096 Aug 29 15:02 rootfs
-rw-------. 1 root root 7 Aug 29 15:02 runtime
-rw-------. 1 root root 32 Aug 29 15:02 shim-binary-path
lrwxrwxrwx. 1 root root 121 Aug 29 15:02 work -> /var/lib/containerd/io.containerd.runtime.v2.task/k8s.io/7fc14acdb6ae2faff47a941619820dc73784ec2f0c3535d96e2b49b21387c883
[root@admin]# ls -al /.bottlerocket/rootfs/run/containerd/io.containerd.runtime.v2.task/k8s.io/80f669405b0033e01518738aba0c4a1429e62f98835c63da7f9cab4b04a389e3/
total 36
drwx------. 3 root root 240 Aug 29 15:12 .
drwx--x--x. 42 root root 840 Aug 29 15:18 ..
-rw-r--r--. 1 root root 89 Aug 29 15:12 address
-rw-r--r--. 1 root root 7994 Aug 29 15:12 config.json
-rw-r--r--. 1 root root 5 Aug 29 15:12 init.pid
prwx------. 1 root root 0 Aug 29 15:12 log
-rw-r--r--. 1 root root 0 Aug 29 15:12 log.json
-rw-------. 1 root root 47 Aug 29 15:12 options.json
drwxr-xr-x. 1 root root 4096 Aug 29 15:12 rootfs
-rw-------. 1 root root 7 Aug 29 15:12 runtime
-rw-------. 1 root root 32 Aug 29 15:12 shim-binary-path
lrwxrwxrwx. 1 root root 121 Aug 29 15:12 work -> /var/lib/containerd/io.containerd.runtime.v2.task/k8s.io/80f669405b0033e01518738aba0c4a1429e62f98835c63da7f9cab4b04a389e3
I assume that since this container is not running, I cannot access its filesystem with the path pattern you mentioned, @mchaker.
I thought it might be due to the disk being full, but it looks fine:
Filesystem Size Used Avail Use% Mounted on
overlay 40G 9.7G 29G 26% /
tmpfs 64M 0 64M 0% /dev
shm 64M 0 64M 0% /dev/shm
tmpfs 64M 4.0K 64M 1% /run
tmpfs 3.8G 476K 3.8G 1% /etc/hosts
/dev/nvme2n1p1 40G 9.7G 29G 26% /.bottlerocket/support
tmpfs 1.6G 17M 1.5G 2% /run/api.sock
/dev/root 904M 581M 261M 70% /.bottlerocket/rootfs
tmpfs 3.8G 0 3.8G 0% /sys/fs/cgroup
devtmpfs 3.8G 0 3.8G 0% /.bottlerocket/rootfs/dev
tmpfs 3.8G 0 3.8G 0% /.bottlerocket/rootfs/dev/shm
tmpfs 4.0M 0 4.0M 0% /.bottlerocket/rootfs/sys/fs/cgroup
overlay 40G 9.7G 29G 26% /.bottlerocket/rootfs/run/host-containerd/io.containerd.runtime.v2.task/default/control/rootfs
tmpfs 3.8G 16K 3.8G 1% /.bottlerocket/rootfs/etc/cni
tmpfs 3.8G 12K 3.8G 1% /.bottlerocket/rootfs/etc/host-containers
tmpfs 3.8G 4.0K 3.8G 1% /.bottlerocket/rootfs/etc/containerd
tmpfs 3.8G 0 3.8G 0% /.bottlerocket/rootfs/tmp
overlay 40G 9.7G 29G 26% /.bottlerocket/rootfs/opt/cni/bin
/dev/nvme0n1p12 36M 860K 32M 3% /.bottlerocket/rootfs/var/lib/bottlerocket
/dev/loop1 12M 12M 0 100% /.bottlerocket/rootfs/var/lib/kernel-devel/.overlay/lower
overlay 40G 9.7G 29G 26% /usr/lib/modules
/dev/loop0 384K 384K 0 100% /.bottlerocket/rootfs/x86_64-bottlerocket-linux-gnu/sys-root/usr/share/licenses
/dev/nvme0n1p8 14M 13M 539K 96% /.bottlerocket/rootfs/boot
overlay 40G 9.7G 29G 26% /usr/src/kernels
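One extra check that might be worth doing at this point (a sketch): a filesystem can also run out of inodes while `df -h` still shows free space, so checking inode usage on the volume that backs the image layers rules that out too.

```bash
# From the admin container: inode usage on the data volume holding containerd's state
df -i /.bottlerocket/rootfs/var/lib/containerd
```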
Node events:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Starting 34m kube-proxy
Normal Starting 37m kube-proxy
Normal NodeSchedulable 40m kubelet Node ip-10-200-60-82.eu-west-1.compute.internal status is now: NodeSchedulable
Normal Starting 37m kubelet Starting kubelet.
Warning InvalidDiskCapacity 37m kubelet invalid capacity 0 on image filesystem
Normal NodeHasSufficientPID 37m (x2 over 37m) kubelet Node ip-10-200-60-82.eu-west-1.compute.internal status is now: NodeHasSufficientPID
Normal NodeAllocatableEnforced 37m kubelet Updated Node Allocatable limit across pods
Warning Rebooted 37m kubelet Node ip-10-200-60-82.eu-west-1.compute.internal has been rebooted, boot id: 254fa6e2-e5e7-4676-8689-ea48d93eab17
Normal NodeNotReady 37m kubelet Node ip-10-200-60-82.eu-west-1.compute.internal status is now: NodeNotReady
Normal NodeHasSufficientMemory 37m (x2 over 37m) kubelet Node ip-10-200-60-82.eu-west-1.compute.internal status is now: NodeHasSufficientMemory
Normal NodeHasNoDiskPressure 37m (x2 over 37m) kubelet Node ip-10-200-60-82.eu-west-1.compute.internal status is now: NodeHasNoDiskPressure
Normal NodeReady 37m kubelet Node ip-10-200-60-82.eu-west-1.compute.internal status is now: NodeReady
Warning InvalidDiskCapacity 34m kubelet invalid capacity 0 on image filesystem
Normal NodeHasSufficientMemory 34m (x2 over 34m) kubelet Node ip-10-200-60-82.eu-west-1.compute.internal status is now: NodeHasSufficientMemory
Normal NodeHasNoDiskPressure 34m (x2 over 34m) kubelet Node ip-10-200-60-82.eu-west-1.compute.internal status is now: NodeHasNoDiskPressure
Normal NodeHasSufficientPID 34m (x2 over 34m) kubelet Node ip-10-200-60-82.eu-west-1.compute.internal status is now: NodeHasSufficientPID
Normal NodeAllocatableEnforced 34m kubelet Updated Node Allocatable limit across pods
Warning Rebooted 34m kubelet Node ip-10-200-60-82.eu-west-1.compute.internal has been rebooted, boot id: c01886cf-2895-44f5-abcc-53a05cc2d79c
Normal NodeNotReady 34m kubelet Node ip-10-200-60-82.eu-west-1.compute.internal status is now: NodeNotReady
Normal Starting 34m kubelet Starting kubelet.
Normal NodeReady 34m kubelet Node ip-10-200-60-82.eu-west-1.compute.internal status is now: NodeReady
Pods:
~ k get pod -o wide -A | grep ip-10-200-60-82.eu-west-1.compute.internal
aws-ebs-csi-driver ebs-csi-node-hqg78 3/3 Running 6 (35m ago) 5h9m 10.200.60.33 ip-10-200-60-82.eu-west-1.compute.internal <none> <none>
brupop-bottlerocket-aws brupop-agent-d7w5b 1/1 Running 2 (35m ago) 5h9m 10.200.60.10 ip-10-200-60-82.eu-west-1.compute.internal <none> <none>
cilium cilium-6lpnb 1/1 Running 2 (35m ago) 5h9m 10.200.60.82 ip-10-200-60-82.eu-west-1.compute.internal <none> <none>
grafana grafana-878447b59-qq6kp 4/4 Running 0 31m 10.200.60.134 ip-10-200-60-82.eu-west-1.compute.internal <none> <none>
kube-system coredns-6ddcb96f44-lqnwr 1/1 Running 0 31m 10.200.60.127 ip-10-200-60-82.eu-west-1.compute.internal <none> <none>
kube-system kube-proxy-xf7qm 1/1 Running 2 (35m ago) 5h9m 10.200.60.82 ip-10-200-60-82.eu-west-1.compute.internal <none> <none>
kubearmor kubearmor-fcs4l 1/1 Running 2 (35m ago) 109m 10.200.60.82 ip-10-200-60-82.eu-west-1.compute.internal <none> <none>
linkerd2-cni linkerd-cni-r94dm 1/1 Running 2 (35m ago) 109m 10.200.60.82 ip-10-200-60-82.eu-west-1.compute.internal <none> <none>
loki loki-loki-distributed-gateway-b5c68f69c-k8lfr 1/1 Running 0 31m 10.200.60.128 ip-10-200-60-82.eu-west-1.compute.internal <none> <none>
loki loki-loki-distributed-query-frontend-6d4cc5555b-26nhn 1/1 Running 0 31m 10.200.60.16 ip-10-200-60-82.eu-west-1.compute.internal <none> <none>
node-problem-detector node-problem-detector-mxxtq 1/1 Running 2 (35m ago) 5h9m 10.200.60.82 ip-10-200-60-82.eu-west-1.compute.internal <none> <none>
prometheus prometheus-prometheus-node-exporter-6nnmc 1/1 Running 2 (35m ago) 5h9m 10.200.60.82 ip-10-200-60-82.eu-west-1.compute.internal <none> <none>
prometheus prometheus-prometheus-prometheus-0 2/3 CrashLoopBackOff 14 (3m38s ago) 36m 10.200.60.27 ip-10-200-60-82.eu-west-1.compute.internal <none> <none>
promtail promtail-nwwhw 1/1 Running 2 (35m ago) 5h9m 10.200.60.104 ip-10-200-60-82.eu-west-1.compute.internal <none> <none>
secrets-store-csi-driver csi-secrets-store-provider-aws-qzjzx 1/1 Running 2 (35m ago) 5h9m 10.200.60.82 ip-10-200-60-82.eu-west-1.compute.internal <none> <none>
secrets-store-csi-driver secrets-store-csi-driver-z5b67 3/3 Running 6 (35m ago) 5h9m 10.200.60.235 ip-10-200-60-82.eu-west-1.compute.internal <none> <none>
Also bottlerocket-update-operator:
~ k get brs -n brupop-bottlerocket-aws brs-ip-10-200-60-82.eu-west-1.compute.internal
brs-ip-10-200-60-82.eu-west-1.compute.internal Idle 1.9.1 Idle
I'm guessing these reboots are related to the operator updating to the newest release.
I think I've got something. This is how thanos looks:
[root@admin]# find .bottlerocket/rootfs/ -iname 'thanos' | grep bin 2>/dev/null
.bottlerocket/rootfs/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/362/fs/bin/thanos
[root@admin]# readelf -h .bottlerocket/rootfs/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/362/fs/bin/thanos
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x46c640
Start of program headers: 64 (bytes into file)
Start of section headers: 456 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 7
Size of section headers: 64 (bytes)
Number of section headers: 23
Section header string table index: 3
[root@admin]# ls -lh .bottlerocket/rootfs/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/362/fs/bin/thanos
-rwxr-xr-x. 1 root root 66M Jul 5 14:59 .bottlerocket/rootfs/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/362/fs/bin/thanos
For prometheus:
[root@admin]# find .bottlerocket/rootfs/ -iname 'prometheus' | grep bin 2>/dev/null
.bottlerocket/rootfs/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/316/fs/bin/prometheus
[root@admin]# readelf -h .bottlerocket/rootfs/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/316/fs/bin/prometheus
readelf: .bottlerocket/rootfs/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/316/fs/bin/prometheus: Error: Failed to read file's magic number
[root@admin]# ls -lh .bottlerocket/rootfs/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/316/fs/bin/prometheus
-rwxr-xr-x. 1 root root 0 Jul 14 15:16 .bottlerocket/rootfs/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/316/fs/bin/prometheus
So indeed, the entrypoint is corrupted - truncated to zero bytes.
@mchaker any luck reproducing this issue? Or maybe an idea about the corrupted entrypoints? Maybe it would be worth pinging our friends from the containerd project?
I think the issue might be that the disk attached to Bottlerocket was too small; since I've extended it, this has never happened again 🤦
If the container layer fails to unpack because the filesystem is out of space, I suppose that could leave zero-sized files around.
I'm surprised that this wouldn't bubble up as an error from containerd, relayed to kubelet via CRI. But perhaps the partially unpacked layer might not get cleaned up and would then be reused by a later attempt to run the same pod.
In any case this seems like a good direction to investigate - if it's easier to repro with a nearly full disk, that will help with finding the root cause.
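A quick way to scan for that failure mode from the admin container might be (a sketch, reusing the snapshot paths seen earlier in this thread): zero-byte files under `bin/` directories in unpacked layers are a strong hint that an unpack was cut short.

```bash
# Look for suspiciously empty executables in containerd's unpacked image layers
find /.bottlerocket/rootfs/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots \
     -type f -size 0 -path '*/bin/*' 2>/dev/null
```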
This happened to me again today for a container with `imagePullPolicy: IfNotPresent` and 50GB of free disk space.
Not a lot to go on here. If/when this is hit again, can we capture some of the `lsblk` or other storage info, the variant and version, etc., to start collecting data in this issue and see if any kind of pattern emerges?
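Something along these lines would probably cover it (a sketch; run from the admin container, adjust as needed):

```bash
cat /.bottlerocket/rootfs/etc/os-release   # Bottlerocket variant and version
                                           # (`apiclient get os` may also report this, depending on the release)
lsblk -f                                   # block devices and filesystems
df -h; df -i                               # free space and inode usage
sudo sheltie                               # then, from the host shell:
#   journalctl -u containerd.service --no-pager | tail -n 200
```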
I don't have any new information to share yet - but I will say that we've started seeing this intermittently over the last few weeks as well. We saw it mostly on Bottlerocket 1.13.1, and we have since upgraded to 1.14.1. When we see the issue next, we'll do some deep diving on it. Are there particular logs/things we should look into when it happens?
We've just hit this with some redis-sentinel containers. We've confirmed that the entrypoint.sh is truncated to 0 bytes. Once it's in this state, restarting the container never re-downloads the image, so it seems to be stuck.
This only started happening after we upgraded from:
- 1.24 k8s -> 1.25 k8s
- 1.11.1 bottlerocket -> 1.12.0 bottlerocket
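For anyone stuck in that state, a pragmatic workaround (a sketch, not an upstream-confirmed fix) is to drain and replace the affected node, so the image layers get unpacked fresh elsewhere:

```bash
# <node-name> is a placeholder for the node holding the corrupted layer
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data
# then terminate the instance and let the node group replace it
```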