amazon-eks-ami
containerd image pulling performance
We have trouble running our EKS clusters with the containerd container runtime. When pods with huge images are created at the same time on the same node, the Kubernetes node becomes unstable and all running pods on that node freeze.
To reproduce:
- you need an image with a huge number of small files, or use my test image:
paskalmaksim/stress-test-pulling-big-image:200k
FROM ubuntu:latest
WORKDIR /app
RUN seq -w 1 200000 | xargs -n1 -P8 -I% sh -c 'dd if=/dev/urandom of=file.% bs=$(shuf -i1-5 -n1) count=1024' 2> /dev/null
- you need a Kubernetes node; our scenario: m5.xlarge with a 50 GB gp3 volume (3000 IOPS, 125 MiB/s throughput). Drain the node:
kubectl drain <node name> --ignore-daemonsets --delete-emptydir-data
- run pods with the test image on this node:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: stress-test-pulling-big-image
  labels:
    app: stress-test-pulling-big-image
spec:
  strategy:
    type: Recreate
  replicas: 30
  selector:
    matchLabels:
      app: stress-test-pulling-big-image
  template:
    metadata:
      labels:
        app: stress-test-pulling-big-image
    spec:
      nodeSelector:
        kubernetes.io/hostname: <node name>
      tolerations:
      - effect: "NoSchedule"
        operator: "Exists"
      containers:
      - name: stress-test-pulling-big-image
        image: docker.io/paskalmaksim/stress-test-pulling-big-image:200k
        imagePullPolicy: Always
        resources:
          limits:
            cpu: 10m
            memory: 10Mi
        command:
        - sleep
        - 1d
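For a sense of scale, here is a rough estimate of what the test image's Dockerfile from the first step generates. This is only a back-of-the-envelope sketch in Python: the file count, `count=1024` and the 1 to 5 byte block size are read off the `seq`, `dd` and `shuf` arguments, while the 4 KiB filesystem block size is an assumption and not something measured on the node.

```python
# Parameters read off the Dockerfile of the test image.
FILE_COUNT = 200_000        # seq -w 1 200000
DD_COUNT = 1024             # dd count=1024
BS_CHOICES = range(1, 6)    # shuf -i1-5 -n1 -> dd block size in bytes (1..5)

# Assumption (not from the issue): the filesystem allocates in 4 KiB blocks.
FS_BLOCK = 4096


def file_sizes():
    """Possible sizes in bytes of a single generated file."""
    return [bs * DD_COUNT for bs in BS_CHOICES]


def expected_total_bytes():
    """Expected payload of all files, with bs uniform over 1..5."""
    sizes = file_sizes()
    return FILE_COUNT * sum(sizes) // len(sizes)


def expected_allocated_bytes(block=FS_BLOCK):
    """Expected on-disk allocation when each file is rounded up to whole blocks."""
    sizes = file_sizes()
    rounded = [-(-size // block) * block for size in sizes]  # ceiling division
    return FILE_COUNT * sum(rounded) // len(rounded)


if __name__ == "__main__":
    mib = 1024 * 1024
    print(f"per-file sizes: {file_sizes()} bytes")
    print(f"expected payload: {expected_total_bytes() / mib:.1f} MiB")
    print(f"expected allocation: {expected_allocated_bytes() / mib:.1f} MiB")
```

So the image is not especially large in bytes; what makes it heavy is that unpacking it means creating 200,000 separate small files.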
CPU usage and disk IOPS on the node go to 100%.
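If you want to check the disk IOPS number on your own node without extra tooling, something like the following Python sketch could be run on the node while the pods are starting. It only reads the standard `/proc/diskstats` counters (reads completed plus writes completed per device); the function names and the 5 second interval are mine, and the kernel-side count may not match exactly how EBS accounts for I/O.

```python
import time


def parse_diskstats(text):
    """Map device name -> completed reads + completed writes.

    Each /proc/diskstats line is: major, minor, device name, reads completed,
    reads merged, sectors read, ms reading, writes completed, ...
    """
    totals = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or truncated lines
        totals[fields[2]] = int(fields[3]) + int(fields[7])
    return totals


def iops(before, after, seconds):
    """Per-device I/O operations per second between two snapshots."""
    return {
        dev: (after[dev] - before[dev]) / seconds
        for dev in after
        if dev in before
    }


def sample(interval=5.0, path="/proc/diskstats"):
    """Take two snapshots `interval` seconds apart and return IOPS per device."""
    with open(path) as f:
        before = parse_diskstats(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_diskstats(f.read())
    return iops(before, after, interval)
```

Calling `print(sample())` in a loop during the rollout and comparing the value for the root volume with the provisioned 3000 IOPS should show whether the volume is saturated.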
Environment: Kubernetes v1.22.9-eks-810597c, kernel 5.4.204-113.362.amzn2.x86_64, containerd://1.4.13