amazon-eks-ami

containerd image pulling performance

Open maksim-paskal opened this issue 2 years ago • 0 comments

We are having trouble running our EKS clusters with the containerd container runtime. When several pods with huge images are created at the same time on the same node, the Kubernetes node becomes unstable and all pods already running on that node freeze.

To reproduce:

  1. You need an image with a huge number of small files. You can build one from the Dockerfile below or use my test image paskalmaksim/stress-test-pulling-big-image:200k
FROM ubuntu:latest

WORKDIR /app

RUN seq -w 1 200000 | xargs -n1 -P8 -I% sh -c 'dd if=/dev/urandom of=file.% bs=$(shuf -i1-5 -n1) count=1024' 2> /dev/null
  2. You need a Kubernetes node. Our setup: m5.xlarge with a 50 GB gp3 volume (3000 IOPS, 125 MiB/s throughput). Drain the node:
kubectl drain <node name> --ignore-daemonsets --delete-emptydir-data
  3. Run pods with the test image on this node:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: stress-test-pulling-big-image
  labels:
    app: stress-test-pulling-big-image
spec:
  strategy:
    type: Recreate
  replicas: 30
  selector:
    matchLabels:
      app: stress-test-pulling-big-image
  template:
    metadata:
      labels:
        app: stress-test-pulling-big-image
    spec:
      nodeSelector:
        kubernetes.io/hostname: <node name>
      tolerations:
      - effect: "NoSchedule"
        operator: "Exists"
      containers:
      - name: stress-test-pulling-big-image
        image: docker.io/paskalmaksim/stress-test-pulling-big-image:200k
        imagePullPolicy: Always
        resources:
          limits:
            cpu: 10m
            memory: 10Mi
        command:
        - sleep
        - 1d
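
The `RUN` step in the Dockerfile from step 1 can be tried locally at a smaller scale before building the full image. The sketch below wraps the same `seq | xargs | dd` pipeline in a helper; the function name `make_small_files` and the temporary directory are assumptions for illustration, not part of the original report. Since `bs` is a random 1 to 5 bytes and `count` is 1024, each file is 1 to 5 KiB, so 200000 files should come to very roughly 600 MB of payload if the sizes are evenly distributed.

```shell
#!/bin/sh
# Hypothetical helper: scaled-down version of the Dockerfile's RUN step.
# Usage: make_small_files <dir> <count>
make_small_files() {
  dir=$1
  count=$2
  mkdir -p "$dir"
  (
    cd "$dir" || exit 1
    # Each file gets 1024 blocks of a random block size between 1 and 5 bytes,
    # so every file ends up between 1 KiB and 5 KiB.
    # -I% already runs one command per input line, so -n1 is not needed here.
    seq -w 1 "$count" \
      | xargs -P8 -I% sh -c 'dd if=/dev/urandom of=file.% bs=$(shuf -i1-5 -n1) count=1024' 2> /dev/null
  )
}

# Small demo run: 50 files in a throwaway directory.
demo_dir=$(mktemp -d)
make_small_files "$demo_dir" 50
find "$demo_dir" -type f | wc -l   # number of files created
du -sh "$demo_dir"                 # total size on disk
rm -rf "$demo_dir"
```

Checking the file count and `du` output on a small run gives a quick idea of how many inodes and how much data the full 200k layer will force containerd to unpack per pull.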

Result: CPU usage and disk IOPS on the node go to 100%.
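
To follow how far the rollout gets while the node is saturated, one option is to count the test pods that are not yet `Running`. This is only a sketch: the helper name `count_not_running` is made up here, and it assumes the default column layout of `kubectl get pods --no-headers` (NAME, READY, STATUS, ...).

```shell
#!/bin/sh
# Hypothetical helper: reads `kubectl get pods --no-headers` output on stdin
# and prints how many pods have a STATUS (third column) other than "Running".
count_not_running() {
  awk '$3 != "Running" { n++ } END { print n + 0 }'
}

# Example use against the deployment from step 3 (needs cluster access):
#   kubectl get pods -l app=stress-test-pulling-big-image --no-headers | count_not_running
```

Running the example command every few seconds shows whether pods stay stuck in states such as `ContainerCreating` while the images are being pulled and unpacked.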

Environment: Kubernetes v1.22.9-eks-810597c, kernel 5.4.204-113.362.amzn2.x86_64, containerd://1.4.13

maksim-paskal, Aug 12 '22 07:08