Helm install on EKS ends in CrashLoopBackOff without clear error message

Open buettner123 opened this issue 3 years ago • 0 comments

Which component are you using?: cluster-autoscaler, installed with the Helm chart

What version of the component are you using?:

  • cluster-autoscaler 1.23.0
  • helm chart version 9.19.3 (also tried with 9.18.1)

Containers:
  aws-cluster-autoscaler:
    Container ID: docker://3c28b997f44070f5a01ff85a0f566f30fd7fcd23c4873bad3b5059be6be44faf
    Image: k8s.gcr.io/autoscaling/cluster-autoscaler:v1.23.0
    Image ID: docker-pullable://k8s.gcr.io/autoscaling/cluster-autoscaler@sha256:f46687231c2c1bfa139f2b18275b123222c8cf6a288bb9c8145932bd14ac3deb

Component version:

What k8s version are you using (kubectl version)?:

kubectl version Output
$ kubectl version
Server Version: version.Info{Major:"1", Minor:"23+", GitVersion:"v1.23.7-eks-4721010", GitCommit:"b77d9473a02fbfa834afa67d677fd12d690b195f", GitTreeState:"clean", BuildDate:"2022-06-27T22:19:07Z", GoVersion:"go1.17.10", Compiler:"gc", Platform:"linux/amd64"}

What environment is this in?: AWS EKS

What did you expect to happen?: I expected cluster-autoscaler to start successfully and discover the autoscaling group for my cluster

What happened instead?: The pod is stuck in CrashLoopBackOff and fails on every startup.

The following log output and goroutine dump appear on each start of the pod:
I0816 12:00:36.501922       1 reflector.go:255] Listing and watching *v1.ReplicaSet from k8s.io/client-go/informers/factory.go:134
I0816 12:00:36.456793       1 reflector.go:219] Starting reflector *v1.Pod (0s) from k8s.io/client-go/informers/factory.go:134
I0816 12:00:36.502308       1 reflector.go:255] Listing and watching *v1.Pod from k8s.io/client-go/informers/factory.go:134
I0816 12:00:36.456936       1 reflector.go:219] Starting reflector *v1beta1.CSIStorageCapacity (0s) from k8s.io/client-go/informers/factory.go:134
I0816 12:00:36.502560       1 reflector.go:255] Listing and watching *v1beta1.CSIStorageCapacity from k8s.io/client-go/informers/factory.go:134
I0816 12:00:36.457096       1 reflector.go:219] Starting reflector *v1.StorageClass (0s) from k8s.io/client-go/informers/factory.go:134
I0816 12:00:36.502958       1 reflector.go:255] Listing and watching *v1.StorageClass from k8s.io/client-go/informers/factory.go:134
I0816 12:00:36.457257       1 reflector.go:219] Starting reflector *v1.Service (0s) from k8s.io/client-go/informers/factory.go:134
I0816 12:00:36.503790       1 reflector.go:255] Listing and watching *v1.Service from k8s.io/client-go/informers/factory.go:134
W0816 12:00:36.513521       1 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
I0816 12:00:36.625402       1 request.go:597] Waited for 123.619964ms due to client-side throttling, not priority and fairness, request: GET:https://172.20.0.1:443/api/v1/persistentvolumeclaims?limit=500&resourceVersion=0
I0816 12:00:36.825444       1 request.go:597] Waited for 322.959031ms due to client-side throttling, not priority and fairness, request: GET:https://172.20.0.1:443/api/v1/nodes?limit=500&resourceVersion=0
I0816 12:00:37.025525       1 request.go:597] Waited for 522.241959ms due to client-side throttling, not priority and fairness, request: GET:https://172.20.0.1:443/api/v1/pods?limit=500&resourceVersion=0
I0816 12:00:37.225218       1 request.go:597] Waited for 721.258ms due to client-side throttling, not priority and fairness, request: GET:https://172.20.0.1:443/api/v1/services?limit=500&resourceVersion=0
goroutine 341 [select]:
net/http.(*persistConn).writeLoop(0xc000b8c5a0)
    /usr/local/go/src/net/http/transport.go:2386 +0xfb
created by net/http.(*Transport).dialConn
    /usr/local/go/src/net/http/transport.go:1748 +0x1e65

goroutine 371 [IO wait]:
internal/poll.runtime_pollWait(0x7f7299fa50a8, 0x72)
    /usr/local/go/src/runtime/netpoll.go:234 +0x89
internal/poll.(*pollDesc).wait(0xc00012da00, 0xc00079f300, 0x0)
    /usr/local/go/src/internal/poll/fd_poll_runtime.go:84 +0x32
internal/poll.(*pollDesc).waitRead(...)
    /usr/local/go/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0xc00012da00, {0xc00079f300, 0x191e, 0x191e})
    /usr/local/go/src/internal/poll/fd_unix.go:167 +0x25a
net.(*netFD).Read(0xc00012da00, {0xc00079f300, 0xc00079f30d, 0xb4})
    /usr/local/go/src/net/fd_posix.go:56 +0x29
net.(*conn).Read(0xc00084db40, {0xc00079f300, 0x1911, 0xc0009517f8})
    /usr/local/go/src/net/net.go:183 +0x45
crypto/tls.(*atLeastReader).Read(0xc001dee540, {0xc00079f300, 0x0, 0x409b6d})
    /usr/local/go/src/crypto/tls/conn.go:777 +0x3d
bytes.(*Buffer).ReadFrom(0xc000db8278, {0x41a8100, 0xc001dee540})
    /usr/local/go/src/bytes/buffer.go:204 +0x98
crypto/tls.(*Conn).readFromUntil(0xc000db8000, {0x41ad760, 0xc00084db40}, 0x191e)
    /usr/local/go/src/crypto/tls/conn.go:799 +0xe5
crypto/tls.(*Conn).readRecordOrCCS(0xc000db8000, 0x0)
    /usr/local/go/src/crypto/tls/conn.go:606 +0x112
crypto/tls.(*Conn).readRecord(...)
    /usr/local/go/src/crypto/tls/conn.go:574
crypto/tls.(*Conn).Read(0xc000db8000, {0xc0001ac000, 0x1000, 0x1})
    /usr/local/go/src/crypto/tls/conn.go:1277 +0x16f
net/http.(*persistConn).Read(0xc0015e6a20, {0xc0001ac000, 0xc000da1200, 0xc000951d30})
    /usr/local/go/src/net/http/transport.go:1926 +0x4e
bufio.(*Reader).fill(0xc000dbad80)
    /usr/local/go/src/bufio/bufio.go:101 +0x103
bufio.(*Reader).Peek(0xc000dbad80, 0x1)
    /usr/local/go/src/bufio/bufio.go:139 +0x5d
net/http.(*persistConn).readLoop(0xc0015e6a20)
    /usr/local/go/src/net/http/transport.go:2087 +0x1ac
created by net/http.(*Transport).dialConn
    /usr/local/go/src/net/http/transport.go:1747 +0x1e05

goroutine 335 [sync.Cond.Wait]:
sync.runtime_notifyListWait(0xc000870300, 0x0)
    /usr/local/go/src/runtime/sema.go:513 +0x13d
sync.(*Cond).Wait(0xc0007da4e0)
    /usr/local/go/src/sync/cond.go:56 +0x8c
golang.org/x/net/http2.(*pipe).Read(0xc0008702e8, {0xc00037c800, 0x200, 0x200})
    /gopath/src/k8s.io/autoscaler/cluster-autoscaler/vendor/golang.org/x/net/http2/pipe.go:65 +0xeb
golang.org/x/net/http2.transportResponseBody.Read({0x40cdde}, {0xc00037c800, 0xd0, 0xc0000ed4b0})
    /gopath/src/k8s.io/autoscaler/cluster-autoscaler/vendor/golang.org/x/net/http2/transport.go:2104 +0x77
encoding/json.(*Decoder).refill(0xc0002a4640)
    /usr/local/go/src/encoding/json/stream.go:165 +0x17f
encoding/json.(*Decoder).readValue(0xc0002a4640)
    /usr/local/go/src/encoding/json/stream.go:140 +0xbb
encoding/json.(*Decoder).Decode(0xc0002a4640, {0x350a880, 0xc000b5ce70})
    /usr/local/go/src/encoding/json/stream.go:63 +0x78
k8s.io/apimachinery/pkg/util/framer.(*jsonFrameReader).Read(0xc0007f16e0, {0xc000686c00, 0x400, 0x400})
    /gopath/src/k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/apimachinery/pkg/util/framer/framer.go:152 +0x19c
k8s.io/apimachinery/pkg/runtime/serializer/streaming.(*decoder).Decode(0xc000dbc8c0, 0x0, {0x420a720, 0xc0008614c0})
    /gopath/src/k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/apimachinery/pkg/runtime/serializer/streaming/streaming.go:77 +0xa7
k8s.io/client-go/rest/watch.(*Decoder).Decode(0xc00087f0e0)
    /gopath/src/k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/client-go/rest/watch/decoder.go:49 +0x4f
k8s.io/apimachinery/pkg/watch.(*StreamWatcher).receive(0xc000861480)
    /gopath/src/k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/apimachinery/pkg/watch/streamwatcher.go:105 +0x11c
created by k8s.io/apimachinery/pkg/watch.NewStreamWatcher
    /gopath/src/k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/apimachinery/pkg/watch/streamwatcher.go:76 +0x135

goroutine 355 [sync.Cond.Wait]:
sync.runtime_notifyListWait(0xc0015c3e80, 0x0)
    /usr/local/go/src/runtime/sema.go:513 +0x13d
sync.(*Cond).Wait(0x10)
    /usr/local/go/src/sync/cond.go:56 +0x8c
golang.org/x/net/http2.(*pipe).Read(0xc0015c3e68, {0xc00086e800, 0x200, 0x200})
    /gopath/src/k8s.io/autoscaler/cluster-autoscaler/vendor/golang.org/x/net/http2/pipe.go:65 +0xeb
golang.org/x/net/http2.transportResponseBody.Read({0x1}, {0xc00086e800, 0x0, 0xc0008decb0})
    /gopath/src/k8s.io/autoscaler/cluster-autoscaler/vendor/golang.org/x/net/http2/transport.go:2104 +0x77
encoding/json.(*Decoder).refill(0xc001591180)
    /usr/local/go/src/encoding/json/stream.go:165 +0x17f
encoding/json.(*Decoder).readValue(0xc001591180)
    /usr/local/go/src/encoding/json/stream.go:140 +0xbb
encoding/json.(*Decoder).Decode(0xc001591180, {0x350a880, 0xc0017defc0})
    /usr/local/go/src/encoding/json/stream.go:63 +0x78
k8s.io/apimachinery/pkg/util/framer.(*jsonFrameReader).Read(0xc0017e8060, {0xc00056a800, 0x400, 0x400})
    /gopath/src/k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/apimachinery/pkg/util/framer/framer.go:152 +0x19c
k8s.io/apimachinery/pkg/runtime/serializer/streaming.(*decoder).Decode(0xc0015b4500, 0x3, {0x420a720, 0xc0017e6c80})
    /gopath/src/k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/apimachinery/pkg/runtime/serializer/streaming/streaming.go:77 +0xa7
k8s.io/client-go/rest/watch.(*Decoder).Decode(0xc0012ed4a0)
    /gopath/src/k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/client-go/rest/watch/decoder.go:49 +0x4f
k8s.io/apimachinery/pkg/watch.(*StreamWatcher).receive(0xc0017e6c40)
    /gopath/src/k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/apimachinery/pkg/watch/streamwatcher.go:105 +0x11c
created by k8s.io/apimachinery/pkg/watch.NewStreamWatcher
    /gopath/src/k8s.io/autoscaler/cluster-autoscaler/vendor/k8s.io/apimachinery/pkg/watch/streamwatcher.go:76 +0x135

goroutine 372 [select]:
net/http.(*persistConn).writeLoop(0xc0015e6a20)
    /usr/local/go/src/net/http/transport.go:2386 +0xfb
created by net/http.(*Transport).dialConn
    /usr/local/go/src/net/http/transport.go:1748 +0x1e65
Stream closed EOF for kube-system/cluster-autoscaler-aws-cluster-autoscaler-d55c89b9f-s9dtd (aws-cluster-autoscaler)
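
The excerpt above contains only info and warning lines followed by goroutine stacks, so the line that actually triggered the exit is not visible in it. A minimal sketch for locating such a line, assuming the full output of the crashed container is still retrievable with `kubectl logs --previous`: the helper `find_crash_cause` is a hypothetical name, and the pattern relies only on the klog severity prefixes (`E`, `F`) and the standard Go `panic:` / `fatal error:` headers. It may find nothing if the process was killed from outside (for example by the OOM killer).

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads log text on stdin and prints only klog
# error/fatal lines (E/F followed by the mmdd date) and Go crash headers.
find_crash_cause() {
  grep -E '^[EF][0-9]{4} |^panic: |^fatal error: '
}

# Usage (not run here; pod name taken from the log above):
#   kubectl -n kube-system logs \
#     cluster-autoscaler-aws-cluster-autoscaler-d55c89b9f-s9dtd \
#     --previous | find_crash_cause
```

Filtering on the previous container instance matters because the current instance may not have reached the failing step yet when the logs are read.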

How to reproduce it (as minimally and precisely as possible):

  • set up an EKS cluster with the latest Kubernetes version
  • create the policy, role and trust relationship as described in the documentation: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/aws/README.md
  • install the cluster autoscaler using Helm:
helm install cluster-autoscaler autoscaler/cluster-autoscaler \
  -n kube-system \
  --version v9.19.3 \
  --set autoDiscovery.clusterName=my-cluster \
  --set awsRegion=eu-central-1 \
  --set rbac.serviceAccount.annotations."eks\.amazonaws\.com/role-arn"=arn:aws:iam::xxxxxxx:role/ClusterAutoscalerIAMRole \
  --set rbac.serviceAccount.name=aws-cluster-autoscaler
  • wait for the pod to start

Anything else we need to know?: I'm using Terraform to set up the roles and policies. The created roles/policies look as follows:

ClusterAutoscalerIAMPolicy:
{
    "Statement": [
        {
            "Action": [
                "autoscaling:DescribeAutoScalingGroups",
                "autoscaling:DescribeAutoScalingInstances",
                "autoscaling:DescribeLaunchConfigurations",
                "autoscaling:DescribeScalingActivities",
                "ec2:DescribeInstanceTypes",
                "ec2:DescribeLaunchTemplateVersions"
            ],
            "Effect": "Allow",
            "Resource": [
                "*"
            ]
        },
        {
            "Action": [
                "autoscaling:SetDesiredCapacity",
                "autoscaling:TerminateInstanceInAutoScalingGroup",
                "ec2:DescribeImages",
                "ec2:DescribeInstanceTypes",
                "ec2:GetInstanceTypesFromInstanceRequirements",
                "eks:DescribeNodegroup"
            ],
            "Effect": "Allow",
            "Resource": [
                "*"
            ]
        }
    ],
    "Version": "2012-10-17"
}

ClusterAutoscalerIAMRole trust relationship (with the policy ClusterAutoscalerIAMPolicy attached to it):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "",
            "Effect": "Allow",
            "Principal": {
                "Federated": "arn:aws:iam::xxxxxxxxx:oidc-provider/oidc.eks.eu-central-1.amazonaws.com/id/xxxxxxxxx"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    "oidc.eks.eu-central-1.amazonaws.com/id/xxxxxxxxx:sub": "system:serviceaccount:kube-system:aws-cluster-autoscaler"
                }
            }
        }
    ]
}

The service account created by the helm install has the correct role annotated to it:

Name:                aws-cluster-autoscaler
Namespace:           kube-system
Labels:              app.kubernetes.io/instance=cluster-autoscaler
                     app.kubernetes.io/managed-by=Helm
                     app.kubernetes.io/name=aws-cluster-autoscaler
                     helm.sh/chart=cluster-autoscaler-9.19.3
Annotations:         eks.amazonaws.com/role-arn: arn:aws:iam::xxxxxxx:role/ClusterAutoscalerIAMRole
                     meta.helm.sh/release-name: cluster-autoscaler
                     meta.helm.sh/release-namespace: kube-system
Image pull secrets:  
Mountable secrets:   aws-cluster-autoscaler-token-bhnhv
Tokens:              aws-cluster-autoscaler-token-bhnhv
Events:              
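
The annotation on the service account is only half of the IAM-roles-for-service-accounts setup; the EKS pod identity webhook also has to inject `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE` into the pod at creation time. A minimal sketch for checking that, assuming the pod spec is available as YAML or JSON text: `has_irsa_env` is a hypothetical helper name, and it only looks for the two variable names, not their values.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads a pod spec (YAML or JSON) on stdin and
# succeeds only if both IRSA environment variable names appear in it.
has_irsa_env() {
  local spec
  spec="$(cat)"
  printf '%s\n' "$spec" | grep -q 'AWS_ROLE_ARN' &&
    printf '%s\n' "$spec" | grep -q 'AWS_WEB_IDENTITY_TOKEN_FILE'
}

# Usage (not run here; pod name taken from the log above):
#   kubectl -n kube-system get pod \
#     cluster-autoscaler-aws-cluster-autoscaler-d55c89b9f-s9dtd -o yaml \
#     | has_irsa_env && echo "IRSA env present" || echo "IRSA env missing"
```

If the variables are missing, the pod was probably created before the annotation existed or under a different service account, and it would fall back to the node's instance role.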

If I am missing some required configuration or you need any more details, please let me know.

buettner123, Aug 16 '22 12:08