Cluster status error since 0.33.9 with EKS cluster
Expected Behavior
Tilt should be able to connect to the cluster on tilt up etc.
Current Behavior
Tilt is unable to connect to the cluster directly. Tilt still manages local_resources, and our Tiltfile runs some kubectl commands manually via local or local_resource, but the managed k8s resources behind a helm_resource do not work. In addition, after Tiltfile processing finishes, the (Tiltfile) resource reports a failure.
Successfully loaded Tiltfile (1m14.658932792s)
Cluster status error: Tilt encountered an error connecting to your Kubernetes cluster:
Get "https://<redacted>.gr7.us-east-1.eks.amazonaws.com/version?timeout=32s": context deadline exceeded
You will need to restart Tilt after resolving the issue.
We tested 0.33.8 and it works without this issue. I also tested 0.33.15, and the issue, present since 0.33.9, persists.
Steps to Reproduce
- Configure an eks cluster and authenticate against it
- Run `tilt up`
- Wait for resources to load; Tilt then cannot connect to the cluster, even though `kubectl` commands run from inside `local` or `local_resource` resources work
Context
tilt doctor Output
$ tilt doctor
Tilt: v0.33.15, built 2024-05-31
System: darwin-arm64
---
Docker
- Host: unix:///Users/<me>/.docker/run/docker.sock
- Server Version: 26.1.1
- API Version: 1.45
- Builder: 2
- Compose Version: v2.27.0-desktop.2
---
Kubernetes
- Env: eks
- Context: kubernetes-eks-dev
- Cluster Name: arn:aws:eks:us-east-1:<redacted-eks-arn-id>:cluster/kubernetes-eks-dev
- Namespace: default
- Container Runtime: containerd
- Version: v1.27.13-eks-3af4770
- Cluster Local Registry: none
---
Thanks for seeing the Tilt Doctor!
Please send the info above when filing bug reports. 💗
The info below helps us understand how you're using Tilt so we can improve,
but is not required to ask for help.
---
Analytics Settings
--> (These results reflect your personal opt in/out status and may be overridden by an `analytics_settings` call in your Tiltfile)
- User Mode: opt-in
- Machine: b8542883618c2effbdb7c7ceed78623b
- Repo: dqZ55OF3HaxcqT2x/Y9LwQ==
# relevant .kube/config
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: <redacted>
    server: https://<redacted>.gr7.us-east-1.eks.amazonaws.com
  name: arn:aws:eks:us-east-1:<redacted-eks-arn-id>:cluster/kubernetes-eks-dev
contexts:
- context:
    cluster: arn:aws:eks:us-east-1:<redacted-eks-arn-id>:cluster/kubernetes-eks-dev
    user: arn:aws:eks:us-east-1:<redacted-eks-arn-id>:cluster/kubernetes-eks-dev
  name: kubernetes-eks-dev
current-context: kubernetes-eks-dev
kind: Config
preferences: {}
users:
- name: arn:aws:eks:us-east-1:<redacted-eks-arn-id>:cluster/kubernetes-eks-dev
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      args:
      - --region
      - us-east-1
      - eks
      - get-token
      - --cluster-name
      - kubernetes-eks-dev
      - --output
      - json
      command: aws
      env:
      - name: AWS_PROFILE
        value: <my-profile-name>
About Your Use Case
This has been happening since 0.33.9 and I forgot to report it right away. This still happens on 0.33.15. For now we've actually added a check in our Tiltfile to force people on to <=0.33.8, until this can be resolved. Maybe it's specific to Amazon EKS's authentication, but I'm not sure.
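The Tiltfile check mentioned above is not shown in the report. As a rough illustration of the comparison such a check has to make, here is a minimal Python sketch that parses the string printed by `tilt version` and compares it with 0.33.8. The names `MAX_TILT_VERSION`, `parse_tilt_version`, and `is_allowed` are hypothetical, and this is not the reporter's actual code.

```python
import re

# Last release the reporter found working; an assumption taken from this issue.
MAX_TILT_VERSION = (0, 33, 8)


def parse_tilt_version(text):
    """Extract (major, minor, patch) from text such as 'v0.33.22, built 2025-01-03'."""
    match = re.search(r"v?(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no version found in: %r" % text)
    return tuple(int(part) for part in match.groups())


def is_allowed(version_text, maximum=MAX_TILT_VERSION):
    """Return True when the reported version is at or below the pinned maximum."""
    return parse_tilt_version(version_text) <= maximum
```

Comparing integer tuples matters here: as plain strings, "0.33.15" sorts before "0.33.8", so a string comparison would wrongly accept 0.33.15. The Tiltfile API also documents a `version_settings(constraint=...)` call that may express the same pin more simply; it is worth checking against the docs for the Tilt version in use.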
Hmmm...I tried this with my own EKS cluster, and was not able to repro.
I went through all the changes between 0.33.8 and 0.33.9 and didn't see any changes that would affect how tilt computes cluster status.
can you post the output of:
kubectl get -v=6 --raw /version
?
Sure thing
$ kubectl get -v=6 --raw /version
I0607 11:02:35.245274 24199 loader.go:374] Config loaded from file: /Users/briankleszyk/.kube/config
I0607 11:02:35.953061 24199 round_trippers.go:553] GET https://<redacted>.gr7.us-east-1.eks.amazonaws.com/version 200 OK in 706 milliseconds
{
  "major": "1",
  "minor": "27+",
  "gitVersion": "v1.27.13-eks-3af4770",
  "gitCommit": "4873544ec1ec7d3713084677caa6cf51f3b1ca6f",
  "gitTreeState": "clean",
  "buildDate": "2024-04-30T03:31:44Z",
  "goVersion": "go1.21.9",
  "compiler": "gc",
  "platform": "linux/amd64"
}
🤷 don't know if relevant or not but I (and most of our engineers) are using arm64, with an amd64 cluster.
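The `kubectl` run above completed in 706 milliseconds, well under the 32s timeout in Tilt's error message. One way to narrow this down might be to time the credential plugin and the `/version` request separately, from the same shell that launches Tilt. The sketch below is a diagnostic aid under that assumption, not a diagnosis; `time_command` and `report` are hypothetical helper names.

```python
import subprocess
import time

# Taken from "?timeout=32s" in the Tilt error message.
TILT_TIMEOUT_SECONDS = 32


def time_command(argv, timeout=TILT_TIMEOUT_SECONDS):
    """Run argv and return (exit_code, elapsed_seconds); exit_code is None on timeout."""
    started = time.monotonic()
    try:
        completed = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
        exit_code = completed.returncode
    except subprocess.TimeoutExpired:
        # The child is killed by subprocess.run before this exception propagates.
        exit_code = None
    return exit_code, time.monotonic() - started


def report(label, argv):
    """Print one line comparing the elapsed time with the 32s deadline."""
    exit_code, elapsed = time_command(argv)
    print("%s: exit=%s elapsed=%.2fs (deadline %ds)"
          % (label, exit_code, elapsed, TILT_TIMEOUT_SECONDS))
    return exit_code, elapsed
```

For example, `report("version", ["kubectl", "get", "--raw", "/version"])` times the same request shown above, and `report("token", ["aws", "--region", "us-east-1", "eks", "get-token", "--cluster-name", "kubernetes-eks-dev", "--output", "json"])` times the credential plugin from the kubeconfig on its own. The kubeconfig also sets `AWS_PROFILE`, so that variable would need to be exported first for the second call to match.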
I am also experiencing this issue with my eks cluster running k8s 1.32 and tilt 0.33.22
k8s:
$ kubectl get -v=6 --raw /version
{
  "major": "1",
  "minor": "32",
  "gitVersion": "v1.32.2-eks-bc803b4",
  "gitCommit": "ba544f1e7adc98f5a0a09cd98bf2c091572a701c",
  "gitTreeState": "clean",
  "buildDate": "2025-02-17T20:41:12Z",
  "goVersion": "go1.23.6",
  "compiler": "gc",
  "platform": "linux/amd64"
}
tilt:
$ tilt version
v0.33.22, built 2025-01-03
credentials helper is the same:
- name: arn:aws:eks:us-east-1:<acct>:cluster/<name>
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      args:
      - --region
      - us-east-1
      - eks
      - get-token
      - --cluster-name
      - <name>
      - --output
      - json
      command: aws
      env: null
      interactiveMode: IfAvailable
      provideClusterInfo: false
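For anyone comparing configs, an `exec` entry like the ones in this thread tells the Kubernetes client to run `command` with `args`, adding any `env` entries to the environment. The sketch below shows that mapping so the credential command can be run by hand. It assumes the kubeconfig has already been parsed into a Python dict (YAML parsing is left out because it needs a third-party library), and `exec_command_line` and `as_shell` are hypothetical helper names.

```python
import shlex


def exec_command_line(user_entry):
    """Build (argv, env_overrides) from one parsed kubeconfig `users[]` entry."""
    exec_cfg = user_entry["user"]["exec"]
    argv = [exec_cfg["command"]] + list(exec_cfg.get("args") or [])
    # `env` may be absent or null, as in the second config in this thread.
    env_overrides = {
        item["name"]: item["value"] for item in (exec_cfg.get("env") or [])
    }
    return argv, env_overrides


def as_shell(argv, env_overrides):
    """Render the command as a single shell line for copy and paste."""
    prefix = ["%s=%s" % (name, shlex.quote(value))
              for name, value in env_overrides.items()]
    return " ".join(prefix + [shlex.quote(arg) for arg in argv])
```

With the first config in this issue, the rendered line starts with the `AWS_PROFILE` assignment followed by `aws --region us-east-1 eks get-token --cluster-name kubernetes-eks-dev --output json`. With `env: null`, there is no prefix and the command inherits whatever profile the calling shell has set.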