unknown error connecting to the kubernetes cluster
Raboo in the Slack channel reports this error message:
Tilt started on http://localhost:10350/
v0.30.7, built 2022-08-12
Tilt analytics disabled: Environment variable CI=true
Error fetching nodes: userextras.authentication.k8s.io "local://u-tkf5z" is forbidden: User "system:serviceaccount:cattle-impersonation-system:cattle-impersonation-u-tkf5z" cannot impersonate resource "userextras/principalid" in API group "authentication.k8s.io" at the cluster scope
Tilt could not read your node configuration
Ask your Kubernetes admin for access to run `kubectl get nodes`.
Detail: userextras.authentication.k8s.io "local://u-tkf5z" is forbidden: User "system:serviceaccount:cattle-impersonation-system:cattle-impersonation-u-tkf5z" cannot impersonate resource "userextras/principalid" in API group "authentication.k8s.io" at the cluster scope
Serving embedded Tilt production web assets
Problem processing change. Subscriber: engine/configs.TriggerQueueSubscriber. Backing off 1s. Error: the cache is not started, can not read objects
Initial Build
Loading Tiltfile at: /home/runner/work/httpbin-k8s/httpbin-k8s/Tiltfile
Successfully loaded Tiltfile (14.263048ms)
Problem processing change. Subscriber: engine/uiresource.Subscriber. Backing off 1s. Error: Operation cannot be fulfilled on uiresources.tilt.dev "uncategorized": the object has been modified; please apply your changes to the latest version and try again
ERROR: Cluster status error: Tilt encountered an error connecting to your Kubernetes cluster:
unknown
You will need to restart Tilt after resolving the issue.
Problem processing change. Subscriber: engine/uiresource.Subscriber. Backing off 1s. Error: Operation cannot be fulfilled on uiresources.tilt.dev "uncategorized": the object has been modified; please apply your changes to the latest version and try again
The part that jumps out at me is:
Tilt encountered an error connecting to your Kubernetes cluster:
unknown
Tilt should do a better job explaining what failed and how you can replicate the connection error.
I included the full error output for completeness, but I think the "kubectl get nodes" error message is a red herring and unrelated to the real issue.
When setting up the K8s client, there's a call to CheckConnected(). If this fails, the error is propagated through to the cluster status.
https://github.com/tilt-dev/tilt/blob/ab3e996814eb1568f70727dfee73597dd493f2bc/internal/controllers/core/cluster/client.go#L62
https://github.com/tilt-dev/tilt/blob/ab3e996814eb1568f70727dfee73597dd493f2bc/internal/controllers/core/cluster/reconciler.go#L153-L161
(Note that clusterRefreshEnabled will always be false - not all parts of Tilt use the Cluster reconciler K8s client instance, so that flag is currently for development while working on migration of code but not usable otherwise.)
AFAICT we don't do anything super weird in CheckConnected:
https://github.com/tilt-dev/tilt/blob/ab3e996814eb1568f70727dfee73597dd493f2bc/internal/k8s/client.go#L320-L344
So I think that's probably coming back from the call to /version - I'm guessing the auth issue is not getting handled particularly nicely by the low-level K8s client code and resulting in that unknown?
I can add that everything worked before, so this started happening in a recent version. I'm not sure which version introduced it.
I can also say that it's probably related to limited permissions. The error happens in a GitHub Actions workflow where the Kubernetes user only has the permissions needed to apply the resources to k8s. For instance, it's not allowed to list all nodes in the cluster, since I see no need for that in a CI user.
I don't have the same problem running Tilt with a full privilege user.
Hmm, I downgraded Tilt to v0.29.0 and v0.28.1 and still got the same problem. It's still very annoying that it just says "unknown" error. I did upgrade k8s in June or July from 1.19 or 1.20 (I don't remember which) to v1.22.9.
Are you running a cluster locally using minikube? If so, make sure to start your cluster with this command: minikube start
Hmm, this was a long time ago and I no longer work for the same company. But the problem with the login was that a session token limit in Rancher (or something like that) for our CI user had been reached. I changed the settings in Rancher to reuse tokens, or whatever the limit applied to, instead of generating new ones.
However, a big issue was that I didn't get a proper error in Tilt, which made the problem hard to understand. I don't work there anymore, I can't replicate it, and this issue is years old, so please feel free to close it.