Regularly see E0164 - timeout or abort while handling: method GET URI /apis/projectcalico.org/v3
Hi, we regularly see the errors below on our kube-apiservers, which suggest something is not right with the aggregated v3 APIs. However, we don't see any negative impact: Calico is working fine, and the apiserver pods in the calico-apiserver namespace all seem fine.
We removed the default APIServer custom resource and re-applied it; no difference.
Should we be concerned?
Expected Behavior
No errors in logs - errors in kube-apiserver logs attract our attention quickly ;-)
Current Behavior
wrap.go:53] timeout or abort while handling: method=GET URI="/apis/projectcalico.org/v3/profiles?allowWatchBookmarks=true&resourceVersion=73417144%2F73417144&watch=true" audit-ID="c17b2ff6-c3a8-4698-a9b5-58b9dc6bf56a"
wrap.go:53] timeout or abort while handling: method=GET URI="/apis/projectcalico.org/v3/globalnetworksets?allowWatchBookmarks=true&resourceVersion=73417144&watch=true" audit-ID="e853832c-3508-48a7-b5a8-f3e787df9c94"
wrap.go:53] timeout or abort while handling: method=GET URI="/apis/projectcalico.org/v3/hostendpoints?allowWatchBookmarks=true&resourceVersion=73417144&watch=true" audit-ID="31ef1042-a457-4510-997a-26cd306e0551"
wrap.go:53] timeout or abort while handling: method=GET URI="/apis/projectcalico.org/v3/bgppeers?allowWatchBookmarks=true&resourceVersion=73417144&watch=true" audit-ID="a4bd1b53-156b-4f62-b8d7-8413283d9a2b"
wrap.go:53] timeout or abort while handling: method=GET URI="/apis/projectcalico.org/v3/kubecontrollersconfigurations?allowWatchBookmarks=true&resourceVersion=73417144&watch=true" audit-ID="540af559-1cf8-47ed-9a03-6165a016901b"
wrap.go:53] timeout or abort while handling: method=GET URI="/apis/projectcalico.org/v3/ipamconfigurations?allowWatchBookmarks=true&resourceVersion=73417144&watch=true" audit-ID="d0ce56d5-209b-4241-8442-787ce6903d50"
wrap.go:53] timeout or abort while handling: method=GET URI="/apis/projectcalico.org/v3/clusterinformations?allowWatchBookmarks=true&resourceVersion=73417144&watch=true" audit-ID="f67a7af1-d9cf-4110-9c56-80ee4a87c545"
wrap.go:53] timeout or abort while handling: method=GET URI="/apis/projectcalico.org/v3/felixconfigurations?allowWatchBookmarks=true&resourceVersion=73417144&watch=true" audit-ID="bf03ef4f-73f4-4ae0-aeb4-30f8c1dacad4"
wrap.go:53] timeout or abort while handling: method=GET URI="/apis/projectcalico.org/v3/networksets?allowWatchBookmarks=true&resourceVersion=73417144&watch=true" audit-ID="5a855bef-247b-4214-8909-4eedf83d5ec1"
wrap.go:53] timeout or abort while handling: method=GET URI="/apis/projectcalico.org/v3/globalnetworkpolicies?allowWatchBookmarks=true&resourceVersion=73417144&watch=true" audit-ID="fd9c9329-8a19-4bce-a1b3-5863aefeb72e"
wrap.go:53] timeout or abort while handling: method=GET URI="/apis/projectcalico.org/v3/ippools?allowWatchBookmarks=true&resourceVersion=73417144&watch=true" audit-ID="f1972004-bc13-43c0-8f97-701fae5f1f91"
wrap.go:53] timeout or abort while handling: method=GET URI="/apis/projectcalico.org/v3/caliconodestatuses?allowWatchBookmarks=true&resourceVersion=73417144&watch=true" audit-ID="aa809b3f-23bd-41f9-9713-43a90270c736"
wrap.go:53] timeout or abort while handling: method=GET URI="/apis/projectcalico.org/v3/blockaffinities?allowWatchBookmarks=true&resourceVersion=73417144&watch=true" audit-ID="34ef13d1-658b-4141-bea3-4c0c5ab09a69"
wrap.go:53] timeout or abort while handling: method=GET URI="/apis/projectcalico.org/v3/networkpolicies?allowWatchBookmarks=true&resourceVersion=73417144&watch=true" audit-ID="ce6fe12c-039c-4005-a0f3-f6e0d46d7369"
wrap.go:53] timeout or abort while handling: method=GET URI="/apis/projectcalico.org/v3/bgpconfigurations?allowWatchBookmarks=true&resourceVersion=73417144&watch=true" audit-ID="9857ed1e-e75f-4c4b-87a1-c06210a0eea1"
wrap.go:53] timeout or abort while handling: method=GET URI="/apis/projectcalico.org/v3/ipreservations?allowWatchBookmarks=true&resourceVersion=73417144&watch=true" audit-ID="43db0f4e-0d78-42fe-b2ef-4442fed68325"
Steps to Reproduce (for bugs)
- Fresh K8s 1.25 Cluster
- Install tigera-operator
- Add default Installation
- Add default APIServer
apiVersion: operator.tigera.io/v1
kind: APIServer
metadata:
  name: default
spec: {}
Your Environment
- Calico version 3.25.1, but we have seen this on earlier versions on other clusters.
- Orchestrator version (e.g. kubernetes, mesos, rkt): kubernetes 1.25
This might not necessarily be a problem - I believe watch=true requests have an inherent timeout after which they are terminated due to inactivity.
Does this happen roughly once per minute for each of those URIs? I believe that's the default timeout.
Hi @caseydavenport - every 10 minutes without fail.
It would be good to make those logs go away - I will see what can be done about that... but I think it's safe to say these are benign logs and essentially working as expected, since long-lived "watch" requests against the API server that receive no traffic will eventually be terminated and restarted by design.
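For anyone who wants to see that behaviour directly, here's a minimal sketch (TOKEN and APISERVER are placeholders for your own credentials and endpoint; timeoutSeconds is only set to make the cut-off quick to observe):
# Open a watch against the aggregated Calico API with an explicit timeout;
# the server closes the stream once it expires and the client is expected
# to re-establish the watch.
curl -sk -H "Authorization: Bearer $TOKEN" \
  "https://$APISERVER:6443/apis/projectcalico.org/v3/ippools?watch=true&timeoutSeconds=30"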
Thanks Casey - that was our assumption as well and glad to have it confirmed. So not an actual error really 😉
We are seeing these as well in our kube-apiserver logs. Is there a way to disable them? They cause a massive amount of spam over time, and if they are benign, why are they logged as WARN or even ERROR sometimes?
Is there any good solution?
Hello there,
I'm actually hitting this issue; is there any workaround or roadmap for it? My environment: Calico 3.28.0, Kubernetes 1.29.6
Has anything changed on this?
I believe @ti-afra is working on better logging and log level configuration for the API server, which may solve this.
I have the same issue :( every 10 minutes
Same issue
E0217 07:27:48.840253 1 wrap.go:53] "Timeout or abort while handling" logger="UnhandledError" method="GET" URI="/apis/projectcalico.org/v3/tiers?allowWatchBookmarks=true&resourceVersion=40565403&watch=true" auditID="1ba02dd4-d0ca-4662-b2ea-3fe7f2c877e2"
E0217 07:27:48.840310 1 wrap.go:53] "Timeout or abort while handling" logger="UnhandledError" method="GET" URI="/apis/projectcalico.org/v3/ipamconfigurations?allowWatchBookmarks=true&resourceVersion=40565402&watch=true" auditID="fcf771b1-d5bc-4cd9-b158-f861dd1ac969"
E0217 07:27:48.842002 1 wrap.go:53] "Timeout or abort while handling" logger="UnhandledError" method="GET" URI="/apis/projectcalico.org/v3/blockaffinities?allowWatchBookmarks=true&resourceVersion=40565404&watch=true" auditID="b906cc08-c7b2-41a9-8a18-8b1e197729b6"
Filed an upstream bug; finding it very hard to track down the relevant code in k8s in order to figure out if we're doing something wrong here: https://github.com/kubernetes/kubernetes/issues/130410
I spun up a cluster with Calico API server to try to reproduce this
$ kubectl version
Client Version: v1.32.3
Kustomize Version: v5.5.0
Server Version: v1.31.7
calicoVersion: v3.29.3
After spinning up the cluster and leaving it sitting idle, I do not see that error. However, if I:
- Watch a resource with kubectl get profiles.projectcalico.org -w
- Ctrl-C that watch.
Then I get the error immediately.
E0407 10:32:19.006721 1 wrap.go:53] "Timeout or abort while handling" logger="UnhandledError" method="GET" URI="/apis/projectcalico.org/v3/profiles?resourceVersion=6226%2F6226&watch=true" auditID="d91ea927-9dd9-44b8-a5fd-c2f9f4fa6beb"
I did not see the error after leaving kubectl watching the resource for >10min. I only see the error when I terminate the watch request from the client side.
@mikementzmaersk @Anton-Kartashev @tgajdo @davhdavh @urcus do you have anything in your cluster that might be watching calico resources with a 10-minute timeout? Calico itself generally watches our internal CRD resources so it'd likely be something external. For example, perhaps you're running Argo or some other component that monitors everything in the cluster.
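If API audit logging happens to be enabled, one way to see who opened a given watch is to match the auditID from an error line against the audit log (a sketch; the log path and auditID below are placeholders, and JSON audit output is assumed):
# Look up the audit event for an auditID taken from one of the error lines;
# the event records the username and user agent that made the request.
jq 'select(.auditID == "<auditID-from-error-line>")
    | {user: .user.username, userAgent: .userAgent, uri: .requestURI}' \
  /var/log/kubernetes/kube-apiserver-audit.log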
Also, what type of cluster is this: EKS? AKS? RKE? MKE? I think the problem is triggered by something outside Calico and k8s; I'd like to confirm that.
If you can crank the kube-apiserver's debug level up to 7 then I think it'd log who was making the requests. That can be done with the -v=7 flag when running the API server. Note: this is performance impacting so don't do it in prod.
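For a kubeadm-style control plane (an assumption about your setup), that flag can be added via the static pod manifest; a rough sketch:
# /etc/kubernetes/manifests/kube-apiserver.yaml (kubeadm static pod manifest);
# the kubelet restarts the apiserver automatically when this file changes.
spec:
  containers:
  - command:
    - kube-apiserver
    - --v=7            # temporary - remove when done (performance impacting)
    # ...existing flags unchanged...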
Also, what type of cluster is this EKS? AKS? RKE? MKE?
Self-hosted, kubeadm-based. Version 1.32.3. I can't access the Calico API via curl from the master node:
curl -k -H "Authorization: Bearer [redacted]" "https://[redacted]:6443/apis/projectcalico.org/v3/kubecontrollersconfigurations?allowWatchBookmarks=true&resourceVersion=62769906&watch=true"
I got a timeout.
curl https://[redacted]:6443/ works fine and shows the Calico APIs in the list:
"/apis/projectcalico.org", "/apis/projectcalico.org/v3",
Yep, watch is a long-running operation so curl will time out. Are you running Argo or some other CI/CD app in your cluster?
I'm looking for an app that is watching "everything" and causing these timeouts.
Yes, we have Argo CD.
Hmm, I omitted &watch=true and the connection was successful and produced some output. So this is not a timeout; the k8s API is just waiting for a new event. Btw, the calico-apiserver SA can't list that endpoint:
"message": "kubecontrollersconfigurations.projectcalico.org is forbidden: User "system:serviceaccount:calico-apiserver:calico-apiserver" cannot list resource "kubecontrollersconfigurations" in API group "projectcalico.org" at the cluster scope",
Self-hosted cluster. Uses argocd.
I've submitted a patch upstream to fix the issue: https://github.com/kubernetes/kubernetes/pull/131245
It comes down to the proxying logic in the API server raising a particular error that is normally only raised for timed-out non-watch requests. That gets logged because, for those requests, it indicates an overload. However, for a watch request it typically just means "client closed connection".
@Anton-Kartashev @davhdavh is there a way that you can confirm that Argo is causing the timeouts (perhaps disable Argo for 60s to see if the errors stop)? I think there are two bugs here:
- Upstream bug in k8s API server making the spammy log.
- Bug in whatever is timing out after 10s (perhaps Argo). Watches should last a lot longer than that; 5 minutes +100% jitter is the default. If the request times out server-side (instead of the client closing the connection early) then the error doesn't occur.
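One low-effort way to confirm whether Argo is the client behind these (a sketch, assuming a standard Argo CD install in the argocd namespace; the resource names are assumptions, so adjust for your install):
# Stop Argo CD's watchers for a minute, check whether the kube-apiserver
# errors stop, then scale everything back up.
kubectl -n argocd scale statefulset argocd-application-controller --replicas=0
kubectl -n argocd scale deployment argocd-server argocd-repo-server --replicas=0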
@fasaxc hello, I can confirm this. After disabling the argocd pods I don't see any errors in the kube-apiserver pod. I'll investigate this.
Thanks for the update @Anton-Kartashev; that suggests the version of Argo you're using has a bug where it times out watch requests every 10s on the client side. For normal (non-aggregated API) resources the API server doesn't log anything when that happens, but for aggregated resources you get this error.
Just letting you know, we are facing the same log spam here.
Filtering our logs for ipamconfigurations, we get about 75 entries per hour:
E0930 18:07:50.541118 1 wrap.go:53] "Timeout or abort while handling" logger="UnhandledError" method="GET" URI="/apis/projectcalico.org/v3/ipamconfigurations?allowWatchBookmarks=true&resourceVersion=43043378&watch=true" auditID="c90b8684-fd1f-4086-88b6-350942100342"
Filtering for projectcalico.org (all CRDs), we get about 1k hits per hour.
The cluster generating this has about 200 nodes.
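For anyone wanting to gather the same kind of count, something along these lines works (a sketch; the pod name is a placeholder for your control-plane node):
# Count matching error lines from the last hour of kube-apiserver logs.
kubectl -n kube-system logs kube-apiserver-<control-plane-node> --since=1h \
  | grep -c 'projectcalico.org/v3/ipamconfigurations'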