feat: Drop unnecessary listing for the sake of watch reinitialization
This change addresses the cluster cache performance issue described in https://github.com/argoproj/argo-cd/issues/18838.
kube-apiserver logs for the Pods resource (the seemingly very low latency logged for the WATCH requests is due to a bug in Kubernetes: https://github.com/kubernetes/kubernetes/issues/125614):
INFO 2024-07-23T05:26:40.330372Z "HTTP" verb="LIST" URI="/api/v1/pods?limit=500&resourceVersion=0" latency="95.186516ms" userAgent="argocd-application-controller/v0.0.0 (linux/amd64) kubernetes/$Format" audit-ID="976ba7bb-40cc-4c2b-9739-e15c5e78415f" srcIP="10.64.11.10:35741" apf_pl="workload-low" apf_fs="service-accounts" apf_iseats=1 apf_fseats=0 apf_additionalLatency="0s" apf_execution_time="94.722114ms" resp=200
INFO 2024-07-23T05:26:40.516139Z "HTTP" verb="WATCH" URI="/api/v1/pods?allowWatchBookmarks=true&resourceVersion=10636&timeoutSeconds=600&watch=true" latency="1.212642ms" userAgent="argocd-application-controller/v0.0.0 (linux/amd64) kubernetes/$Format" audit-ID="26593f98-914a-4711-8eea-db9f284a8520" srcIP="10.64.11.10:35741" apf_pl="workload-low" apf_fs="service-accounts" apf_iseats=1 apf_fseats=0 apf_additionalLatency="0s" apf_init_latency="529.67µs" apf_execution_time="533.174µs" resp=0
INFO 2024-07-23T05:36:40.518449Z "HTTP" verb="WATCH" URI="/api/v1/pods?allowWatchBookmarks=true&resourceVersion=17070&timeoutSeconds=600&watch=true" latency="1.104058ms" userAgent="argocd-application-controller/v0.0.0 (linux/amd64) kubernetes/$Format" audit-ID="6e495250-eb2b-4af4-bb09-d999197d7e73" srcIP="10.64.11.10:35741" apf_pl="workload-low" apf_fs="service-accounts" apf_iseats=1 apf_fseats=0 apf_additionalLatency="0s" apf_init_latency="531.187µs" apf_execution_time="532.709µs" resp=0
INFO 2024-07-23T05:46:40.522146Z "HTTP" verb="WATCH" URI="/api/v1/pods?allowWatchBookmarks=true&resourceVersion=23542&timeoutSeconds=600&watch=true" latency="988.866µs" userAgent="argocd-application-controller/v0.0.0 (linux/amd64) kubernetes/$Format" audit-ID="05329b6e-8bea-44c0-b22a-42b7ce65b796" srcIP="10.64.11.10:35741" apf_pl="workload-low" apf_fs="service-accounts" apf_iseats=1 apf_fseats=0 apf_additionalLatency="0s" apf_init_latency="499.414µs" apf_execution_time="500.85µs" resp=0
INFO 2024-07-23T05:56:40.524950Z "HTTP" verb="WATCH" URI="/api/v1/pods?allowWatchBookmarks=true&resourceVersion=29971&timeoutSeconds=600&watch=true" latency="995.693µs" userAgent="argocd-application-controller/v0.0.0 (linux/amd64) kubernetes/$Format" audit-ID="fe66a06c-1329-4739-90f0-f0cdd4dd821d" srcIP="10.64.11.10:35741" apf_pl="workload-low" apf_fs="service-accounts" apf_iseats=1 apf_fseats=0 apf_additionalLatency="0s" apf_init_latency="455.428µs" apf_execution_time="456.954µs" resp=0
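The request pattern in these logs (one LIST, then a WATCH re-established every 600 seconds from the last observed resourceVersion) can be illustrated with a minimal Python sketch. This is not the gitops-engine implementation: `FakeAPIServer`, `Gone`, and `run_cache` are hypothetical names, and the fake server only simulates watch timeouts and 410 Gone responses. The sketch assumes that a relist is needed only when the server reports the requested resourceVersion as too old.

```python
from dataclasses import dataclass, field


class Gone(Exception):
    """Stands in for HTTP 410 Gone: the requested resourceVersion is too old."""


@dataclass
class FakeAPIServer:
    """Hypothetical in-memory stand-in for kube-apiserver (illustration only)."""
    resource_version: int = 100
    oldest_watchable: int = 0  # versions below this count as no longer watchable
    calls: list = field(default_factory=list)

    def list_pods(self):
        """Record a LIST call and return the current resourceVersion."""
        self.calls.append("LIST")
        return self.resource_version

    def watch_pods(self, since):
        """Record a WATCH call; return the last version seen before timeout."""
        self.calls.append(f"WATCH rv={since}")
        if since < self.oldest_watchable:
            raise Gone(since)
        # Simulate a few events arriving before the watch times out.
        self.resource_version += 3
        return self.resource_version


def run_cache(server, watch_rounds):
    """List once, then keep re-establishing the watch from the last version.

    A relist happens only when the server answers 410 Gone, not on every
    watch timeout.
    """
    rv = server.list_pods()
    for _ in range(watch_rounds):
        try:
            rv = server.watch_pods(since=rv)  # watch ended normally (timeout)
        except Gone:
            rv = server.list_pods()  # version too old: fall back to a relist
    return rv
```

Calling `run_cache(FakeAPIServer(), watch_rounds=3)` records a single LIST followed by three WATCH calls, each starting from the version returned by the previous one.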
@crenshaw-dev thanks for the review - I'll respond to the comments here later on.
FWIW, as agreed offline during yesterday's sync, I split the fix into two PRs: this one just drops the unnecessary listing after watch expiry, and https://github.com/argoproj/gitops-engine/pull/617 makes the list API calls target the watch cache instead of etcd.
I guess we can proceed with the latter one.
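For context on the split, the difference between the two kinds of LIST calls comes down to the `resourceVersion` query parameter. In the Kubernetes API, a LIST with `resourceVersion=0` accepts data at any version and may be served from the API server's watch cache, while a LIST without `resourceVersion` asks for the most recent data, which has historically meant a quorum read from etcd. A minimal Python sketch that builds both URLs follows; `list_pods_url` is a hypothetical helper written for illustration, not part of any client library.

```python
from urllib.parse import urlencode


def list_pods_url(from_watch_cache, limit=None):
    """Build a LIST URL for Pods (hypothetical helper for illustration)."""
    params = {}
    if limit is not None:
        params["limit"] = limit
    if from_watch_cache:
        # resourceVersion=0 means "any version", which lets the API server
        # answer from its watch cache.
        params["resourceVersion"] = 0
    # Omitting resourceVersion asks for the most recent data instead.
    return "/api/v1/pods" + ("?" + urlencode(params) if params else "")
```

For example, `list_pods_url(True, limit=500)` produces a cache-friendly request, whereas `list_pods_url(False, limit=500)` leaves `resourceVersion` unset.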
Quality Gate passed
Issues: 0 new issues, 0 accepted issues
Measures: 0 security hotspots, 0.0% coverage on new code, 0.0% duplication on new code
This LGTM from a k8s perspective.
The gitops-engine repository is migrating to https://github.com/argoproj/argo-cd. We are closing all draft PRs. Please feel free to submit it again once the migration is over.