argocd-extension-metrics
Some graphs randomly failed with error 400
I have a weird issue where some graphs do not show (see screenshot).
When troubleshooting, I noticed that metrics-server returns error 400 while Prometheus (Mimir) returns 200 for the same request:
2024-05-13 12:00:44.067 | [GIN] 2024/05/13 - 10:00:44 | 400 | 29.890186ms | 10.20.0.132 | GET "/api/applications/argocd/groupkinds/deployment/rows/pod/graphs/pod_memory_pie?name=argocd-metrics-server.*&namespace=argocd&application_name=argocd&project=eksstack&uid=1a966c99-85ed-4b0d-a3c9-1a9103a56836&duration=1h"
ts=2024-05-13T10:00:44.068005462Z caller=handler.go:372 level=info user=anonymous msg="query stats" component=query-frontend method=POST path=/prometheus/api/v1/query_range user_agent=Go-http-client/1.1 status_code=200 response_time=17.538713ms response_size_bytes=2276 query_wall_time_seconds=0.010108191 fetched_series_count=3 fetched_chunk_bytes=606 fetched_chunks_count=4 fetched_index_bytes=0 sharded_queries=0 split_queries=1 estimated_series_count=4 queue_time_seconds=2.0411e-05 param_start=2024-05-13T09:00:44.036Z param_step=60000 param_end=2024-05-13T10:00:44.036Z param_query="sum(rate(container_memory_usage_bytes{pod=~\"argocd-metrics-server.*\", container!=\"POD\", image!=\"\", container!=\"\", container_name!=\"POD\"}[5m])) by (pod)" length=1h5m0s time_since_min_time=1h5m0.014401369s time_since_max_time=14.401369ms results_cache_hit_bytes=0 results_cache_miss_bytes=1243 status=success
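To line up the failing request with the query Mimir actually ran, it can help to break the request URL from the GIN log line into its parts. This is only a minimal sketch: the path layout (`applications/<app>/groupkinds/<kind>/rows/<row>/graphs/<graph>`) is inferred from the single log line above, and `describe_request` is a hypothetical helper, not something provided by argocd-extension-metrics.

```python
from urllib.parse import urlsplit, parse_qs

# Request path copied verbatim from the metrics-server (GIN) log line.
REQUEST = (
    "/api/applications/argocd/groupkinds/deployment/rows/pod/graphs/pod_memory_pie"
    "?name=argocd-metrics-server.*&namespace=argocd&application_name=argocd"
    "&project=eksstack&uid=1a966c99-85ed-4b0d-a3c9-1a9103a56836&duration=1h"
)


def describe_request(request: str) -> dict:
    """Split a logged request into path components and query parameters.

    Assumption: after the leading "api" segment, the path alternates
    between a key and its value, as seen in the log line above.
    """
    parts = urlsplit(request)
    segments = parts.path.strip("/").split("/")
    keys = segments[1::2]    # applications, groupkinds, rows, graphs
    values = segments[2::2]  # argocd, deployment, pod, pod_memory_pie
    info = dict(zip(keys, values))
    # parse_qs returns a list per key; each key appears once here.
    info["params"] = {k: v[0] for k, v in parse_qs(parts.query).items()}
    return info


if __name__ == "__main__":
    info = describe_request(REQUEST)
    for key, value in info.items():
        print(f"{key}: {value}")
```

The `name` parameter printed here (`argocd-metrics-server.*`) is the same pattern that appears in the `pod=~` matcher of `param_query` in the Mimir log line, which suggests the two log lines belong to the same graph request.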