stackdriver_exporter
Aggregate-deltas incompatible with `?collect` parameter in v0.15, and completely broken on `master`
My latest change (https://github.com/prometheus-community/stackdriver_exporter/commit/bc18b73dab0e6454285d29150a7bcd99578da8d5) appears to have broken the `--monitoring.aggregate-deltas` feature 😞 My apologies! 🤦🏻
On v0.15.0, `newHandler` is called once at startup. It calls `innerHandler` once, which creates the `InMemoryCounterStore` and `InMemoryHistogramStore` used for delta metrics aggregation. `innerHandler` then returns an `http.Handler` that `ServeHTTP` calls on each request to `/metrics`, so every scrape reuses the same stores.
The exception (in v0.15.0) is when call-time metric filters are supplied via `?collect=`. In that case `innerHandler` is invoked on every request, so the stores are recreated each time and delta metrics aggregation does not work.
Unfortunately, with my change, `innerHandler` is now invoked on every call to `/metrics`, with or without filters (the same thing that happens on v0.15.0 only when call-time metric filters are supplied). As a result, delta aggregation never accumulates across scrapes.
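To illustrate the failure mode, here is a minimal Go sketch. It is not the exporter's actual code: `deltaStore`, `sharedStoreHandler`, `perRequestStoreHandler` and `scrape` are hypothetical names, and the store is a simplified stand-in for `InMemoryCounterStore`. It only shows why building the store inside the per-request path discards the accumulated total.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// deltaStore is a hypothetical, simplified stand-in for the exporter's
// in-memory counter store: it sums DELTA samples into a running total.
type deltaStore struct {
	mu    sync.Mutex
	total float64
}

// add accumulates one delta sample and returns the new running total.
func (s *deltaStore) add(delta float64) float64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.total += delta
	return s.total
}

// sharedStoreHandler builds the store once, outside the request path,
// so every request accumulates into the same total.
func sharedStoreHandler(delta float64) http.Handler {
	store := &deltaStore{}
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "%g", store.add(delta))
	})
}

// perRequestStoreHandler builds a fresh store on every request,
// so the total starts from zero each time.
func perRequestStoreHandler(delta float64) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		store := &deltaStore{}
		fmt.Fprintf(w, "%g", store.add(delta))
	})
}

// scrape sends n GET requests to h and returns the last response body.
func scrape(h http.Handler, n int) string {
	var body string
	for i := 0; i < n; i++ {
		rec := httptest.NewRecorder()
		h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/metrics", nil))
		body = rec.Body.String()
	}
	return body
}

func main() {
	fmt.Println("shared store, 3 scrapes:", scrape(sharedStoreHandler(25), 3))
	fmt.Println("per-request store, 3 scrapes:", scrape(perRequestStoreHandler(25), 3))
}
```

With a constant delta of 25 per scrape, the shared-store handler reports a growing total while the per-request handler reports the bare delta every time. That matches the gauge-like output in the reproduction steps.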
Steps to reproduce
On v0.15.0, delta aggregation works
docker run prometheuscommunity/stackdriver-exporter-linux-amd64:v0.15.0 \
--monitoring.aggregate-deltas \
--monitoring.aggregate-deltas-ttl=30m \
--google.project-id=my-project \
--monitoring.metrics-type-prefixes="cloudsql.googleapis.com/database/cpu/usage_time" # DELTA type
# counter is monotonically increasing
❯ curl -s http://localhost:9255/metrics | grep '^stackdriver_.*_cpu_usage_time'
stackdriver_cloudsql_database_cloudsql_googleapis_com_database_cpu_usage_time{database_id="my-project:mydb-master-6452f9dc",project_id="my-project",region="us-central",unit="s{CPU}"} 47.54024539538659 1710348060000
❯ curl -s http://localhost:9255/metrics | grep '^stackdriver_.*_cpu_usage_time'
stackdriver_cloudsql_database_cloudsql_googleapis_com_database_cpu_usage_time{database_id="my-project:mydb-master-6452f9dc",project_id="my-project",region="us-central",unit="s{CPU}"} 72.79858832078753 1710348060000
❯ curl -s http://localhost:9255/metrics | grep '^stackdriver_.*_cpu_usage_time'
stackdriver_cloudsql_database_cloudsql_googleapis_com_database_cpu_usage_time{database_id="my-project:mydb-master-6452f9dc",project_id="my-project",region="us-central",unit="s{CPU}"} 97.56146611412987 1710348240000
But not if ?collect= is used.
# counter has reset
❯ curl -s 'http://localhost:9255/metrics?collect=cloudsql.googleapis.com/database/cpu/usage_time' | grep '^stackdriver_.*_cpu_usage_time'
stackdriver_cloudsql_database_cloudsql_googleapis_com_database_cpu_usage_time{database_id="my-project:mydb-master-6452f9dc",project_id="my-project",region="us-central",unit="s{CPU}"} 25.679574115085416 1710348420000
On master, delta aggregation doesn't work at all
docker run prometheuscommunity/stackdriver-exporter-linux-amd64:master \
--monitoring.aggregate-deltas \
--monitoring.aggregate-deltas-ttl=30m \
--google.project-id=my-project \
--monitoring.metrics-type-prefixes="cloudsql.googleapis.com/database/cpu/usage_time" # DELTA type
# counter fluctuates as a gauge, even though --monitoring.aggregate-deltas is enabled
❯ curl -s 'http://localhost:9255/metrics' | grep '^stackdriver_.*_cpu_usage_time'
stackdriver_cloudsql_database_cloudsql_googleapis_com_database_cpu_usage_time{database_id="my-project:mydb-master-6452f9dc",project_id="my-project",region="us-central",unit="s{CPU}"} 25.63321346603334 1710348600000
❯ curl -s 'http://localhost:9255/metrics' | grep '^stackdriver_.*_cpu_usage_time'
stackdriver_cloudsql_database_cloudsql_googleapis_com_database_cpu_usage_time{database_id="my-project:mydb-master-6452f9dc",project_id="my-project",region="us-central",unit="s{CPU}"} 26.545367503887974 1710348660000
❯ curl -s 'http://localhost:9255/metrics' | grep '^stackdriver_.*_cpu_usage_time'
stackdriver_cloudsql_database_cloudsql_googleapis_com_database_cpu_usage_time{database_id="my-project:mydb-master-6452f9dc",project_id="my-project",region="us-central",unit="s{CPU}"} 23.90490857156692 1710348720000