stackdriver_exporter
Memory Store Metrics Extraction Error
Start command: ./stackdriver_exporter --google.project-id $project --web.listen-address=:9259 --monitoring.metrics-type-prefixes redis.googleapis.com &
Error while querying redis/memorystore exporter:
[root@prometheus-vm04 stackdriver_exporter-0.5.1.linux-amd64]# curl localhost:9259/metrics
ERRO[0011] Error retrieving Time Series metrics for descriptor redis.googleapis.com/commands/calls: googleapi: Error 500: An internal error occurred., backendError source="monitoring_collector.go:176"
ERRO[0011] Error while getting Google Stackdriver Monitoring metrics: googleapi: Error 500: An internal error occurred., backendError source="monitoring_collector.go:132"
An error has occurred during metrics collection:
4 error(s) occurred:
- collected metric stackdriver_redis_instance_redis_googleapis_com_stats_cpu_utilization label:<name:"instance_id" value:"projects/vuclip-ubs-poc/locations/us-central1/instances/ubs-poc-redis" > label:<name:"node_id" value:"v1-usce1-ubs-poc-redis-736-0000" > label:<name:"project_id" value:"vuclip-ubs-poc" > label:<name:"region" value:"us-central1" > label:<name:"unit" value:"Cycles" > gauge:<value:0.08999999999991815 > was collected before with the same name and label values
- collected metric stackdriver_redis_instance_redis_googleapis_com_stats_cpu_utilization label:<name:"instance_id" value:"projects/vuclip-ubs-poc/locations/us-central1/instances/ubs-poc-redis" > label:<name:"node_id" value:"v1-usce1-ubs-poc-redis-736-0000" > label:<name:"project_id" value:"vuclip-ubs-poc" > label:<name:"region" value:"us-central1" > label:<name:"unit" value:"Cycles" > gauge:<value:0 > was collected before with the same name and label values
- collected metric stackdriver_redis_instance_redis_googleapis_com_stats_cpu_utilization label:<name:"instance_id" value:"projects/vuclip-ubs-poc/locations/us-central1/instances/ubs-poc-redis" > label:<name:"node_id" value:"v1-usce1-ubs-poc-redis-736-0000" > label:<name:"project_id" value:"vuclip-ubs-poc" > label:<name:"region" value:"us-central1" > label:<name:"unit" value:"Cycles" > gauge:<value:0.049999999999954525 > was collected before with the same name and label values
- collected metric stackdriver_redis_instance_redis_googleapis_com_stats_network_traffic label:<name:"instance_id" value:"projects/vuclip-ubs-poc/locations/us-central1/instances/ubs-poc-redis" > label:<name:"node_id" value:"v1-usce1-ubs-poc-redis-736-0000" > label:<name:"project_id" value:"vuclip-ubs-poc" > label:<name:"region" value:"us-central1" > label:<name:"unit" value:"By" > gauge:<value:290193 > was collected before with the same name and label values
The reason for this behavior is that the Stackdriver Monitoring API V3 emits 4 values for the metric type redis.googleapis.com/stats/cpu_utilization per node. These probably correspond to the 4 different CPU cores Redis runs on, but an identifier for the core is not emitted as a metric label. Prometheus, however, expects the combination of metric name and label key-value pairs to be unique within a scrape, which is not the case here.
Possible workarounds are:
- Computing an average value per core
- Dropping any but one value for all duplicate metrics
- Adding a random label for duplicate metrics
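To make the first two workarounds concrete, here is a minimal sketch in Python (not the exporter's actual Go code). It assumes samples arrive as `(name, labels, value)` tuples; that shape, the function names, and the example values are all hypothetical and only illustrate the idea of keying a series by its name plus label set.

```python
from statistics import mean
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical sample shape: (metric name, labels, value).
Labels = Dict[str, str]
Sample = Tuple[str, Labels, float]
SeriesKey = Tuple[str, FrozenSet[Tuple[str, str]]]


def series_key(name: str, labels: Labels) -> SeriesKey:
    """Identity of a series: metric name plus the full set of label pairs."""
    return (name, frozenset(labels.items()))


def keep_first(samples: List[Sample]) -> List[Sample]:
    """Workaround 2: drop every sample whose series key was already seen."""
    seen = set()
    kept = []
    for name, labels, value in samples:
        key = series_key(name, labels)
        if key not in seen:
            seen.add(key)
            kept.append((name, labels, value))
    return kept


def average_duplicates(samples: List[Sample]) -> List[Sample]:
    """Workaround 1: collapse samples sharing a series key into their mean."""
    grouped: Dict[SeriesKey, Tuple[str, Labels, List[float]]] = {}
    for name, labels, value in samples:
        entry = grouped.setdefault(series_key(name, labels), (name, labels, []))
        entry[2].append(value)
    return [(name, labels, mean(values)) for name, labels, values in grouped.values()]


if __name__ == "__main__":
    # Made-up values: four CPU samples with identical labels, one network sample.
    cpu = {"node_id": "node-0", "region": "us-central1", "unit": "Cycles"}
    net = {"node_id": "node-0", "region": "us-central1", "unit": "By"}
    samples = [
        ("cpu_utilization", cpu, 0.25),
        ("cpu_utilization", cpu, 0.0),
        ("cpu_utilization", cpu, 0.5),
        ("cpu_utilization", cpu, 0.25),
        ("network_traffic", net, 290193.0),
    ]
    print(keep_first(samples))
    print(average_duplicates(samples))
```

Either function leaves at most one sample per name and label set, which is what a Prometheus scrape requires. Keeping the first value discards data silently, while averaging only makes sense if the duplicates really are per-core readings, which is a guess at this point.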
I implemented the second workaround (because it was the simplest) in my fork, but I don't consider this a reasonable fix for the issue. See this commit: https://github.com/janhicken/stackdriver_exporter/commit/ec612cc0d1ab8c03a29812cd662ee9af5a995617
There are known issues with the memorystore metrics. Apparently there are multiple internal bugs, and one of them is filed at https://issuetracker.google.com/issues/119642989. I opened tickets, and the support rep said they know about the issues and are working on them.
I'm not sure if I'm seeing the same issue as here or the one in #48, but it is definitely one of them, with cloudtasks as the prefix. FWIW, I also seem to get duplicate rows from the metrics explorer. I haven't dug into the API response yet.
Any updates on this? https://issuetracker.google.com/issues/119642989 seems to be fixed now.
The API is still missing the labels. I've opened a new issue (as requested by the bot): https://issuetracker.google.com/issues/143057656
I do see the same error with redis.googleapis.com/stats/memory/usage_ratio, although there is no additional label mentioned in the docs (https://cloud.google.com/monitoring/api/metrics_gcp#gcp-redis).