stackdriver_exporter

Memory Store Metrics Extraction Error

Open niravshah2705 opened this issue 7 years ago • 5 comments

Start command: ./stackdriver_exporter --google.project-id $project --web.listen-address=:9259 --monitoring.metrics-type-prefixes redis.googleapis.com &

Error while querying redis/memorystore exporter:

[root@prometheus-vm04 stackdriver_exporter-0.5.1.linux-amd64]# curl localhost:9259/metrics
ERRO[0011] Error retrieving Time Series metrics for descriptor redis.googleapis.com/commands/calls: googleapi: Error 500: An internal error occurred., backendError source="monitoring_collector.go:176"
ERRO[0011] Error while getting Google Stackdriver Monitoring metrics: googleapi: Error 500: An internal error occurred., backendError source="monitoring_collector.go:132"
An error has occurred during metrics collection:

4 error(s) occurred:

  • collected metric stackdriver_redis_instance_redis_googleapis_com_stats_cpu_utilization label:<name:"instance_id" value:"projects/vuclip-ubs-poc/locations/us-central1/instances/ubs-poc-redis" > label:<name:"node_id" value:"v1-usce1-ubs-poc-redis-736-0000" > label:<name:"project_id" value:"vuclip-ubs-poc" > label:<name:"region" value:"us-central1" > label:<name:"unit" value:"Cycles" > gauge:<value:0.08999999999991815 > was collected before with the same name and label values
  • collected metric stackdriver_redis_instance_redis_googleapis_com_stats_cpu_utilization label:<name:"instance_id" value:"projects/vuclip-ubs-poc/locations/us-central1/instances/ubs-poc-redis" > label:<name:"node_id" value:"v1-usce1-ubs-poc-redis-736-0000" > label:<name:"project_id" value:"vuclip-ubs-poc" > label:<name:"region" value:"us-central1" > label:<name:"unit" value:"Cycles" > gauge:<value:0 > was collected before with the same name and label values
  • collected metric stackdriver_redis_instance_redis_googleapis_com_stats_cpu_utilization label:<name:"instance_id" value:"projects/vuclip-ubs-poc/locations/us-central1/instances/ubs-poc-redis" > label:<name:"node_id" value:"v1-usce1-ubs-poc-redis-736-0000" > label:<name:"project_id" value:"vuclip-ubs-poc" > label:<name:"region" value:"us-central1" > label:<name:"unit" value:"Cycles" > gauge:<value:0.049999999999954525 > was collected before with the same name and label values
  • collected metric stackdriver_redis_instance_redis_googleapis_com_stats_network_traffic label:<name:"instance_id" value:"projects/vuclip-ubs-poc/locations/us-central1/instances/ubs-poc-redis" > label:<name:"node_id" value:"v1-usce1-ubs-poc-redis-736-0000" > label:<name:"project_id" value:"vuclip-ubs-poc" > label:<name:"region" value:"us-central1" > label:<name:"unit" value:"By" > gauge:<value:290193 > was collected before with the same name and label values

niravshah2705 avatar Sep 21 '18 04:09 niravshah2705

The reason for this behavior is that the Stackdriver Monitoring API V3 emits 4 values for the metric type redis.googleapis.com/stats/cpu_utilization per node. These probably correspond to the 4 CPU cores Redis runs on, but no identifier for the core is emitted as a metric label. Prometheus, however, expects the combination of metric name and label key-value pairs to be unique within a scrape, which is not the case here.
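To illustrate the uniqueness rule, here is a minimal Python sketch (not the exporter's Go code). It treats a sample's identity as the metric name plus its full label set and reports identities that occur more than once. The function names `series_key` and `find_duplicates` are made up for illustration. The label set is shortened, and the values are the three from the error output above.

```python
from collections import Counter


def series_key(name, labels):
    """Identity of a sample: metric name plus the sorted label key-value pairs."""
    return (name, tuple(sorted(labels.items())))


def find_duplicates(samples):
    """Return every identity that appears more than once in one scrape."""
    counts = Counter(series_key(name, labels) for name, labels, _value in samples)
    return [key for key, n in counts.items() if n > 1]


# Shortened label set taken from the error output; no label identifies the core.
labels = {
    "node_id": "v1-usce1-ubs-poc-redis-736-0000",
    "region": "us-central1",
    "unit": "Cycles",
}
name = "stackdriver_redis_instance_redis_googleapis_com_stats_cpu_utilization"
samples = [
    (name, labels, 0.08999999999991815),
    (name, labels, 0),
    (name, labels, 0.049999999999954525),
]

duplicates = find_duplicates(samples)
print(len(duplicates))
```

All three samples share one identity, so a registry that enforces uniqueness can accept only the first and has to reject the rest.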

Possible workarounds are:

  • Computing an average value per core
  • Dropping any but one value for all duplicate metrics
  • Adding a random label for duplicate metrics
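The three workarounds can be sketched roughly as follows, in Python rather than the exporter's Go. This is an illustration under assumptions, not the code from my fork: all function names and the `dup_index` label are hypothetical, and the third variant uses a running index instead of a truly random label so that the output stays stable between scrapes.

```python
from collections import defaultdict


def _group(samples):
    """Group sample values by identity (metric name + sorted label pairs)."""
    groups = defaultdict(list)
    for name, labels, value in samples:
        groups[(name, tuple(sorted(labels.items())))].append(value)
    return groups


def average_duplicates(samples):
    """Workaround 1: collapse duplicates into their mean value."""
    return [
        (name, dict(labels), sum(values) / len(values))
        for (name, labels), values in _group(samples).items()
    ]


def keep_first(samples):
    """Workaround 2: keep the first value and drop the other duplicates."""
    return [
        (name, dict(labels), values[0])
        for (name, labels), values in _group(samples).items()
    ]


def add_index_label(samples, label="dup_index"):
    """Workaround 3: make each duplicate unique with an extra (hypothetical) label."""
    out = []
    for (name, labels), values in _group(samples).items():
        for i, value in enumerate(values):
            out.append((name, {**dict(labels), label: str(i)}, value))
    return out


# Two samples with identical name and labels, as the API returns them.
samples = [
    ("cpu_utilization", {"node_id": "n0"}, 1.0),
    ("cpu_utilization", {"node_id": "n0"}, 3.0),
]
print(average_duplicates(samples))
print(keep_first(samples))
print(add_index_label(samples))
```

Averaging and keeping one value both lose per-core detail. The extra label preserves every value, but the index carries no guaranteed meaning, because the API gives no indication of which value belongs to which core.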

I implemented the second workaround in my fork because it was the simplest, but I don't consider it a reasonable fix for the issue. See this commit: https://github.com/janhicken/stackdriver_exporter/commit/ec612cc0d1ab8c03a29812cd662ee9af5a995617

janhicken avatar Nov 21 '18 13:11 janhicken

There are known issues with the memorystore metrics. Apparently there are multiple internal bugs, and one of them is filed at https://issuetracker.google.com/issues/119642989. I opened tickets, and the support rep said they know about the issues and are working on them.

jameshartig avatar Jan 18 '19 02:01 jameshartig

I'm not sure if I'm seeing the same issue as here or in #48, but definitely one of them with cloudtasks as prefix. FWIW, I also seem to get duplicate rows from the metrics explorer. I haven't dug into the API response yet.

Stelminator avatar Apr 22 '19 23:04 Stelminator

Any updates on this? https://issuetracker.google.com/issues/119642989 seems to be fixed now.

axot avatar Aug 27 '19 12:08 axot

The API is still missing the labels. I've opened a new issue (as requested by the bot): https://issuetracker.google.com/issues/143057656

I do see the same error with redis.googleapis.com/stats/memory/usage_ratio, although there is no additional label mentioned in the docs (https://cloud.google.com/monitoring/api/metrics_gcp#gcp-redis).

jayme-github avatar Oct 21 '19 13:10 jayme-github