stackdriver_exporter
Inconsistent metrics from storage.googleapis.com
Hello,
I'm having trouble scraping the storage metrics. Sometimes I get all the metrics, sometimes only some of them, and sometimes none at all.
I started the exporter with just the storage prefix and verbose output:
STACKDRIVER_EXPORTER_MONITORING_METRICS_TYPE_PREFIXES=storage.googleapis.com/
And observed the following logs:
testsd_1 | time="2018-11-15T08:27:11Z" level=debug msg="Listing Google Stackdriver Monitoring metric descriptors starting with `storage.googleapis.com/`..." source="monitoring_collector.go:213"
testsd_1 | time="2018-11-15T08:27:11Z" level=debug msg="Retrieving Google Stackdriver Monitoring metrics for descriptor `storage.googleapis.com/network/received_bytes_count`..." source="monitoring_collector.go:169"
testsd_1 | time="2018-11-15T08:27:11Z" level=debug msg="Retrieving Google Stackdriver Monitoring metrics for descriptor `storage.googleapis.com/network/sent_bytes_count`..." source="monitoring_collector.go:169"
testsd_1 | time="2018-11-15T08:27:11Z" level=debug msg="Retrieving Google Stackdriver Monitoring metrics for descriptor `storage.googleapis.com/storage/object_count`..." source="monitoring_collector.go:169"
testsd_1 | time="2018-11-15T08:27:11Z" level=debug msg="Retrieving Google Stackdriver Monitoring metrics for descriptor `storage.googleapis.com/api/request_count`..." source="monitoring_collector.go:169"
testsd_1 | time="2018-11-15T08:27:11Z" level=debug msg="Retrieving Google Stackdriver Monitoring metrics for descriptor `storage.googleapis.com/storage/total_byte_seconds`..." source="monitoring_collector.go:169"
testsd_1 | time="2018-11-15T08:27:11Z" level=debug msg="Retrieving Google Stackdriver Monitoring metrics for descriptor `storage.googleapis.com/storage/total_bytes`..." source="monitoring_collector.go:169"
But only one metric is returned:
# curl -vs localhost:9256/metrics 2>&1 | grep gcs_bucket | sed 's/{.*/{.../'
# HELP stackdriver_gcs_bucket_storage_googleapis_com_storage_object_count Total number of objects per bucket, grouped by storage class. Values are measured once per day.
# TYPE stackdriver_gcs_bucket_storage_googleapis_com_storage_object_count gauge
stackdriver_gcs_bucket_storage_googleapis_com_storage_object_count{...
stackdriver_gcs_bucket_storage_googleapis_com_storage_object_count{...
stackdriver_gcs_bucket_storage_googleapis_com_storage_object_count{...
stackdriver_gcs_bucket_storage_googleapis_com_storage_object_count{...
Or two:
# curl -vs localhost:9256/metrics 2>&1 | grep gcs_bucket | sed 's/{.*/{.../'
# HELP stackdriver_gcs_bucket_storage_googleapis_com_storage_object_count Total number of objects per bucket, grouped by storage class. Values are measured once per day.
# TYPE stackdriver_gcs_bucket_storage_googleapis_com_storage_object_count gauge
stackdriver_gcs_bucket_storage_googleapis_com_storage_object_count{...
stackdriver_gcs_bucket_storage_googleapis_com_storage_object_count{...
stackdriver_gcs_bucket_storage_googleapis_com_storage_object_count{...
stackdriver_gcs_bucket_storage_googleapis_com_storage_object_count{...
# HELP stackdriver_gcs_bucket_storage_googleapis_com_storage_total_byte_seconds Total daily storage in byte*seconds used by the bucket, grouped by storage class.
# TYPE stackdriver_gcs_bucket_storage_googleapis_com_storage_total_byte_seconds gauge
stackdriver_gcs_bucket_storage_googleapis_com_storage_total_byte_seconds{...
stackdriver_gcs_bucket_storage_googleapis_com_storage_total_byte_seconds{...
stackdriver_gcs_bucket_storage_googleapis_com_storage_total_byte_seconds{...
stackdriver_gcs_bucket_storage_googleapis_com_storage_total_byte_seconds{...
Any idea why all the metrics (such as stackdriver_gcs_bucket_storage_googleapis_com_storage_total_bytes) are not returned on every scrape?
Thanks.
We are seeing this too, with stackdriver_gcs_bucket_storage_googleapis_com_storage_total_bytes
being returned inconsistently: it appears every 5 minutes, then disappears again.

Could this be some sort of API limit and the exporter is just not caching the results?
Ah, it's right there in https://cloud.google.com/monitoring/api/metrics_gcp#gcp-storage:
Total size of all objects in the bucket, grouped by storage class. Values are measured once per day. Sampled every 300 seconds. After sampling, data is not visible for up to 600 seconds.
So we need to cache this result as it will not be available for up to 10 minutes.
Hey, I'm facing exactly the same problem. It also seems that the sampling doesn't work: I see changes in storage_total_bytes
only once a day. I also don't understand the Stackdriver documentation. For example, the storage docs say:
storage/total_bytes: Total size of all objects in the bucket, grouped by storage class.
Values are measured once per day. Sampled every 300 seconds. After sampling, data
is not visible for up to 600 seconds.
What exactly does this mean? I interpret this as:
- once a day, the actual values are calculated
- but every 300 seconds, a sample is taken to give an educated guess of the total_bytes
Now please see my Stackdriver console output below. It shows the past 7 days, and the graph is only updated once a day. I don't see the sampling data changing the graphs at all. This basically makes the metrics useless (since they are up to 24h old).
Hi @frodenas,
I'm also having the same issue with another metric.
I found this documentation regarding the "not visible for up to" wording: https://cloud.google.com/monitoring/api/metrics#metadata
My interpretation of "Sampled every 300 seconds." is that every 300 seconds Google goes into the bucket and counts the number of objects, then sends the count to some backing store. "After sampling, data is not visible for up to 600 seconds." is the time the count takes to become consistent in that store; only after it is consistent can we read it via the API.
This has a big impact on metric collection: to get a non-empty value, we need to be able to configure some delay on the metric timestamp.
E.g. for object count, when the stackdriver_exporter collection runs, it should fetch the metric value from at least 600 seconds ago; otherwise it will get empty values, as the data is not yet visible (the 600 seconds haven't passed yet).
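As a rough illustration (not the exporter's actual code; the GNU date syntax and the 300-second collection interval are assumptions), shifting the query window back by the ingest delay would look like this:
OFFSET=610      # ingest delay plus a small margin: data may be invisible for up to 600s
INTERVAL=300    # assumed collection interval, matching the 300s sampling period
# End the window OFFSET seconds in the past, so the newest requested samples
# are already past the "not visible for up to 600 seconds" delay.
END=$(date -u -d "-${OFFSET} seconds" +%Y-%m-%dT%H:%M:%SZ)
START=$(date -u -d "-$((OFFSET + INTERVAL)) seconds" +%Y-%m-%dT%H:%M:%SZ)
echo "query interval: [${START}, ${END}]"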
Regards
Stackdriver exporter already supports the option to set an offset on the metrics date.
See: https://github.com/frodenas/stackdriver_exporter/blob/master/collectors/monitoring_collector.go#L179
Tracing back through the code, it can be configured on the CLI with the switch --monitoring.metrics-offset=610s.
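For example, a minimal sketch of a full invocation (apart from --monitoring.metrics-offset, the flag names and the project id below are assumptions based on the exporter's usual CLI, so adjust them to your setup):
stackdriver_exporter \
  --google.project-id=my-project \
  --monitoring.metrics-type-prefixes=storage.googleapis.com/ \
  --monitoring.metrics-offset=610s
With the offset in place, each scrape asks Stackdriver for a window that ends 610 seconds in the past, so the delayed GCS samples should already be visible.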