yet-another-cloudwatch-exporter
[FEATURE] Support to expose historical data points
Is there an existing issue for this?
- [X] I have searched the existing issues
Feature description
Context
In the context of jobs, YACE allows scraping CloudWatch metrics, and the frequency and amount of data points are controlled by (a small sketch of how they combine follows the list):
- period (int): statistic period in seconds
- length (int): how far back to request data in seconds
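As I understand it, these two settings simply define the time window YACE asks CloudWatch for. A minimal sketch of that arithmetic, with the example values used later in this issue (the variable names are mine, not YACE's):

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	// Values from the example config below: 1-minute period, 5 minutes of history.
	period := 60 * time.Second
	length := 300 * time.Second

	// The query window ends "now" and reaches back `length` seconds.
	end := time.Now()
	start := end.Add(-length)

	// CloudWatch aggregates one datapoint per `period` inside the window,
	// so this request can return up to length/period datapoints (5 here).
	fmt.Printf("window %s -> %s, up to %d datapoints\n",
		start.Format(time.RFC3339), end.Format(time.RFC3339), int(length/period))
}
```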
All the examples in the examples folder show how to scrape different types of metrics, and in them period and length always happen to be the same. I'm currently working with a client who adopted YACE to scrape metrics, and we configured it to get data points with a granularity of 1 minute and 5 minutes' worth of data.
Take the following yace-config.yaml as a concrete example:
```yaml
apiVersion: v1alpha1
sts-region: eu-west-2
discovery:
  jobs:
    - type: alb
      regions:
        - eu-west-2
      period: 60
      length: 300
      addCloudwatchTimestamp: true
      dimensionNameRequirements:
        - LoadBalancer
      metrics:
        - name: TargetResponseTime
          statistics: [Average]
```
Expected Behaviour
We ran YACE, via docker-compose, using the configuration above and were expecting to see the following:
- As YACE is by default configured to scrape every 5 minutes, the metrics in Prometheus were delayed by 5 minutes: AS EXPECTED
- Prometheus was configured to scrape YACE every minute (again the default configuration), so we were expecting to see 5 data points in Prometheus shortly after YACE made them available on the /metrics endpoint. However, only the last data point was presented: NOT EXPECTED
I spent a fair amount of time trying to understand whether we had misconfigured YACE, but after enabling the debug logs, looking at the source code, and running YACE locally, I realised that's how it is currently implemented. See https://github.com/nerdswords/yet-another-cloudwatch-exporter/blob/c7807a770bb427f8ddb2c7becac51185fb3e8230/pkg/clients/cloudwatch/v1/client.go#L120-L128
I believe this is NOT a BUG but a design decision to only include the latest data points and not any historic data as Prometheus will reject any samples that are too old.
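For illustration, the behaviour I observed boils down to something like the following. This is a simplified sketch of the idea only, not the actual code from client.go; the type and function names are mine:

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// datapoint is a stand-in for the values YACE gets back from CloudWatch.
type datapoint struct {
	timestamp time.Time
	value     float64
}

// latestOnly mirrors the current behaviour as I understand it: sort the
// datapoints returned for the window and keep just the most recent one.
func latestOnly(points []datapoint) []datapoint {
	if len(points) == 0 {
		return nil
	}
	sort.Slice(points, func(i, j int) bool {
		return points[i].timestamp.Before(points[j].timestamp)
	})
	return points[len(points)-1:]
}

func main() {
	now := time.Now()
	// Five 1-minute datapoints, as in the period=60 / length=300 example.
	var points []datapoint
	for i := 4; i >= 0; i-- {
		points = append(points, datapoint{now.Add(-time.Duration(i) * time.Minute), float64(i)})
	}
	fmt.Println("exposed:", latestOnly(points)) // only the newest sample survives
}
```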
Having said that, I believe it might be really useful to be able to include historic data points if the period is small and the length does not go too far back (making sure Prometheus is still happy to ingest all the samples).
New Feature
Implementing the approach above could be a bit controversial, and it would introduce breaking changes (or extra behaviour whenever length is bigger than the period setting). Therefore, I'm proposing to introduce a new feature flag, e.g. -enable-feature=allow-multiple-datapoints or -enable-feature=allow-historical-datapoints, whereby all the data points are included. So, in the example above, rather than getting only the latest data point, the /metrics page would include the 5 fetched from the CloudWatch API.
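Very roughly, and reusing one of the hypothetical flag names suggested above, the change I have in mind looks like this. It is a sketch of the intent under my own assumptions, not a patch against the actual exporter code:

```go
package main

import (
	"flag"
	"fmt"
	"time"
)

type datapoint struct {
	timestamp time.Time
	value     float64
}

// allowHistorical is a hypothetical flag; the real exporter would wire this
// into its existing -enable-feature handling instead.
var allowHistorical = flag.Bool("allow-historical-datapoints", false,
	"expose every datapoint returned for the window, not just the latest")

// selectDatapoints keeps the current latest-only behaviour by default and
// only exposes the full window (assumed sorted oldest to newest) when the
// feature flag is enabled.
func selectDatapoints(points []datapoint) []datapoint {
	if len(points) == 0 {
		return nil
	}
	if *allowHistorical {
		return points // all 5 datapoints in the period=60 / length=300 example
	}
	return points[len(points)-1:] // today's behaviour: newest datapoint only
}

func main() {
	flag.Parse()
	now := time.Now()
	points := []datapoint{
		{now.Add(-4 * time.Minute), 0.21},
		{now.Add(-3 * time.Minute), 0.19},
		{now.Add(-2 * time.Minute), 0.25},
		{now.Add(-1 * time.Minute), 0.22},
		{now, 0.20},
	}
	fmt.Println(selectDatapoints(points))
}
```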
What might the configuration look like?
No extra configuration would be required, as I'm suggesting controlling this new behaviour globally via a new feature flag. That said, I'm open to suggestions, as it could be useful to have finer-grained control by including an extra config setting at the job level.
Anything else?
I'll include more details (images and logs) later on. I'm happy to give this a go myself, although it's been a while since I've done any serious development and I don't have much experience with golang 😅