prometheus-to-cloudwatch

Is there a limit to the number of metrics I can push?

Open · ashish235 opened this issue on Sep 17, 2018 · 8 comments

Hi guys,

First of all great work, I really like this project.

My Prometheus is scraping a lot of metrics from different exporters, but your tool is only sending 84 of them to CloudWatch. Is there a limit, or maybe a config option I'm missing?

From the logs I can't see much; is there a way to enable debug output?

2018/09/17 05:26:30 prometheus-to-cloudwatch: published 84 metrics to CloudWatch
2018/09/17 05:27:30 prometheus-to-cloudwatch: published 84 metrics to CloudWatch
2018/09/17 05:28:30 prometheus-to-cloudwatch: published 84 metrics to CloudWatch

One thing I noticed is that the metrics being pushed are only from the local host where Prometheus itself is running; nothing comes from the other nodes whose metrics I'm exporting, mainly via node exporter and cAdvisor.

Can you guys please help me out here?

Regards, Ashish

ashish235 avatar Sep 17 '18 05:09 ashish235

Thanks @ashish235!

@aknysh do you know if there are any inherent limits?

osterman avatar Sep 17 '18 22:09 osterman

@ashish235 @osterman sorry for the delay, let me take a look at that

aknysh avatar Sep 20 '18 13:09 aknysh

@ashish235 the module was designed to scrape just one Prometheus endpoint, which you specify in the prometheus_scrape_url input parameter. If the Prometheus server behind that URL exposes metrics collected from other nodes, those will be scraped as well (see the sketch at the end of this comment). We will not be able to add more features to this module at this time (@osterman ?), but you can take a look at other (official) solutions, which is the recommended approach, for example:

  • https://github.com/prometheus/cloudwatch_exporter
  • https://groups.google.com/forum/#!topic/prometheus-developers/3n7n0PGG7Vw
  • https://medium.com/@griggheo/initial-experiences-with-the-prometheus-monitoring-system-167054ac439c

Thank you for testing the module.
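
For illustration, here is a minimal sketch (not the module's actual code; the URL is a placeholder) of what scraping a single Prometheus endpoint looks like in Go. Whatever that one endpoint exposes is everything the tool can convert:

```go
package main

import (
	"fmt"
	"log"
	"net/http"

	"github.com/prometheus/common/expfmt"
)

func main() {
	// Placeholder: whichever single endpoint the scraper is pointed at.
	scrapeURL := "http://localhost:9090/metrics"

	resp, err := http.Get(scrapeURL)
	if err != nil {
		log.Fatalf("scrape failed: %v", err)
	}
	defer resp.Body.Close()

	// Parse the Prometheus text exposition format into metric families.
	var parser expfmt.TextParser
	families, err := parser.TextToMetricFamilies(resp.Body)
	if err != nil {
		log.Fatalf("parse failed: %v", err)
	}

	// Only series exposed by this one endpoint are seen; series that live
	// on other nodes or exporters never reach the converter.
	total := 0
	for name, mf := range families {
		fmt.Printf("%s: %d series\n", name, len(mf.GetMetric()))
		total += len(mf.GetMetric())
	}
	fmt.Printf("total series scraped: %d\n", total)
}
```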

aknysh avatar Sep 20 '18 14:09 aknysh

@aknysh I do not think he is asking for more features. I think what he is implying is that his endpoint returns more than 84 metrics, but the number pushed is always truncated to 84. Is there some kind of limit that you know about? E.g. a maximum batch size?

@aknysh is correct that we do not have any plans to invest more into this module at this time.

osterman avatar Sep 20 '18 15:09 osterman

Hmm, we don't impose any limits in the module itself; however, the CloudWatch API has some limits, which are noted here: https://github.com/cloudposse/prometheus-to-cloudwatch/blob/master/prometheus_to_cloudwatch.go#L167

@ashish235 as I understand it, the module pushes all 84 metrics collected locally, but nothing from "the other nodes from where I'm exporting metrics using node exporter and cadvisor", is that correct?

aknysh avatar Sep 20 '18 17:09 aknysh

Aha, the "Max 40kb request size" seems like a good place to start. @ashish235 if you are able to submit a PR, we'll promptly review it. Thanks!

osterman avatar Sep 20 '18 18:09 osterman

@osterman @aknysh

Yeah, actually I saw 2 more limits today.

I pointed the tool at a different Prometheus scrape URL, this time a cAdvisor endpoint. After that, I could see about 3400 metrics on that node (on its :8080/metrics page), so I limited cAdvisor's scope to scrape fewer metrics; the count was still about 540, but the logs always showed 84 metrics pushed. The other limits I hit are shown below.

2018/09/20 19:01:50 prometheus-to-cloudwatch: error publishing to CloudWatch: RequestEntityTooLarge: Request size 47335 exceeded 40960 bytes status code: 413, request id: a156ef5a-bd07-11e8-848a-3fb0164eca3b
2018/09/20 19:01:50 prometheus-to-cloudwatch: error publishing to CloudWatch: InvalidParameterValue: The collection MetricData must not have a size greater than 20. status code: 400, request id: a15ac04e-bd07-11e8-8e88-13873a57bca9

@osterman sorry mate, I'm not a developer, so I won't be able to help with code. :(

ashish235 avatar Sep 20 '18 19:09 ashish235

@ashish235 Well, at least we've identified the problem. Looks like large payloads need to be split up and batched into smaller requests.
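
A rough sketch of that batching, assuming the aws-sdk-go v1 CloudWatch client and the 20-datums-per-request limit reported in the error above (the function name, namespace, and sample data are illustrative, not the module's actual code):

```go
package main

import (
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/cloudwatch"
)

// publishInBatches splits the collected datums into chunks of at most
// maxBatch items so no single PutMetricData call exceeds the API's
// per-request collection limit (20, per the error above).
func publishInBatches(svc *cloudwatch.CloudWatch, namespace string, data []*cloudwatch.MetricDatum, maxBatch int) error {
	for start := 0; start < len(data); start += maxBatch {
		end := start + maxBatch
		if end > len(data) {
			end = len(data)
		}
		_, err := svc.PutMetricData(&cloudwatch.PutMetricDataInput{
			Namespace:  aws.String(namespace),
			MetricData: data[start:end],
		})
		if err != nil {
			return err
		}
	}
	return nil
}

func main() {
	sess := session.Must(session.NewSession(aws.NewConfig().WithRegion("us-east-1")))
	svc := cloudwatch.New(sess)

	// Placeholder datums standing in for the metrics scraped from Prometheus.
	data := []*cloudwatch.MetricDatum{
		{MetricName: aws.String("example_metric"), Value: aws.Float64(1)},
	}

	if err := publishInBatches(svc, "PrometheusToCloudWatch", data, 20); err != nil {
		log.Fatalf("publish failed: %v", err)
	}
}
```

A count-based split alone would not cover the 40960-byte request-size error above; a per-batch size check would still be needed on top of it, since 20 datums with many dimensions could exceed that limit.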


We do offer commercial support on all of our open source, so reach out to us at [email protected] if this is urgent. Either way, thanks for reporting the issue. If we have downtime in the coming months, we may get around to it, but I cannot provide an ETA.

osterman avatar Sep 20 '18 19:09 osterman