stackdriver_exporter

Fetch all descriptors in one request and filter in code

Open fredr opened this issue 4 years ago • 7 comments

The GCP pricing model for these APIs is to charge per request, so this patch would slightly decrease the number of requests in every scrape, and therefore reduce the cost a bit.

Preferably we could get all the time series that we are interested in with one request, but unfortunately the API won't let us get time series from several descriptors in one request.

The default/max page size is not documented for this API, but it does return all of our descriptors in one page (over 4000 descriptors).
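
A minimal sketch of the "fetch once, filter in code" idea described above, not the code from the patch. It assumes the full descriptor list has already been fetched in a single request and is represented here only by its metric type names; `filterByPrefix` and the sample data are hypothetical and used purely for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// filterByPrefix keeps only the descriptor type names that start with one of
// the configured prefixes. descriptorTypes stands in for the complete list
// returned by a single "list descriptors" request (an assumption of this sketch).
func filterByPrefix(descriptorTypes, prefixes []string) []string {
	var kept []string
	for _, t := range descriptorTypes {
		for _, p := range prefixes {
			if strings.HasPrefix(t, p) {
				kept = append(kept, t)
				break // avoid adding the same descriptor twice
			}
		}
	}
	return kept
}

func main() {
	// Example data only; a real list would come from the monitoring API.
	all := []string{
		"compute.googleapis.com/instance/cpu/utilization",
		"pubsub.googleapis.com/subscription/num_undelivered_messages",
		"storage.googleapis.com/api/request_count",
	}
	prefixes := []string{"compute.googleapis.com/", "pubsub.googleapis.com/"}
	fmt.Println(filterByPrefix(all, prefixes))
}
```

The trade-off is one larger list request per scrape instead of one request per configured prefix; the per-descriptor time series requests are unchanged.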

fredr avatar Jun 17 '21 11:06 fredr

How can we progress this PR?

weyert avatar Aug 02 '21 14:08 weyert

Hmm, I think things were done the current way in order to fetch large amounts of metrics faster, by executing them in parallel.

We needed to do this in order to reduce scrape intervals.

Have you tested how this impacts performance?

SuperQ avatar Jan 31 '22 14:01 SuperQ

> Hmm, I think things were done the current way in order to fetch large amounts of metrics faster, by executing them in parallel.

The metrics should still be fetched in parallel here: https://github.com/prometheus-community/stackdriver_exporter/pull/125/files#diff-b6f1dd37640aa1bdb3aff2c484e46fc67158d22598dca934306fa23185917da5R234

So I think this shouldn't affect the parallelism, if I remember correctly :thinking:

> Have you tested how this impacts performance?

The employer I worked for has been running this since I wrote it, with no observed performance impact.

fredr avatar Jan 31 '22 16:01 fredr

Sadly, I don't have a big stackdriver account to test this on anymore. Maybe @igorw can help with this?

SuperQ avatar Feb 01 '22 10:02 SuperQ

@SuperQ recruited me to test this in a somewhat larger setup. The PR was deployed around 15:10 in the screenshot. The two hours before that are a bit off because of difficulties during deployment.

[Screenshot: scrape_duration_seconds]

the-it avatar Apr 26 '22 09:04 the-it

Yes, that's a pretty big slowdown in collection. Perhaps we should instrument these phases of the collection so we know exactly how long each one takes.

SuperQ avatar Apr 28 '22 08:04 SuperQ

@the-it Just out of curiosity, do you know around what number of descriptors you are pulling?

Unfortunately I no longer manage a stackdriver setup, so I can't really investigate this. IIRC this fix reduced the price a bit, but the bulk of the requests comes from fetching the metrics, as that is one request per metric, so in the grand scheme of things it might not be worth it if you see a performance penalty.

fredr avatar Apr 28 '22 11:04 fredr

> Hmm, I think things were done the current way in order to fetch large amounts of metrics faster, by executing them in parallel.

Using stackdriver_exporter on 50+ projects, I get throttled by GCP (max 6000 rpm allowed per service account) after just ONE request to the /metrics endpoint.

@SuperQ Is it possible to add some --parallel=X parameter?

WojciechKuk avatar Dec 27 '22 16:12 WojciechKuk

Maybe this is enough: https://github.com/prometheus-community/stackdriver_exporter/pull/218

SuperQ avatar May 25 '23 18:05 SuperQ

> Maybe this is enough: #218

Nice! I'll close this one in favor of that one!

fredr avatar May 26 '23 12:05 fredr