client_golang icon indicating copy to clipboard operation
client_golang copied to clipboard

Propagate context.Context through Gather and Collect

Open ringerc opened this issue 1 year ago • 8 comments
trafficstars

I propose new variants of Gatherer and Collector that support golang's context.Context propagation through the request chain to provide a means of passing timeout and request information through to collectors.

This would greatly enhance the capabilities of promhttp middleware and make it much easier (and safer) to handle client-supplied timeout headers, query-params passed to a scrape endpoint, etc.

For example, a metrics endpoint that wishes to support query parameters for selective scraping currently has a difficult time capturing this and passing this through the layers of indirection. It's difficult to pass contextual info through the promhttp.Handler -> prometheus.Gatherer (possibly a Registry) -> prometheus.Collector chain. Especially since promhttp.HandlerForTransactional and the rest of promhttp offers few points of extension. This can be worked around by having an application's generic http handler create a new Prometheus handler, registry, etc in a closure for each request then delegate to it, but that can be inefficient and it loses cross-scrape-persistent instrumentation state.

If a context.Context was propagated from the promhttp.Handler through the layers, applications could pass relevant details transparently through Prometheus without exposing protocol-specifics etc to the Prometheus APIs.

I'm sure there are less intrusive and smarter ways to do this than I can come up with, which is:

  • Add variants of prometheus.Gatherer, prometheus.Gatherers, prometheus.Collector that accept a context.Context argument to Gather(...), Collect(...) etc. Expose Registry API to accept them. Methods aren't added to the existing interfaces as that would break API compatibility with existing implementations.
  • In promhttp.HandlerForTransactional, read the context.Context from the http.Request with Request.Context()
  • Propagate it through HandlerForTransactional's call Gather() as e.g. GatherWithContext(ctx)
  • Propagate it through the context-aware Gatherer as Collect(ctx) calls on the Collector(s)
  • The prometheus.CollectorWithContext (or whatever) implementations accept Collect(ctx context.context, ch chan<- prometheus.Metric) so they can use ctx.Value(...) to access contextual info, check a propagated timeout or cancellation function, etc.

This would also allow the documented limitations around timeout handing in promhttp to be fixed, because a context.WithTimeout(...) could be passed. promhttp could also automatically handle any client-supplied X-Prometheus-Scrape-Timeout-Seconds header; currently it has no way to propagate this knowledge down through the gatherer to the collectors.

It'd also be convenient to have a promhttp.WithContextFunc(...) that accepts a func(r *http.Request) context.Context argument and returns an Option. Akin to WithLabelFromCtx. So the app doesn't have to implement its own http.Handler to delegate to Prometheus's, it can use a Prometheus-supplied wrapper.

Maybe there's a sensible way to do this now. Or maybe it's reasonable to just make a new registry etc for each scrape, and just ensure that any more expensive application state is well separated from the prometheus API implementations so the app's Registry, its Collector instances etc are one-shot and request-specific. But that doesn't work well with the instrumentation wrappers in promhttp, and it's hard to see what would.

ringerc avatar Jun 18 '24 02:06 ringerc

Thanks! Sounds like the primary use case is to implement client side filtering (similar to https://github.com/prometheus/client_golang/issues/135 discussion) as well as client side timeout provided via query parameter.

Valid use cases, but I wonder, do you see a way to implement those in explicit way (param support or filtering or passing more options). We think it will be more useful to others, plus we avoid controversial passing everything through context values. 🤔

Let's discuss more!

WDYT?

bwplotka avatar Jul 08 '24 14:07 bwplotka

Another use case:

TLDR: We store our logger instance in the context. So current interface makes it impossible to log anything for us from the Gatherer. We would need to create an instance per request.

We use zap logger which we modify in auth middleware to inject user and request info. Later in the chain we get the logger from the context. In addition to the logger we may want to access the current authenticated user from the context.

zaynetro avatar Jul 16 '24 15:07 zaynetro

Direct support for parameters would be a big help, to be sure.

But there are still plenty of other cases where passing a context is valuable. Context passing is valuable for things like deadline or timeout contexts and tons of middleware uses. It's fundamental to how Go decouples orthogonal concerns, and should be supported irrespective of whether a more convenient interface to query parameters is supplied.

Context passing is needed for things like:

  • Tracing stack flags for trace context, trace filtering
  • Log context propagation
  • Various middleware tools

The promhttp docs themselves warn that timeout handling is broken. This is due to the lack of context-passing support.

ringerc avatar Jul 28 '24 22:07 ringerc

Cool, thanks.

I think we could try proposing GathererWithContextand check if the implementation has it in various places.. and see how it goes. Then perhaps deprecate old one even - however I would try to limit the initial scope to minimum, starting from promhttp.

bwplotka avatar Jul 29 '24 16:07 bwplotka

This would be very useful for us. I see there hasn't been any activity on this since July, is this work still being planned?

epsteina16 avatar Nov 07 '24 13:11 epsteina16

I ran out of time to work on it, but hope to return to it.

The approach I propose here is quite intrusive, and requires a lot of duplicate API, since golang doesn't support optional methods on interfaces, default implementations for interface methods, or any other way to extend an interface that retains compatibility with existing implementers. I had hoped to figure out something better, but so far haven't been able to.

When my current priority project is done, I'm hoping to come back to this, but would welcome input from anyone else with similar needs in the meantime.

ringerc avatar Nov 07 '24 21:11 ringerc

Yea, agree context would be nice. The selection of metrics or labels to use is something that we ideally have natively, but it's unclear what exactly selection mechanisms would be better in the ecosystem (matchers on any labels? on just metric name? Only filtering names?). For this reason starting with context first would be one way.

Maybe it's time for some prometheusx or promx or promv2 package that will provide context enabled interfaces 🤔

bwplotka avatar Nov 07 '24 22:11 bwplotka

Given how invasive the changes needed are, does it make sense to pass around a ScrapeContext structure (needs a better name) that contains the context.Context instead of passing around context.Context directy, so that other things can be added more easily in the future? I don't have anything specific in mind ATM, but I think blackbox_exporter looks at query strings, for example.

mharbison72 avatar Nov 08 '24 05:11 mharbison72

A promhttp/v2 would make sense for this, yes.

ringerc avatar Jul 17 '25 05:07 ringerc

I ran into this today while working on a custom exporter. Would love to be able to have a clean OpenTelemetry trace that includes the Collect() performed by the exporter

chadxz avatar Oct 30 '25 19:10 chadxz