req icon indicating copy to clipboard operation
req copied to clipboard

Telemetry events

Open whatyouhide opened this issue 10 months ago • 6 comments

This issue is a starting point to discuss what the API for Telemetry events (emitted by Req) will look like.

An initial sketch would be to have default events like this, resembling a telemetry:span/3.

  • [:req, :request, :pipeline, :start]
  • [:req, :request, :pipeline, :stop]
  • [:req, :request, :pipeline, :exception]

These would measure the entire request pipeline.

I would love if we could also customize the prefix, something like this:

Req.new(
  # ...,
  telemetry: [
    prefix: [:my_app, :stripe_client]
  ]
)

Resulting in events such as [:my_app, :stripe_client, :pipeline, ...].

Steps

At some point we might want to instrument steps, such as JSON decoding/encoding, but I’m not sure this is critical to get this out the door. Adding Telemetry events is not a breaking change, so this is not a blocker for 1.0.

cc @wojtekmach

whatyouhide avatar Apr 03 '24 10:04 whatyouhide

One important consideration: retries. We should think of how to instrument those, because they should not be instrumented like the whole request pipeline. Otherwise, you might end up with requests that look like they took seconds because they were retried a few times. Just putting this out there.

whatyouhide avatar Apr 04 '24 07:04 whatyouhide

I think you need at least two telemetry IMO,

  • [:req, request, *] - where such telemetry is done by middleware or by ensuring it runs at the very end of the pipeline. It would be the same as https://github.com/elixir-tesla/tesla/blob/master/lib/tesla/middleware/telemetry.ex

  • [:req, :pipeline, *] - it is about the middleware pipeline without including the last step, the request telemetry step that I mentioned before (pending in Tesla, but thought on doing this)

yordis avatar Jun 20 '24 06:06 yordis

May I add: we definitely need to have the url template with path_params as part of the metadata, so e.g. we can get useful metrics time series. I heavily use this in Tesla (with PathParams and KeepRequest middleware).

aloukissas avatar Aug 12 '24 13:08 aloukissas

Agreed, thank you for mentioning it here.

wojtekmach avatar Aug 12 '24 14:08 wojtekmach

One important consideration: retries. We should think of how to instrument those, because they should not be instrumented like the whole request pipeline. Otherwise, you might end up with requests that look like they took seconds because they were retried a few times. Just putting this out there.

One good example I've seen is in the stripe hex lib, where the retry count is one of the metadata fields. Works pretty ok.

aloukissas avatar Aug 12 '24 14:08 aloukissas

Yeah I think emitting events and having the retry information in the event itself would be enough for other libs to build stuff on top of them.

whatyouhide avatar Aug 13 '24 06:08 whatyouhide