Implement number of HTTP frames metric

Open nerodono opened this issue 1 year ago • 6 comments

Implements metric described in https://github.com/linkerd/linkerd2/issues/12319

nerodono avatar Jun 28 '24 13:06 nerodono

@nerodono, it seems a test could be added to linkerd/app/integration/src/tests/telemetry.rs

loyd avatar Jul 01 '24 07:07 loyd

@olix0r, could you look at this PR?

loyd avatar Jul 08 '24 07:07 loyd

Thanks for submitting this. There is work in flight (planned to merge soon) that will add route metrics. My preference would be to add this functionality here so that we can include parent/route/backend labels on these metrics. I'll update this PR when that is available.

olix0r avatar Jul 18 '24 17:07 olix0r

There is work in flight (planned to merge soon) that will add route metrics

Is there any news regarding this? I really don't want to live on a fork =(

loyd avatar Aug 30 '24 14:08 loyd

The first version of it has merged, but it's not yet in a place where the frame metrics are really set up. Coincidentally, we had some reports of issues related to applications sending many very small frames, so I'm planning to introduce a per-route data_size histogram so we can understand both the count of data frames transiting the proxy and a rough distribution of their sizes. This work was slightly delayed by some travel, etc. (so please excuse the silence), but it is actively being worked on as of this week. I'll update here when there is something to show.

olix0r avatar Aug 30 '24 16:08 olix0r

per-route data_size histogram

It sounds very useful, actually! Which GH issue is related to this work?

loyd avatar Aug 31 '24 05:08 loyd

@olix0r, Hi there! Any updates? I'm eagerly looking forward to the chance to start using the metrics!

jazvit avatar Nov 05 '24 04:11 jazvit

Hi @cratelyn and @olix0r,

I noticed this pull request has been open for a few months without any updates. Given that active development is ongoing in the repository, could you please take a look and provide feedback or consider merging it? It seems like it could benefit the project.

Thank you for your attention!

jazvit avatar Nov 18 '24 15:11 jazvit

hi there @jazvit!

the metrics that @olix0r mentioned should be landing shortly, and will be included in the next edge release. #3308 has landed and introduces a family of histograms labeled by backend to track response body frames. this can be used to measure the number of frames and the total number of bytes yielded, as well as a coarse distribution of the size of response body frames.

#3334 should be up for review shortly, which introduces an equivalent route-level histogram for request bodies.

thank you for your patience while we've gotten these implemented. :slightly_smiling_face:

cratelyn avatar Nov 21 '24 15:11 cratelyn
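
To illustrate how a single frame-size histogram can answer all three questions mentioned above (number of frames, total bytes, and a coarse size distribution), here is a minimal sketch in plain Rust. It is not the proxy's implementation: the `FrameSizeHistogram` type, its method names, and the bucket boundaries are all hypothetical and chosen only for illustration. The buckets here are kept non-cumulative for readability, whereas Prometheus exposes cumulative bucket counts.

```rust
/// Upper bounds, in bytes, of the histogram buckets.
/// Hypothetical values for illustration; not the proxy's actual buckets.
const BUCKET_BOUNDS: [u64; 4] = [128, 1024, 16 * 1024, 64 * 1024];

/// A toy histogram of body frame sizes (hypothetical type).
#[derive(Debug, Default)]
struct FrameSizeHistogram {
    /// `buckets[i]` counts frames whose size is <= `BUCKET_BOUNDS[i]` and
    /// greater than the previous bound; the last slot is the overflow bucket.
    buckets: [u64; BUCKET_BOUNDS.len() + 1],
    /// Number of frames observed.
    count: u64,
    /// Total number of bytes observed across all frames.
    sum: u64,
}

impl FrameSizeHistogram {
    /// Records one frame of `frame_len` bytes.
    fn observe(&mut self, frame_len: usize) {
        let len = frame_len as u64;
        // Find the first bucket whose upper bound fits this frame,
        // falling back to the overflow bucket.
        let idx = BUCKET_BOUNDS
            .iter()
            .position(|&bound| len <= bound)
            .unwrap_or(BUCKET_BOUNDS.len());
        self.buckets[idx] += 1;
        self.count += 1;
        self.sum += len;
    }
}

fn main() {
    let mut hist = FrameSizeHistogram::default();
    for len in [10, 100, 2_000, 100_000] {
        hist.observe(len);
    }
    println!("frames: {}", hist.count);
    println!("bytes: {}", hist.sum);
    println!("distribution: {:?}", hist.buckets);
}
```

A workload that sends many very small frames would show up here as a high `count` relative to `sum`, with most observations landing in the smallest bucket.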

This feature is implemented in recent edge releases (edge-24.11.6+)! Thanks for your patience.

olix0r avatar Nov 27 '24 15:11 olix0r

@olix0r thanks a lot!

I see that the "prometheus-client-rust-242" feature is not enabled by default: https://github.com/linkerd/linkerd2-proxy/blob/1653b08068b9af5b371d1a74deb49f9953e8b8eb/linkerd/app/outbound/Cargo.toml#L18 Do I need to build a custom linkerd2-proxy to use the new metrics?

loyd avatar Dec 07 '24 08:12 loyd

@loyd no, you do not need to set that feature flag to use the new metrics.

this feature flag gates test assertions until upstream proposals in prometheus/client_rust#242 are released. you can see some examples of that here, for example:

https://github.com/linkerd/linkerd2-proxy/blob/1653b08068b9af5b371d1a74deb49f9953e8b8eb/linkerd/app/outbound/src/http/logical/policy/route/backend/metrics/tests.rs#L165-L170

this feature flag does not have any impact on the proxy's behavior, it's strictly related to test code.

cratelyn avatar Dec 09 '24 14:12 cratelyn
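
As a rough illustration of how a Cargo feature can gate checks without changing runtime behavior, here is a small self-contained sketch. The feature name `hypothetical-extra-assertions` and both functions are invented for this example and are not taken from linkerd2-proxy; the real gated code lives in the test file linked above.

```rust
/// Runtime behavior: always compiled, regardless of any feature flag.
fn total_bytes(frames: &[&[u8]]) -> usize {
    frames.iter().map(|frame| frame.len()).sum()
}

/// Extra check, compiled only when the hypothetical feature is enabled.
#[cfg(feature = "hypothetical-extra-assertions")]
fn extra_assertions(frames: &[&[u8]]) {
    assert!(
        frames.iter().all(|frame| !frame.is_empty()),
        "unexpected empty frame"
    );
}

/// Without the feature, the check compiles down to a no-op.
#[cfg(not(feature = "hypothetical-extra-assertions"))]
fn extra_assertions(_frames: &[&[u8]]) {}

fn main() {
    let frames: [&[u8]; 2] = [b"hello", b"hi"];
    extra_assertions(&frames);
    // The computed total is the same whether or not the feature is enabled.
    println!("total bytes: {}", total_bytes(&frames));
}
```

In this sketch, `total_bytes` returns the same result with or without the feature; only the additional check is conditionally compiled.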