go-grpc-prometheus
go-grpc-prometheus copied to clipboard
Use stats handler instead of interceptor and add message size histograms
Hi guys,
Working on #85
I add stats_handlers on server-side and client side, there is some question need discuss with you:
- In the stats.Handler, I can't distinguish between unary and stream, so how do our metrics want labels to be named?
-
Is it to reuse existing metrics, default
grpc_typetounaryor create some new metrics withoutgrpc_typelabel. -
Or use some dynamic large methods, provide
grpc_typewhen users use interceptors, and nogrpc_typewhen users usestats.handler
- default bucket, current I set the default bucket like
[]float64{42, 2097, 52429, 1048576, 2097152, 3355443}, according to the ratio of the message size to the default maximum value, the specific ratio is0.00001, 0.0005, 0.0125, 0.025, 0.5, 0.8, do this make sense?
Strange, when the test is performed hard on 1.11 versions, some results are not as expected, I need to test again.
--- FAIL: TestClientStatsHandlerSuite/TestStreamingIncrementsMetrics (0.01s)
server_test.go:316: expected 1 grpc_client_handling_seconds value; got 3;
server_test.go:316: expected 2 grpc_client_handling_seconds value; got 4;
Codecov Report
Merging #88 into master will increase coverage by
4.22%. The diff coverage is84.39%.
@@ Coverage Diff @@
## master #88 +/- ##
==========================================
+ Coverage 72.45% 76.67% +4.22%
==========================================
Files 11 13 +2
Lines 363 566 +203
==========================================
+ Hits 263 434 +171
- Misses 89 115 +26
- Partials 11 17 +6
| Impacted Files | Coverage Δ | |
|---|---|---|
| client_metrics.go | 65.55% <72.72%> (+2.05%) |
:arrow_up: |
| server_metrics.go | 78.66% <76.00%> (-1.54%) |
:arrow_down: |
| client_stats_handler.go | 85.71% <85.71%> (ø) |
|
| server_stats_handler.go | 85.71% <85.71%> (ø) |
|
| client.go | 70.00% <100.00%> (+12.85%) |
:arrow_up: |
| client_reporter.go | 87.75% <100.00%> (+8.44%) |
:arrow_up: |
| server.go | 100.00% <100.00%> (ø) |
|
| server_reporter.go | 100.00% <100.00%> (ø) |
|
| util.go | 76.47% <100.00%> (+7.23%) |
:arrow_up: |
| ... and 1 more |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact),ø = not affected,? = missing dataPowered by Codecov. Last update 8a6536e...36752fb. Read the comment docs.
@brancz @bwplotka I try to move all metrics to stats.Handler, please help me review this.
Any chance this PR could be looked at?
Hi 👋🏽
Sorry for the massive, lag. We consolidating projects and moving to single repo where we have more control and awareness. We are moving this code base longer to https://github.com/grpc-ecosystem/go-grpc-middleware/tree/v2 and we moved existing state of Prometheus middleware to https://github.com/grpc-ecosystem/go-grpc-middleware/tree/v2/providers/openmetrics
This means that before we release v2 this is the best opportunity to shape new https://github.com/grpc-ecosystem/go-grpc-middleware/tree/v2/providers/openmetrics with whatever we want to change in API 🤗 If you want to change something effectively long term, I would suggest switching your PR and rebasing it on https://github.com/grpc-ecosystem/go-grpc-middleware/tree/v2/providers/openmetrics instead (notice v2 branch, not master!).
Sorry for the confusion, but it's needed for the project sustainability. Cheers!
Can we avoid name "stats", it's not explicit, I am afraid. It sounds like we should have just size in name that's it, no? 🤗
Hi @bwplotka and thanks for restarting the conversation here!
This means that before we release
v2this is the best opportunity to shape new https://github.com/grpc-ecosystem/go-grpc-middleware/tree/v2/providers/openmetrics with whatever we want to change in API If you want to change something effectively long term, I would suggest switching your PR and rebasing it on https://github.com/grpc-ecosystem/go-grpc-middleware/tree/v2/providers/openmetrics instead (noticev2branch, not master!).
I see that https://github.com/grpc-ecosystem/go-grpc-middleware/tree/v2/providers/openmetrics already has additional metrics compared to this branch, but does this also mean it is intended as a replacement for this package? If so, can we expect that to be a drop-in replacement for prometheus metrics as used here?
Did this PR get moved to the new repo? I would love to have an easy way to monitor request size, sometimes the server goes under heavy load without the count of requests changing much and I hypothesize it may be because of bad actors submitting overly large payloads.