go-grpc-prometheus
go-grpc-prometheus copied to clipboard
Make EnableClient* methods concurrent safe
In one of my apps, I'm hitting a data race error in EnableClientHandlingTimeHistogram
. It looks like the problem is that the EnableClient*
methods are not concurrent safe.
The problem is triggered when two grpc clients each call grpc_prometheus.EnableClientHandlingTimeHistogram()
. They end up sharing the same DefaultClientMetrics
which results in two calls to DefaultClientMetrics.EnableClientHandlingTimeHistogram()
. That causes the race detector to throw an error about that method not being concurrent safe.
Rather than adding a generic lock, I used sync.Once
since it looks like these are only intended to be called once.
Codecov Report
Merging #90 into master will decrease coverage by
0.32%
. The diff coverage is25.92%
.
@@ Coverage Diff @@
## master #90 +/- ##
==========================================
- Coverage 72.45% 72.13% -0.33%
==========================================
Files 11 11
Lines 363 366 +3
==========================================
+ Hits 263 264 +1
- Misses 89 91 +2
Partials 11 11
Impacted Files | Coverage Δ | |
---|---|---|
client_metrics.go | 62.85% <25.92%> (-0.65%) |
:arrow_down: |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 9abf3eb...7961b86. Read the comment docs.
nudge @brancz any chance you could look at this? Should be a relatively low-risk change.
Hi @jeffwidman,
Thanks for this, but I think it would be awesome if we can do this on v2 we plan... and the plan big as we design v2 of prometheus middleware to go inside go-grpc-middleware v2 So... I might think your work will be a bit wasted if we would merge this ):
But!
If you want, you can help us to move prometheus middleware to https://github.com/grpc-ecosystem/go-grpc-middleware/tree/v2
We need a brave volunteer! (: I just created an issue for this: https://github.com/grpc-ecosystem/go-grpc-prometheus/issues/91
Hmm... The new v2 plan looks interesting, and I'll need to read more.
However, this is a bugfix, and a non-breaking one.
So why not merge it to v1? #91 mentions continuing to accept bugfixes to the v1 branch for a period of time during the migration period.
In fact, merging to v1 now would make sure it is included in both v1 and v2 stuff, so the fix wouldn't need to later be backported.
Again, I think migrating this to be part of all the go-grpc-middleware
might be interesting (I need to read/think on it more), all I'm saying I don't see how that affects whether to merge this given that this will apply to both v1 and v2.
nudge @bwplotka what do you think about my comment ☝️...
In fact, merging to v1 now would make sure it is included in both v1 and v2 stuff, so the fix wouldn't need to later be backported.
In fact, merging to v1 now would make sure it is included in both v1 and v2 stuff, so the fix wouldn't need to later be backported.
Not really, it won't be included in v2. We already split branches.
And we have totally separate implementation so it won't matter much. And yes, if it's a bug we can merge.. But is it?
BTW I am almost sure you might be using this library wrongly. .EnableClientHandlingTimeHistogram()
is literally config option. You should use it once in your application. Why you use it concurrently?
BTW I am almost sure you might be using this library wrongly.
.EnableClientHandlingTimeHistogram()
is literally config option. You should use it once in your application. Why you use it concurrently?
It happens when running unit tests if two unit tests both spin up the server (which is a common thing). If you don't run go test -p 1 ./...
then go
runs the unit tests within separate packages concurrently causing the conflict.
Hi 👋🏽
Sorry for the massive, lag. We consolidating projects and moving to single repo where we have more control and awareness. We are moving this code base longer to https://github.com/grpc-ecosystem/go-grpc-middleware/tree/v2 and we moved existing state of Prometheus middleware to https://github.com/grpc-ecosystem/go-grpc-middleware/tree/v2/providers/openmetrics
This means that before we release v2
this is the best opportunity to shape new https://github.com/grpc-ecosystem/go-grpc-middleware/tree/v2/providers/openmetrics with whatever we want to change in API 🤗 If you want to change something effectively long term, I would suggest switching your PR and rebasing it on https://github.com/grpc-ecosystem/go-grpc-middleware/tree/v2/providers/openmetrics instead (notice v2
branch, not master!).
Sorry for the confusion, but it's needed for the project sustainability. Cheers!
Regarding the change I think I understand it, cool. Please check v2, because we changed the code to never use globals, so things should be concurrency friendly