
otelgrpc: add custom attributes to the stats handler

Open inigohu opened this issue 1 year ago • 9 comments

Fixes https://github.com/open-telemetry/opentelemetry-go-contrib/issues/3894

inigohu avatar Feb 19 '24 15:02 inigohu

Friendly ping @dashpole @hanyuancheung

inigohu avatar Jun 03 '24 11:06 inigohu

This looks like it does a lot more than just add a labeler. Would you mind doing general refactoring in its own PR?

dashpole avatar Jun 03 '24 20:06 dashpole

This looks like it does a lot more than just add a labeler. Would you mind doing general refactoring in its own PR?

Sure. I have simplified this PR by adding only the labeler.

inigohu avatar Jun 04 '24 09:06 inigohu

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 65.6%. Comparing base (57e55dc) to head (aba5258). Report is 1 commit behind head on main.

Additional details and impacted files

Impacted file tree graph

@@          Coverage Diff          @@
##            main   #5133   +/-   ##
=====================================
  Coverage   65.5%   65.6%           
=====================================
  Files        203     203           
  Lines      12936   12946   +10     
=====================================
+ Hits        8484    8494   +10     
  Misses      4198    4198           
  Partials     254     254           
Files Coverage Δ
...entation/google.golang.org/grpc/otelgrpc/config.go 90.0% <100.0%> (+1.0%) :arrow_up:
...n/google.golang.org/grpc/otelgrpc/stats_handler.go 100.0% <100.0%> (ø)

codecov[bot] avatar Jun 05 '24 14:06 codecov[bot]

@dashpole PTAL

inigohu avatar Jun 17 '24 08:06 inigohu

I thought we were going to use a context-based pattern (similar to the otelhttp.Labeler) to support this? Am I misremembering?

Yes, sorry, I forgot to comment on this last change. My first idea was to follow the otelhttp.Labeler strategy, using the context to pass the labels. But while adding the tests I realised this approach doesn't make much sense here: unlike http.Handler and grpc.UnaryInterceptors, you can't chain different grpc.StatsHandlers to add labels and pass them through the context.

That's why I changed the approach. The behaviour differs from otelhttp.Labeler: that one does request/context-based tagging, while this one does server-based tagging.

WDYT @dashpole ?

inigohu avatar Jun 18 '24 10:06 inigohu

A few thoughts/ideas:

  • If we use constant labels as you propose here, what is the use-case? Are constant labels actually useful?
  • Should the Labeler take a stats.RPCTagInfo or stats.ConnTagInfo as an argument?
  • If we still used the Labeler approach, could users set the labeler in the context using an interceptor? (Not sure if interceptors run before the stats handler?)

dashpole avatar Jun 18 '24 13:06 dashpole

If we use constant labels as you propose here, what is the use-case? Are constant labels actually useful?

Static attributes could be useful for tagging different servers running on the same service but on different ports, e.g. public, internal, backoffice, legacy...

Should the Labeler take a stats.RPCTagInfo or stats.ConnTagInfo as an argument?

These attributes are already being tagged in the spans/metrics.

If we still used the Labeler approach, could users set the labeler in the context using an interceptor? (Not sure if interceptors run before the stats handler?)

The HandleRPC method of the StatsHandler is executed several times, but to summarise, the order is:

HandleRPC(begin) -> Interceptors(pre) -> handler logic -> Interceptors(post) -> HandleRPC(end)

The problem is that even if an interceptor modifies the context, that modified context does not propagate back outwards, so it never reaches the final HandleRPC call.

inigohu avatar Jun 19 '24 10:06 inigohu

Hi, I've ended up here after trying to figure out how to attach custom (static) labels to otelgrpc stats. Just wanted to add our concrete use case if it's helpful to demonstrate the need.

We've got a central cluster that talks to the same service on multiple remote clusters, and I'd like to be able to label and differentiate remote clusters so I can examine their latency metrics separately.

From what I've been able to determine poking around the codebases, attaching a different client handler, each with its own set of static labels/attributes, to each of our outbound gRPC client connections would let us distinguish between the connections after everything rolls up into the client latency metric.

The only other way I can see to achieve that is to have different MeterProviders with different labels, but then I believe they would conflict on the metric names if I were to try to publish the metrics through the same exporter (the otel prometheus exporter in this case).

jklipp avatar Jun 28 '24 20:06 jklipp

I think this change is a good idea. @hanyuancheung any thoughts?

dashpole avatar Jul 09 '24 20:07 dashpole

@dmathieu @dashpole Sorry to interrupt, but when will this change be merged and released?

zchee avatar Aug 22 '24 03:08 zchee