opentelemetry-collector
opentelemetry-collector copied to clipboard
Change default otlp exporter GRPC load balancer to round robin
Description
Updates the default pick_first load balancer to round_robin, which allows for better resource allocation and less chances of throttling data being sent to addresses.
Link to tracking issue
Fixes #10298 (has full context of this PR)
Testing
Edited tests to allow round_robin load balancing.
@jpkrohling can you explain the reasoning behind that for this specific scenario? Per Tani's issue: pick_first is never the right choice:
Also, pick_first does no actual load balancing (link), and instead just tries each address from the name resolver and connects to the one that works.
gRPC can't change their default because of legacy clients they need to support / the edge cases are much larger. Given that the collector serves to send data to an arbitrary backend, and most destinations being sent to would benefit from more evenly balanced load, I'm not sure why we would stick with a choice that balances load poorly. I would argue that today we are not setting a reasonable default for users.
The base for the reasoning is that we try to keep our config with the value that would cause the least surprise to our users, and that's typically the default we have from the underlying library. Another example that I prefer a different default is for the Min TLS version, which is (IIRC) 1.2 from Go, but we could have it set to 1.3 by default.
That said, I'm open to changing the default value for this, deviating from the underlying library, if we have OKs from maintainers (cc @open-telemetry/collector-maintainers ).
As much as I agree with having round_robin as the default, we try to stick with the default values from the underlying libraries.
I would be ok with changing the default and documenting the reasoning for it in the components affected by this change. I think it's desirable for end users that the collector provides defaults that make their experience better. As per the research done in https://github.com/open-telemetry/opentelemetry-collector/issues/10298, I think this qualifies.
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 92.39%. Comparing base (
3996c58) to head (a8ed955). Report is 8 commits behind head on main.
Additional details and impacted files
@@ Coverage Diff @@
## main #10319 +/- ##
==========================================
- Coverage 92.55% 92.39% -0.16%
==========================================
Files 387 387
Lines 18284 18313 +29
==========================================
- Hits 16922 16921 -1
- Misses 1017 1046 +29
- Partials 345 346 +1
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.