opentelemetry-collector icon indicating copy to clipboard operation
opentelemetry-collector copied to clipboard

Change default otlp exporter GRPC load balancer to round robin

Open taniyourstruly opened this issue 1 year ago • 4 comments
trafficstars

Description

Updates the default pick_first load balancer to round_robin, which allows for better resource allocation and less chances of throttling data being sent to addresses.

Link to tracking issue

Fixes #10298 (has full context of this PR)

Testing

Edited tests to allow round_robin load balancing.

taniyourstruly avatar Jun 04 '24 21:06 taniyourstruly

@jpkrohling can you explain the reasoning behind that for this specific scenario? Per Tani's issue: pick_first is never the right choice:

Also, pick_first does no actual load balancing (link), and instead just tries each address from the name resolver and connects to the one that works.

gRPC can't change their default because of legacy clients they need to support / the edge cases are much larger. Given that the collector serves to send data to an arbitrary backend, and most destinations being sent to would benefit from more evenly balanced load, I'm not sure why we would stick with a choice that balances load poorly. I would argue that today we are not setting a reasonable default for users.

jaronoff97 avatar Jun 07 '24 20:06 jaronoff97

The base for the reasoning is that we try to keep our config with the value that would cause the least surprise to our users, and that's typically the default we have from the underlying library. Another example that I prefer a different default is for the Min TLS version, which is (IIRC) 1.2 from Go, but we could have it set to 1.3 by default.

That said, I'm open to changing the default value for this, deviating from the underlying library, if we have OKs from maintainers (cc @open-telemetry/collector-maintainers ).

jpkrohling avatar Jun 10 '24 10:06 jpkrohling

As much as I agree with having round_robin as the default, we try to stick with the default values from the underlying libraries.

I would be ok with changing the default and documenting the reasoning for it in the components affected by this change. I think it's desirable for end users that the collector provides defaults that make their experience better. As per the research done in https://github.com/open-telemetry/opentelemetry-collector/issues/10298, I think this qualifies.

codeboten avatar Jun 12 '24 16:06 codeboten

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 92.39%. Comparing base (3996c58) to head (a8ed955). Report is 8 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #10319      +/-   ##
==========================================
- Coverage   92.55%   92.39%   -0.16%     
==========================================
  Files         387      387              
  Lines       18284    18313      +29     
==========================================
- Hits        16922    16921       -1     
- Misses       1017     1046      +29     
- Partials      345      346       +1     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Jun 12 '24 16:06 codecov[bot]