Connection pool parameters at Endpoint level
Problem/Solution
Connection pooling options such as the ones in HttpProtocolOptions or Http2ProtocolOptions can currently only be enforced globally or at the cluster level. Being able to set some of those options for a particular endpoint is desirable when tunnelling is involved in multi-cluster scenarios.
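For reference, a minimal sketch of how these options are applied cluster-wide today (the cluster name and values are illustrative):

clusters:
- name: example_cluster   # illustrative name
  typed_extension_protocol_options:
    envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
      "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
      common_http_protocol_options:
        idle_timeout: 1s                   # applies to every endpoint in the cluster
        max_requests_per_connection: 100   # likewise cluster-wide
      explicit_http_config:
        http2_protocol_options:
          max_concurrent_streams: 100      # likewise cluster-wide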
In Istio we have a use case for an Envoy cluster in which some endpoints would benefit from different connection pooling properties. Those endpoints represent gateways, and there may be multiple servers behind each gateway, so a single such endpoint effectively stands for multiple servers. The gateways cannot demultiplex requests within the same HTTP/2 connection and load-balance at the request level. Because of that, we want to control how many requests get multiplexed together on a connection; on the other hand, we would like to avoid creating a separate endpoint for each server behind the gateway, as that is wasteful compared to simply giving the gateway an appropriate weight.
Proposal
Add a new field to config.endpoint.v3.LbEndpoint, let's call it "connectionPoolingOptions" for now. It would allow setting the following Http(2)ProtocolOptions at the endpoint level: max_concurrent_streams, max_requests_per_connection and idle_timeout. They would keep their current syntax and semantics, but would be enforced only for the endpoint where they are set. If these options are not present, the current behaviour applies. Note that the effective max_concurrent_streams would still be the minimum of what is set for the endpoint and what is negotiated with the server (through the SETTINGS frame).
Also, we can add a boolean option propagate_negotiated_stream_limits. With the default set to "true", setting it to "false" would prevent stream limits negotiated through HTTP/2 SETTINGS frames from being applied to the whole cluster, scoping the limits to the respective endpoint only.
Example:
...
- lb_endpoints:
  - endpoint:
      address:
        socket_address:
          address: 192.168.0.1
          port_value: 9084
  - endpoint:
      address:
        socket_address:
          address: 192.168.0.2
          port_value: 9084
    connection_pooling_options:
      max_concurrent_streams: ...
      max_requests_per_connection: ...
      idle_timeout: ...
      propagate_negotiated_stream_limits: ...
cc @wbpcode @markdroth as this seems a general high-level question about Endpoints
And just to clarify, my team and I are happy to work on this if it moves forward.
I think this is fine. But why do you call these "connection pooling" options? It looks like you just want per-endpoint HTTP options?
@yanavlasov I think mostly because max_concurrent_streams and max_requests_per_connection are handled in the common base ActiveClient. But it does look like a more HTTP-related name would make more sense for people looking at the API.
@yanavlasov @wbpcode @markdroth
I'm considering a change in the proposed API. I think it's cleaner if we create our own typed HTTP protocol options. Something like this:
typed_extension_protocol_options:
  envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
    "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
    upstream_http_protocol_options:
      auto_sni: true
    common_http_protocol_options:
      idle_timeout: 1s
    explicit_http_config:
      http2_protocol_options:
        max_concurrent_streams: 100
  envoy.extensions.upstreams.http.v3.EpSpecificHttpProtocolOptions:
    "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.EpSpecificHttpProtocolOptions
    - http2_max_concurrent_streams: 10
      http_max_requests_per_connection: 100
      ep_metadata_match: ewgw
    - ...
In the example above, envoy.extensions.upstreams.http.v3.HttpProtocolOptions is the current way we configure cluster-wide HTTP protocol options. If the user needs endpoint-specific configuration, they can add envoy.extensions.upstreams.http.v3.EpSpecificHttpProtocolOptions with multiple sets of specific options, each carrying a string that is matched against endpoint metadata.
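Purely as an illustration of how the ep_metadata_match string could line up with endpoint metadata (the metadata namespace and key below are assumptions, not part of the proposal), an endpoint matching the "ewgw" entry above might carry metadata like this:

- endpoint:
    address:
      socket_address:
        address: 192.168.0.2
        port_value: 9084
  metadata:
    filter_metadata:
      envoy.lb:                     # namespace chosen for illustration only
        ep_protocol_options: ewgw   # hypothetical key; value lines up with ep_metadata_match above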
The reason for this is that it makes the implementation simpler. We reuse the more modern approach with typed extensions, which makes the configuration available through the ClusterInfo (without any changes to its interface) and accessible down in the connection pools.
I'll follow up with a draft PR.
PR: #42565