linkerd2-proxy-api
outbound: add HTTP retry policy configuration
This branch adds fields for configuring HTTP retries to the OutboundPolicies API. In particular, it adds RetryBudget messages to the ProxyProtocol.Http1, ProxyProtocol.Http2, and ProxyProtocol.Grpc messages, and it adds a new RetryPolicy message to the HttpRoute.Rule message.
The same RetryBudget message used by ServiceProfiles is reused here, while the RetryPolicy message is added specifically for the OutboundPolicies proto. It consists of a list of retryable status ranges and a maximum number of retries allowed.
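For illustration, here is a minimal Rust sketch of how a proxy might consult a policy of this shape. The struct and its field and method names (`retryable_statuses`, `max_retries`, `is_retryable`, `should_retry`) are hypothetical and are not the generated bindings for the proto in this branch; it only models "a list of status ranges plus a retry limit."

```rust
use std::ops::RangeInclusive;

/// Hypothetical in-memory mirror of the RetryPolicy message described above.
/// Field names are assumptions, not the generated proto bindings.
#[derive(Clone, Debug)]
pub struct RetryPolicy {
    /// Inclusive ranges of HTTP status codes that may be retried.
    pub retryable_statuses: Vec<RangeInclusive<u16>>,
    /// Maximum number of retries allowed for a single request.
    pub max_retries: u32,
}

impl RetryPolicy {
    /// Returns true if `status` falls within any configured range.
    pub fn is_retryable(&self, status: u16) -> bool {
        self.retryable_statuses.iter().any(|r| r.contains(&status))
    }

    /// Returns true if a response with `status` should be retried, given
    /// that `retries_so_far` retries have already been issued.
    pub fn should_retry(&self, status: u16, retries_so_far: u32) -> bool {
        retries_so_far < self.max_retries && self.is_retryable(status)
    }
}
```

With ranges of `502..=504` and `max_retries: 2`, a 503 would be retried until two retries have been spent, and a 500 would never be retried.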
The RetryPolicy field is added to the HttpRoute.Rule message, allowing retry policies to be configured at the level of HTTPRoute rules. Note that no RetryPolicy field is added to the RouteBackend message. This is because the RetryPolicy contains a per-request retry limit, and it's important to ensure that the per-request retry limit is determined at the time of the initial request. Since we can, and probably will, retry a request by sending it to a different backend from the one that failed the initial request, backend selection cannot determine the value of the per-request retry limit. Instead, it must be determined when the request is first matched to a route. This implies that the policy controller will probably have to reject HTTPRoutes where RetryFilters are present in a backend's list of filters rather than in the rule's list of filters.
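The reasoning above can be sketched roughly as follows. This is a simplified model, not the proxy's actual implementation: the `Rule`, `Backend`, and `dispatch` names are hypothetical, round-robin stands in for real backend selection, and treating every 5xx as retryable is an assumption made only to keep the example short. The point is that the retry limit is read once from the rule when the request is matched, while the backend can change on each attempt.

```rust
/// Hypothetical backend; a real one would carry an address, weight, etc.
#[derive(Clone, Debug)]
pub struct Backend {
    pub name: &'static str,
}

/// Hypothetical route rule carrying the per-request retry limit.
pub struct Rule {
    pub max_retries: u32,
    pub backends: Vec<Backend>,
}

/// Sends a request matched to `rule`, retrying on 5xx responses (an assumed
/// retryable range). Returns the final status and the backends tried, in order.
pub fn dispatch<F>(rule: &Rule, mut send: F) -> (u16, Vec<&'static str>)
where
    F: FnMut(&Backend) -> u16,
{
    assert!(!rule.backends.is_empty(), "rule must have a backend");

    // The limit is fixed here, at route-match time, before any backend
    // has been selected.
    let limit = rule.max_retries;
    let mut tried = Vec::new();
    let mut attempt: u32 = 0;

    loop {
        // Round-robin selection: each attempt may land on a different backend.
        let backend = &rule.backends[attempt as usize % rule.backends.len()];
        tried.push(backend.name);

        let status = send(backend);
        let retryable = (500..=599).contains(&status);
        if !retryable || attempt >= limit {
            return (status, tried);
        }
        attempt += 1;
    }
}
```

If the limit lived on the backend instead, the second attempt in this loop could observe a different limit than the first, which is the ambiguity the rule-level field avoids.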
It occurred to me that we could also add a provision for configuring backoffs associated with retries using the existing ExponentialBackoff messages, although that wasn't discussed in the RFC proposing OutboundPolicies retry configuration. We could always add backoff configs in a follow-up as well. However, if we want the control plane, rather than the proxy, to be responsible for providing the defaults, it might be worth including the backoff now?