Handle Backend Pod Upgrades

Open arkodg opened this issue 1 year ago • 6 comments

If two backend pods are undergoing a rolling restart, add outlier detection and retry settings in Envoy Proxy to ensure that no traffic intended for the backend is dropped.

Aug 02 '23 05:08 arkodg

Since we use the ClusterIP as the backend for Envoy Gateway, isn't it the k8s Service's responsibility to do this? What am I missing here?

Aug 02 '23 10:08 tanujd11

@tanujd11 you're right. At this point this is moot, since we only have the ClusterIP endpoint. But once EndpointSlice support (https://github.com/envoyproxy/gateway/pull/1494) lands, we'll have to handle the case where the control plane / EG is not fast enough to propagate the current Ready endpoints to Envoy Proxy. We'll need some mechanism in Envoy to deal with this eventual consistency, such as outlier detection and trying another endpoint in the xDS cluster.
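
To make the idea concrete, here is a toy Python model of the mechanism described above: a proxy holding a stale endpoint list retries a failed request on a different endpoint and ejects the endpoint that failed. It is only a sketch of the concept, not Envoy's implementation or configuration. All names (`Endpoint`, `Cluster`, `ejection_threshold`, `max_retries`) are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Endpoint:
    address: str
    alive: bool = True             # whether the pod behind this address still serves traffic
    consecutive_failures: int = 0
    ejected: bool = False          # set once the endpoint is treated as an outlier


@dataclass
class Cluster:
    endpoints: list
    ejection_threshold: int = 1    # hypothetical: eject after this many consecutive failures
    max_retries: int = 2           # hypothetical: extra attempts after the first one
    _next: int = field(default=0, repr=False)

    def _pick(self, tried):
        """Round-robin over endpoints that are neither ejected nor already tried."""
        n = len(self.endpoints)
        for offset in range(n):
            ep = self.endpoints[(self._next + offset) % n]
            if not ep.ejected and ep.address not in tried:
                self._next = (self._next + offset + 1) % n
                return ep
        return None

    def send(self):
        """Return the address that served the request, or None if every attempt failed."""
        tried = set()
        for _ in range(1 + self.max_retries):
            ep = self._pick(tried)
            if ep is None:
                return None
            tried.add(ep.address)
            if ep.alive:
                ep.consecutive_failures = 0
                return ep.address
            # Passive detection: count the failure and eject once the threshold is hit.
            ep.consecutive_failures += 1
            if ep.consecutive_failures >= self.ejection_threshold:
                ep.ejected = True
        return None


# The pod behind 10.0.0.1 has terminated, but the endpoint list is still stale.
cluster = Cluster([Endpoint("10.0.0.1"), Endpoint("10.0.0.2"), Endpoint("10.0.0.3")])
cluster.endpoints[0].alive = False
results = [cluster.send() for _ in range(4)]
```

In this model the first request fails on the stale endpoint, is retried on the next one, and later requests skip the ejected endpoint entirely, so no request is dropped. A real setup would also need ejected endpoints to be readmitted after some time, which this sketch leaves out.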

Aug 02 '23 16:08 arkodg

Hi @arkodg, I can pick this up. Since we have EndpointSlice support enabled, what about adding an outlierDetection API to the backendTrafficPolicy API?

Nov 02 '23 15:11 tanujd11

Awesome, thanks @tanujd11. I would break this up into 3 parts:

1. e2e to make sure pod rolling restart is hitless
2. outlier detection API (let's create an issue if one doesn't exist)
3. whether to enable it or not (by default)
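
The shape of a "hitless" check can be sketched as follows: keep sending requests in the background while some disruptive action runs, then count the failures. This is a minimal Python stand-in, not the project's e2e framework. The helper name `count_failures_during` and the `action` and `send_request` callables are hypothetical. In a real test, `action` would trigger the rolling restart and `send_request` would issue a request through the gateway and return whether it succeeded.

```python
import threading
import time


def count_failures_during(action, send_request, interval_s=0.001):
    """Call send_request() repeatedly while action() runs; return (total, failed)."""
    stop = threading.Event()
    stats = {"total": 0, "failed": 0}

    def load_loop():
        while not stop.is_set():
            stats["total"] += 1
            try:
                ok = send_request()
            except Exception:
                ok = False  # an exception counts as a dropped request
            if not ok:
                stats["failed"] += 1
            time.sleep(interval_s)

    worker = threading.Thread(target=load_loop)
    worker.start()
    try:
        action()  # e.g. the rolling restart; blocks until it completes
    finally:
        stop.set()
        worker.join()
    return stats["total"], stats["failed"]


# Stand-ins so the sketch runs on its own: a short sleep instead of a rollout,
# and a request function that always succeeds.
total, failed = count_failures_during(lambda: time.sleep(0.05), lambda: True)
```

A hitless restart would correspond to `failed == 0` with `total > 0`.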

Nov 02 '23 16:11 arkodg

This issue has been automatically marked as stale because it has not had activity in the last 30 days.

Dec 02 '23 20:12 github-actions[bot]

This issue has been automatically marked as stale because it has not had activity in the last 30 days.

Jan 06 '24 20:01 github-actions[bot]
