gateway icon indicating copy to clipboard operation
gateway copied to clipboard

Status Updater Fails to Update Status

Open danehans opened this issue 2 years ago • 3 comments

I see the following error in the EG logs when creating a gateway:

2022-09-23T09:35:03.806-0700	ERROR	status/status.go:81	unable to update status	{"runner": "provider", "name": "eg", "namespace": "default", "error": "Operation cannot be fulfilled on gateways.gateway.networking.k8s.io \"eg\": the object has been modified; please apply your changes to the latest version and try again"}

Note that a gatewayclass was already created.

@arkodg do you think https://github.com/envoyproxy/gateway/pull/395 may be updating gateway status before making a copy of it?

danehans avatar Sep 23 '22 16:09 danehans

I see the same issue with httproutes:

2022-09-23T17:01:21.232Z	ERROR	status/status.go:81	unable to update status	{"runner": "provider", "name": "httpbin", "namespace": "default", "error": "Operation cannot be fulfilled on httproutes.gateway.networking.k8s.io \"httpbin\": the object has been modified; please apply your changes to the latest version and try again"}

danehans avatar Sep 23 '22 17:09 danehans

I think the status updater library needs to be updated to

  • perform a GET before a PUT
  • also made synchronous (currently asynchronous, runs in a go routine)

arkodg avatar Sep 23 '22 18:09 arkodg

@arkodg do you think #395 may be updating gateway status before making a copy of it?

@danehans it passes a copy already https://github.com/envoyproxy/gateway/blob/25acee55e46f3cd022157037697b08844d96a55d/internal/provider/kubernetes/gateway.go#L351

arkodg avatar Sep 23 '22 18:09 arkodg

shouldn't this https://github.com/envoyproxy/gateway/blob/f094c5bc02c8c00b1ca668586210b2c202e96b18/internal/status/status.go#L62 lead to retries where it GETs the object and tries applying status over it again? If the updates for the same object are being sent asynchronously, I think it would retry and eventually update the correct status after a few retries?

chauhanshubham avatar Sep 25 '22 10:09 chauhanshubham

I think status updating can be async, as long as:

  • it gets passed an operation to do on the status (I think EG does this already)
  • it GETs the object from the controller-runtime cache, then applies the status operation, then checks if things have changed, and only applies if it has.
  • There's only one status updater running

That is, the status updater can handle the queue of updates asynchronously, but each update attempt needs to get the latest version first, then perform any checks about the status, then update if things have changed. Contour's Gateway processing is a good example of the pattern: https://github.com/projectcontour/contour/blob/e8125fa21d0b35faa39606f577c0767ca863cb91/internal/controller/gateway.go#L272

Note that it calls the setGatewayNotScheduled function during execution of the MutatorFunc, which means that the update operation will always be done on version when the MutatorFunc runs, rather that the version that's used when the MutatorFunc was created.

youngnick avatar Sep 26 '22 05:09 youngnick

@youngnick currently the producers (such as gateway-api translator) update the entire object including status and pass it over to the status subscriber, so we would have to update logic in all the producers to collect the list of updated status conditions and pass those over to the status subscriber so that the conditions are updated in the mutator func and not before. I think that is something that is needed long term (because runners such as the xds-server do not have any reference to the gateway object), but it was simpler to convert the status updater into a lib to make the status updates synchronous within the status subscriber go routine so went ahead with that instead for the 0.2.0 release https://github.com/envoyproxy/gateway/pull/429

arkodg avatar Sep 27 '22 17:09 arkodg

please ignore the previous comment, @youngnick took your advice and merged the gateway conditions within the mutator func https://github.com/envoyproxy/gateway/pull/435 thanks !

arkodg avatar Sep 27 '22 18:09 arkodg