gateway
gateway copied to clipboard
HTTPRoute Status Not Updated After Adding Backend
I create a gatewayclass, gateway, and httproute. EG surfaces an error:
2022-09-23T16:51:34.608Z ERROR controller/controller.go:326 Reconciler error {"runner": "provider", "controller": "httproute", "object": {"name":"httpbin","namespace":"default"}, "namespace": "default", "name": "httpbin", "reconcileID": "1d30e643-0896-4171-9323-7fc6aa75e865", "error": "failed to get service default/httpbin"}
I then create the backend service but the route's status is never updated:
...
status:
parents:
- conditions:
- lastTransitionTime: "2022-09-23T16:51:30Z"
message: Route is accepted
observedGeneration: 1
reason: Accepted
status: "True"
type: Accepted
- lastTransitionTime: "2022-09-23T16:51:30Z"
message: Service default/httpbin not found
observedGeneration: 1
reason: BackendNotFound
status: "False"
type: ResolvedRefs
controllerName: gateway.envoyproxy.io/gatewayclass-controller
parentRef:
group: gateway.networking.k8s.io
kind: Gateway
name: eg
cc: @arkodg @skriss
I'm seeing this issue even when the backend is created before the httproute.
I was able to work around the issue by deleting the httproute, gateway, and gatewayclass. After recreating the gatewayclass, gateway, and httproute (after creating the backend) the httproute surfaced the proper status conditions:
status:
parents:
- conditions:
- lastTransitionTime: "2022-09-23T17:10:28Z"
message: Route is accepted
observedGeneration: 1
reason: Accepted
status: "True"
type: Accepted
controllerName: gateway.envoyproxy.io/gatewayclass-controller
parentRef:
group: gateway.networking.k8s.io
kind: Gateway
name: eg
sounds like an issue in the provider reconciler, maybe another reason to implement https://github.com/envoyproxy/gateway/issues/413 sooner than later
tried this out and can repro the issue, it is tied to the logic behind how status is computed in the Gateway API translator. Currently the translator only merges/appends new status conditions https://github.com/envoyproxy/gateway/blob/765e14ea3bcc949f4a98775429d33dbb792042f1/internal/gatewayapi/contexts.go#L267, but does not reset older ones previously created (in this case https://github.com/envoyproxy/gateway/blob/765e14ea3bcc949f4a98775429d33dbb792042f1/internal/gatewayapi/translator.go#L581)
@skriss can we reset the Status.Parents field (which might cause some status churn due to the change in LastTransitionTime) or do we need to parse through the status field and clear any previously set status condition ?
The other option would be to actually set the ResolvedRefs condition to true (rather than omitting it) when all refs can be resolved. This would be inline with upcoming Gateway API 0.6.0 changes (https://github.com/youngnick/gateway-api/blob/3b01d971e9a60de30e30a59b7aa2d5bf12e8018e/site-src/geps/gep-1364.md#proposed-changes-summary), where we'll be required to do that. I think that might be the simplest change, but would have to look in some more detail to verify.
@skriss what about the other status conditions (within the Gateway and HTTPRoute) set by the translator in the first iteration but have changed in the second iteration (due to change in resources) ?
Yeah you may have similar issues for those. I'd say clearing the existing conditions before translate is probably a good first step, but yeah ultimately we probably need more intelligent condition merge logic to handle all the cases of adding, updating, removing conditions properly.
Yeah you may have similar issues for those. I'd say clearing the existing conditions before translate is probably a good first step, but yeah ultimately we probably need more intelligent condition merge logic to handle all the cases of adding, updating, removing conditions properly.
yeah I think if we reset the previous conditions in the translator and merge the conditions (which will ensure an existing condition remains unchanged ) in the status subscriber we can compute the correct conditions w/o performing excessive writes against the API Server
The other option would be to actually set the ResolvedRefs condition to true (rather than omitting it) when all refs can be resolved. This would be inline with upcoming Gateway API 0.6.0 changes (https://github.com/youngnick/gateway-api/blob/3b01d971e9a60de30e30a59b7aa2d5bf12e8018e/site-src/geps/gep-1364.md#proposed-changes-summary), where we'll be required to do that. I think that might be the simplest change, but would have to look in some more detail to verify.
#507 does just this. I'm concerned about making major changes at this point in the release, so I prefer #507 gets merged and we iterate to improve status. @arkodg I am unclear what the bigger issue is that you describe in https://github.com/envoyproxy/gateway/issues/415#issuecomment-1266127024 and what is meant by "clearing" or "reset" status conditions. Typically, status conditions are set and then updated based on the current state and not cleared or reset. It would be helpful if you could provide a reproducer for the bigger issue you describe.
@arkodg thanks for taking the time to help me understand your comment. I now see that the gatewayapi translator needs to handle updating the other condition types as well. I think it's best to have the gatewayapi translator build the conditions list from scratch every time, send the conditions to the provider in a StatusUpdate and have the provider only update the object if the list differs (excluding lastTransitionTime).