gateway
gateway copied to clipboard
Return 500 when xPolicy translation fails
Description:
Describe the issue.
The current behaviors when xPolicies translation fails:
- SecurityPolicy: return 500, this change just introduced in PR https://github.com/envoyproxy/gateway/pull/3869 .
- BackendTrafficPolicy: Use a best-effort approach. The targeting xRoutes will still be translated and sent to the Envoy fleet.
- EnvoyExtensionPolicy: Also use a best-effort approach, with the targeting xRoutes being translated and sent to the Envoy fleet.
Options that we have:
For SecurityPolicy, it's reasonable to default to fail close as fail open poses a security risk. A configuration knob can be added to each SecurityPolicy, allowing users to customize this behavior if needed.
For BackendTrafficPolicy and EnvoyExtensionPolicy, there are two strategies:
- default to fail close: try the best to serve client traffic since service can still be functional to the user ends when these policies fail
- default to fail open: this aligns with the SecurityPolicy approach, maintaining consistency across policy behaviors.
We may also need to decide the default behavior for the ClientTrafficPolicy and EnvoyPatchPolicy failure, should we fail the targeted Gateways/Listeners?
-
[x] #4061
-
[x] #4062
-
[ ] #4087
-
[ ] #4153
[optional Relevant Links:]
Any extra documentation required to understand the issue.
- https://github.com/envoyproxy/gateway/issues/2837
- https://github.com/envoyproxy/gateway/pull/3869
im a +1 to return 500 if a CTP, BTP or EETP policy tied to a route cannot be translated . This would match the behavior of filters https://gateway-api.sigs.k8s.io/reference/spec/#gateway.networking.k8s.io%2fv1.HTTPRouteFilter
If a reference to a custom filter type cannot be resolved, the filter MUST NOT be skipped. Instead, requests that would have been processed by that filter MUST receive a HTTP error response.
+1 on this
+1
+1
Should we also consider BackendTLSPolicy here? I suppose that if a backend requires TLS and we fail to translate BTLSP, the upstream connection would just fail.
Maybe if/when gateway-level backend TLS configuration is supported, we will see a behavior of fallback to wrong gateway-level defaults that reduce security (e.g. use a system trust store instead of a specific trusted CA).
While not directly related, I think Extension Server should have a similar "fail close" option.
I'm using the Extension Server to automatically add a default Authz filter to all Listeners. If the Extension Server fails in some way (isn't available, fails during the processing), then the current behavior means that the Listener will be active without the filter (and without the security properties). With an option to "block" the Listener if the Extension Server fails, than the security boundary provided would be maintained.
While not directly related, I think Extension Server should have a similar "fail close" option.
I'm using the Extension Server to automatically add a default Authz filter to all Listeners. If the Extension Server fails in some way (isn't available, fails during the processing), then the current behavior means that the Listener will be active without the filter (and without the security properties). With an option to "block" the Listener if the Extension Server fails, than the security boundary provided would be maintained.
Hi @logan-hcg , There is a failOpen knob in the EnvoyExtensionPolicySpec, does it address this concern? https://gateway.envoyproxy.io/docs/api/extension_types/#envoyextensionpolicyspec
Hi @logan-hcg , There is a failOpen knob in the EnvoyExtensionPolicySpec, does it address this concern? https://gateway.envoyproxy.io/docs/api/extension_types/#envoyextensionpolicyspec
hi @alexwo , unfortunately that control knob is for Extension Polices, not Extension Manager / Extension Server (seems to be referred to interchangeably)
While not directly related, I think Extension Server should have a similar "fail close" option.
I'm using the Extension Server to automatically add a default Authz filter to all Listeners. If the Extension Server fails in some way (isn't available, fails during the processing), then the current behavior means that the Listener will be active without the filter (and without the security properties). With an option to "block" the Listener if the Extension Server fails, than the security boundary provided would be maintained.
@logan-hcg could you open a separate issue for the Extension Server
This issue has been automatically marked as stale because it has not had activity in the last 30 days.
This issue has been automatically marked as stale because it has not had activity in the last 30 days.
This issue has been automatically marked as stale because it has not had activity in the last 30 days.