add CFP CEP-33767-make-custom-envoy-implementation-work-with-gateway-api

Open Havnevej opened this issue 1 year ago • 2 comments

Add CFP

Havnevej avatar Jul 12 '24 13:07 Havnevej

cc: @cilium/sig-servicemesh

xmulligan avatar Sep 13 '24 11:09 xmulligan

This CFP proposes allowing users to specify a CEC that would somehow feed into Gateway (as in Gateway API) creation, so that the filter chains in the user-specified CEC would be used instead of the currently autogenerated ones.

I have two high level comments to make at this time:

  • We (the service mesh team) have strived not to go down the path that Ingress implementations took, with (custom) annotations modifying the Ingress behavior; indeed, I believe avoiding this has been one of the design goals of the Gateway API itself. As such, it would be better to see a design proposal upstream at the Gateway API for an extension that adds the needed optional extra functionality.

  • Cilium Envoy Config is not a stable API, and so far this has been intentional, giving us the freedom to make breaking changes as needed for better implementation of Ingress, Gateway API, etc.

jrajahalme avatar Oct 09 '24 15:10 jrajahalme

Thanks for that @jrajahalme.

I'm sorry to take so long to come back to this one, I've been struggling to find the best way to explain this.

I understand that this change seems like a small one that will unlock further feature changes, but I'm reluctant to proceed because it means that we will need to support users using arbitrary CECs forever. As @jrajahalme says, we've been working towards pulling CEC further back behind the curtain and using Gateway API and GAMMA to replace the current functionality, for a few reasons:

  • CEC processing is very user-unfriendly, with no parsing of the Envoy config and so no information about any errors or mistakes. You have to construct the config, then check the Envoy logs to see if it has been ACKed successfully. Adding that parsing and error reporting would also be a lot of work, and until then, troubleshooting errors with CEC is a very bad user experience.
  • The openness of CEC processing also makes it hard for both users and code maintainers to tell what functionality is supported and what is not. CEC objects are not the only Envoy config sent to the proxy - they are wrapped with other, Cilium-specific config that could conflict. Whose job is it to fix that if that happens? It's not the user's fault - they have no way other than a detailed reading of the Cilium code to understand what's happening, but equally, as maintainers we can't troubleshoot every possible combination of Envoy resources.
  • Lastly, other Envoy-based projects (like Istio) also regret enabling raw Envoy access - see https://blog.howardjohn.info/posts/opinionated-istio/ for some more info, but there, the author compares using raw Envoy config to providing a "fast-moving project a git diff that is patched dynamically and recompiled; EnvoyFilter is only slightly more stable than that". CEC processing is very similar.

I really don't like saying no, but all of these reasons are why I'm very reluctant to accept this CFP. I really want us to spend engineering effort instead on the more sustainable solution (which in the main use case here would be getting Gateway API and GAMMA to support JWT auth config somehow).

Again, I really appreciate the time and effort required to write this CFP, but I don't think that the value delivered for users of Cilium as a whole is enough to offset the significant future cost in supportability that adding this feature would entail.

youngnick avatar Oct 22 '24 01:10 youngnick