gateway-api icon indicating copy to clipboard operation
gateway-api copied to clipboard

Adds API to GEP-91: Client Certificate Validation

Open arkodg opened this issue 1 year ago • 21 comments

What type of PR is this?

/kind gep

What this PR does / why we need it: What Defines an API in GEP-91 to support Client Certificate Validation for TLS Terminating at the Gateway using a CA Certificate as a trusted anchor.

Why Allows the platform admin configuring the Gateway resource to only select trusted clients to connect to the Gateway

Relates to https://github.com/kubernetes-sigs/gateway-api/issues/91 & https://github.com/kubernetes-sigs/gateway-api/issues/2110

arkodg avatar Aug 07 '23 21:08 arkodg

highlighting the key decisions that need to be made here

  • naming -
    • clientValidation to hold client specific information to validate
    • caCertificateRefs - to hold the CA certs
  • Secret vs ConfigMap to hold the CA Cert
  • key name for the CA Cert, ca.crt is what this GEP proposes
  • allowing these resources to exist in a separate namespace or not - GEP proposal allows this using ReferenceGrant similar to what is done for the certificateRefs field

arkodg avatar Aug 09 '23 22:08 arkodg

Secret vs ConfigMap to hold the CA Cert

IMHO, the CA cert only contains the public key part so a ConfigMap is safe enough to store it. Sometimes, people store the CA cert in the Secret because they also store the private key in a different entry of the same object.

spacewander avatar Aug 10 '23 03:08 spacewander

Secret vs ConfigMap to hold the CA Cert

IMHO, the CA cert only contains the public key part so a ConfigMap is safe enough to store it. Sometimes, people store the CA cert in the Secret because they also store the private key in a different entry of the same object.

I think putting the CA cert in a separate object would be a better practice (so only the CA cert will be referenced). If I have to choose one from Secret and ConfigMap, I will vote for ConfigMap.

spacewander avatar Aug 10 '23 03:08 spacewander

It's conceivable that we may want to add the ability to use a CA cert from an issues TLS Secret (using the not-quite-standard ca.crt key in a kubernetes.io/tls type Secret), but that's not a blocker, since we can always add that later.

Additionally, this change should move this GEP to Implementable, I think, and then once the types are present in the Experimental channel, this GEP will move to Experimental.

youngnick avatar Aug 17 '23 04:08 youngnick

I think this also Fixes #2110

frankbu avatar Aug 22 '23 18:08 frankbu

@robscott @youngnick @shaneutt any edits I can make to this PR to help it move forward ? I did notice that the Backend TLS PR has merged which defined the cert refs as a ConfigMapObjectReference . This PR took the approach of using ObjectReference to give implementations the flexiblity to define the ca.cert in a ConfigMap or as a Secret (within the same TLS Cert used to define server certs). I can go ahead and make the change to this PR to keep the approach identical across the apis.

arkodg avatar Sep 18 '23 01:09 arkodg

I did notice that the https://github.com/kubernetes-sigs/gateway-api/pull/2113 has merged which defined the cert refs as a ConfigMapObjectReference

I think the question for both this and #2113 is do we have a preferred default kind for certs, and if so, should it be ConfigMap or Secret? If Secret, maybe both GEPs should be using SecretObjectReference?

frankbu avatar Sep 25 '23 13:09 frankbu

I did notice that the #2113 has merged which defined the cert refs as a ConfigMapObjectReference

I think the question for both this and #2113 is do we have a preferred default kind for certs, and if so, should it be ConfigMap or Secret? If Secret, maybe both GEPs should be using SecretObjectReference?

For reference, we temporarily chose ConfigMapObjectReference because the usage didn't require tls.key and tls.crt. https://github.com/kubernetes-sigs/gateway-api/pull/2113#discussion_r1231584950 mentioned the use of ClusterTrustBundle, which seemed popular.

candita avatar Sep 26 '23 21:09 candita

Hi, does anyone have a sense whether this PR is close to approval and merging? We're trying to establish guidelines and a road map for our applications that use Gateway ingress with downstream client certificates. We began using Contour as ingress controller for multiple applications, along with Gateway API, but a subset of those apps need client certificates from the end user. We're currently able to support this with Contour, but that subset of apps has to deviate from the others in that they can't use Gateway API's HTTPRoute, and instead have to use Contour-specific APIs (HTTPProxy) to trigger client certificate requests. It would be great to have an idea whether this is a good short-term strategy and whether it's likely that we could consolidate our apps around HTTPRoute in the future. Thanks!

jdoylei avatar Dec 11 '23 14:12 jdoylei

this PR was blocked until Gateway API went GA, now that it has, will address comments and push another commit this week

arkodg avatar Dec 18 '23 23:12 arkodg

Once #2689 merges, this will need a rebase and update - the GEP files have moved.

youngnick avatar Dec 22 '23 02:12 youngnick

@arkodg @candita @youngnick Can we move this GEP forward soon?

frankbu avatar Feb 01 '24 16:02 frankbu

another point for reviewers of this PR, the field is called CACertificateRefs Pros:

  • it looks and feels similar to CertificateRefs within the Gateway TLS spec Cons
  • It looks different from the CA Cert field in the BackendTLSPolicy which is called CACertRefs

please let me know if this needs to be changed

arkodg avatar Feb 02 '24 00:02 arkodg

updated the PR based on review comments ptal @youngnick @frankbu @candita

arkodg avatar Feb 02 '24 01:02 arkodg

another point for reviewers of this PR, the field is called CACertificateRefs

I prefer this longer name. Why isn't BackendTLSPolicy using the longer name also, to be consistent with the Gateway TLS spec?

cc @candita

frankbu avatar Feb 02 '24 15:02 frankbu

seeing this failure locally

Verifying ./hack/../hack/verify-docker-build.sh
FAILED

as well as in the CI

Verifying docker images
ERROR: no builder "gateway-api-builder" found
gateway-api-builder
Building and pushing echo-advanced image (from Istio) ...linux/amd64,linux/arm64
#0 building with "gateway-api-builder" instance using docker-container driver
 #1 [internal] booting buildkit
181 skipped lines...
 #18 [linux/amd64 stage-1 2/3] COPY --from=builder /go/src/sigs.k8s.io/gateway-api/conformance/echo-basic/echo-basic /
#18 DONE 0.0s
 #17 [linux/arm64 builder 4/4] RUN go build -trimpath -ldflags="-buildid= -s -w" -o echo-basic .
{"component":"entrypoint","file":"k8s.io/test-infra/prow/entrypoint/run.go:169","func":"k8s.io/test-infra/prow/entrypoint.Options.ExecuteProcess","level":"error","msg":"Process did not finish before 2h0m0s timeout","severity":"error","time":"2024-02-06T18:37:12Z"}
++ early_exit_handler
++ '[' -n 183 ']'
++ kill -TERM 183
++ cleanup_dind
++ [[ true == \t\r\u\e ]]
++ echo 'Cleaning up after docker'
Cleaning up after docker
make: *** [Makefile:118: verify] Terminated
++ docker ps -aq
++ xargs -r docker rm -f
WARNING: No output specified with docker-container driver. Build result will only remain in the build cache. To push result image into registry use --push or to load image into docker use --load
ERROR: failed to receive status: rpc error: code = Unavailable desc = error reading from server: EOF
4dbeda0d9a12

feels unrelated to this change

arkodg avatar Feb 06 '24 19:02 arkodg

To summarize where this one is at, I think we have two outstanding things:

  • the discussion about per-Listener or per-Gateway
  • the naming of the field and its struct, is "clientValidation" the right one? If it is, we probably need to update our terminology in talking about backend TLS to use something that's not "client". If not, we need a new word to use other than "client". My past suggestion was "user agent" (by analogy to the browser convention). What we choose is not as important as choosing something, and agreeing on the new terminology.

youngnick avatar Feb 26 '24 03:02 youngnick

the naming of the field and its struct, is "clientValidation" the right one?

IMO, this is the right name, it's even consistent with the diagram in this section of the backend TLS doc: https://gateway-api.sigs.k8s.io/geps/gep-1897/#overview-what-do-we-want-to-do. I do agree that the backend doc could use some terminology improvements to be clear that there are 2 clients in the request flow: 1) the client of the gateway (the request client), and 2) the client of the backend (which happens to be the gateway iteself). Maybe just say gateway when referring to the backend client?

frankbu avatar Feb 26 '24 15:02 frankbu

the discussion about per-Listener or per-Gateway

big +1 for per Listener, since this Gateway implementation may be fronting multiple domains and sub domain on different listeners, requiring authentication & verification to be more fine grained

arkodg avatar Feb 26 '24 21:02 arkodg

We don't have an authz API (yet) - but each vendor is likely to have one. Not having Authz doesn't mean we should sneak it in the Authn API :-)

"Verify identity" for both certificates and JWTs means to verify the cert is valid, signed by an allowed entity - and the fields are in the right place and syntax. Checking if a particular user is allowed is not part of verifications that Authentication is supposed to perform - that's authorization.

One of the reasons we don't have an authz API is because authz is a much harder problem and doesn't have a common solution - in particular for ingress use cases where we may deal with 1M or 1B users that authenticate. There are a few common ones - LDAP/ActiveDirectory, database integrations are common in web servers - and IAM from different vendors.

Checking 'htpasswd' or hardcoded user IDs in the http server config was common in the '90s - but it's probably not something we should turn into a standard in 2024 :-)

On Mon, Apr 1, 2024 at 9:44 AM John Howard @.***> wrote:

@.**** commented on this pull request.

In geps/gep-91/index.md https://github.com/kubernetes-sigs/gateway-api/pull/2273#discussion_r1546578084 :

+| Contour | HTTPProxy.Spec.VirtualHost.Tls.ClientValidation.CASecret | +| Emissary Ingress| TlSContext.Spec.Secret | +| Gloo Edge | VirtualService.Spec.SSLConfig.SecretRef | +| Istio | Gateway.Spec.Servers.TLS.Mode | +| Kong | mTLS Plugin | +| Traefik | TLSOption.Spec.ClientAuth |

+### API + +* Introduce a Client field of type TLSClientContext within [GatewayTLSConfig][] to hold TLS configuration specific to the client. +* Introduce a Validation field of type TLSValidationContext within the Client field that can be used to validate the peer with which the TLS connection is being made. +to the Gateway. +* Introduce a caCertificateRefs field within ValidationContext that can be used to specify a list of CA Certificates that +can be used as a trust anchor to validate the certificates presented by the client. +* This new field is mutually exclusive with the [BackendTLSPolicy][] configuation which is used to validate the TLS certificate presented by the peer on the connection between the Gateway and the backend, and this GEP is adding support for validating the TLS certificate presented by the peer on the connection between the Gateway and the downstream client. +* Introduce an optional subjectAltNames field within ClientValidationContext that can be used to specify one or more alternate names to verify the subject identity in the certificate presented by the client. The maximum number of alternate names that can be specified is implementation defined.

I generally agree this probably makes sense to split out as a more general authz. However, there is no authz in gateway-api so its a bit tricky..

Generally any TLS usage ought to actually verify the identity in the certificates; rarely is it sufficient to just validate the issuing CA. So we would end up in a state where to properly use this feature, each vendor would need to have its own policy and users would need to use it, or be left in an insecure state.

— Reply to this email directly, view it on GitHub https://github.com/kubernetes-sigs/gateway-api/pull/2273#discussion_r1546578084, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAUR2RW6RYG5NU7ANEUM63Y3GFITAVCNFSM6AAAAAA3HR6D6OVHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMYTSNZRGY4DOOJZGQ . You are receiving this because you were mentioned.Message ID: @.***>

costinm avatar Apr 01 '24 21:04 costinm

sorry for the force push 🙈

arkodg avatar Apr 05 '24 05:04 arkodg

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: arkodg, frankbu, robscott

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot avatar Apr 06 '24 00:04 k8s-ci-robot

I don't think release notes are needed for GEP changes, can save them until the change that actually adds the API spec.

/release-note-none

robscott avatar Apr 06 '24 00:04 robscott

/hold cancel

robscott avatar Apr 08 '24 20:04 robscott