Route based on generic metadata matching

Open yangminzhu opened this issue 4 years ago • 11 comments

What would you like to be added:

Support routing based on a generic metadata matching mechanism.

There are many use cases for routing based on various protocol- or application-specific attributes. For example, an API gateway may want to route requests based on validated JWT claims: a request with the JWT claim env: staging is routed to the staging backend, and all other requests are routed to the default prod backends.

Currently HTTPRouteMatch (and other routing rules) has the extensionRef field, which lets an implementation extend the matching rule by referring to an external resource. However, this is not very user friendly and increases maintenance complexity, because it requires the user to split the routing rules into separate resources.

It also does not define the format of the extension. Each implementation could have a completely different configuration even for the same high-level user-facing feature (e.g. routing based on JWT claims), making it hard to migrate between and adopt different implementations.

Why this is needed:

Currently each implementation has its own way of expressing metadata-based routing rules. Some (gloo-edge, Kong-plugin) convert the JWT claims to HTTP headers and then route based on those headers; others (Nginx, HAProxy) have their own special syntax for comparing JWT claims directly and then triggering the route.

The JWT claim is just one example of metadata that could be used for routing; it could be other attributes as well, for example the size of a UDP packet or the TLS version.
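
The header-conversion approach mentioned above can be sketched roughly as follows. This is only an illustration in Python, not how gloo-edge or the Kong plugin actually works: the x-jwt-claim- prefix, the function names and the backend names are all hypothetical, and the claims are assumed to have been validated already.

```python
from typing import Dict, Mapping

# Hypothetical reserved prefix for headers derived from validated JWT claims.
CLAIM_HEADER_PREFIX = "x-jwt-claim-"


def inject_claim_headers(
    request_headers: Mapping[str, str], claims: Mapping[str, object]
) -> Dict[str, str]:
    """Return lower-cased request headers plus one header per scalar claim."""
    # Drop any client-supplied header that uses the reserved prefix, so a
    # caller cannot spoof a claim by sending the header directly.
    headers = {
        name.lower(): value
        for name, value in request_headers.items()
        if not name.lower().startswith(CLAIM_HEADER_PREFIX)
    }
    for name, value in claims.items():
        # Only top-level scalar claims are copied; nested claims are skipped.
        if isinstance(value, (str, int, float, bool)):
            headers[CLAIM_HEADER_PREFIX + name.lower()] = str(value)
    return headers


def pick_backend(headers: Mapping[str, str]) -> str:
    """Route on the derived header: env=staging goes to staging, the rest to prod."""
    if headers.get(CLAIM_HEADER_PREFIX + "env") == "staging":
        return "staging-backend"
    return "prod-backend"
```

Note that the routing rule itself only ever sees headers, which is why this style of configuration ends up being specific to each implementation's header naming.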

A unified generic metadata matching API from gateway-api will make it much easier for users to adopt these features among different implementations.

JWT claim routing:

The JWT-claim based routing is a common feature that has been supported in many other similar products:

  • (Solo.io) gloo-edge: "JWT Claim Based Routing" https://docs.solo.io/gloo-edge/latest/guides/security/auth/jwt/claim_routing/
  • Nginx Plus: "Authentication and Content-Based Routing with JWTs and NGINX Plus" https://www.nginx.com/blog/authentication-content-based-routing-jwts-nginx-plus/
  • Kong plugin: "JWT to Header (Route by JWT Claim)" https://docs.konghq.com/hub/yesinteractive/kong-jwt2header/
  • HAProxy: "Using HAProxy as an API Gateway" https://www.haproxy.com/blog/using-haproxy-as-an-api-gateway-part-2-authentication/
  • Envoy: https://github.com/envoyproxy/envoy/issues/17621

And it has also been asked by many customers:

  • Envoy: Route request based on JWT claims (https://github.com/envoyproxy/envoy/issues/3763)
  • Enable routing decisions based on attributes in a JWT token (https://github.com/istio/istio/issues/8444)
  • Route an Istio Virtual Service based off the user claim in a JWT (https://stackoverflow.com/questions/67730883/route-an-istio-virtual-service-based-off-the-user-claim-in-a-jwt)
  • etc.

Design Ideas:

There are potentially two ideas for supporting this:

  1. Introduce special-meaning keys that can be used in the headers field in HTTPRouteMatch:

      rules:
      - matches:
        - headers:
          - type: Exact
            name: $request.auth.claims[env]
            value: staging
    

    The headers field uses a special key $request.auth.claims (prefixed with $) to indicate that it is a special metadata match and not a real header. Subsequently, a list of well-known metadata keys will be defined (e.g. $request.auth.claims for JWT claims) for common use cases. Each implementation will decide whether it wants to support a given metadata key; the user configuration will always remain the same.

    The above is just an example; there could be other choices for the syntax and format of the special key. One tricky thing is how this would be supported in other protocols (TCP, TLS, etc.), but HTTP alone should already satisfy 90% of use cases.

  2. Introduce a new first-class dedicated field, metadata, for metadata matching:

      rules:
      - matches:
        - metadata:
          - type: Exact
            name: request.auth.claims[env]
            value: staging
    

    The metadata field is essentially a key-value pair matching structure, the same as headers. We will still have a list of predefined keys for common and well-known use cases like JWT claim routing (request.auth.claims[env]), but they do not need the $ prefix because they are already in a separate dedicated field (not reusing headers).

    We could add this field to the match rules of other protocols as well (e.g. TCP, TLS).

Note both options choose to embed the metadata matching rules with the gateway-api so that the user can specify everything in the same place.
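
To make the difference between the two options concrete, here is a rough Python sketch of how an implementation might parse the key in each case. The key grammar (an optional $, a dotted namespace, one bracketed entry name) is only an assumption taken from the examples above, since no syntax has been decided, and MetadataKey and both function names are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Assumed key grammar, e.g. "$request.auth.claims[env]": an optional "$",
# a dotted namespace, then one bracketed entry name.
_KEY_RE = re.compile(
    r"^(?P<dollar>\$?)(?P<namespace>[A-Za-z_][A-Za-z0-9_.]*)\[(?P<entry>[^\[\]]+)\]$"
)


class MetadataKey(NamedTuple):
    namespace: str  # e.g. "request.auth.claims"
    entry: str      # e.g. "env"


def parse_header_name(name: str) -> Optional[MetadataKey]:
    """Option 1: only "$"-prefixed names are metadata; anything else is a real header."""
    m = _KEY_RE.match(name)
    if m is None or not m.group("dollar"):
        return None
    return MetadataKey(m.group("namespace"), m.group("entry"))


def parse_metadata_name(name: str) -> MetadataKey:
    """Option 2: the dedicated field takes the same key without the "$" prefix."""
    m = _KEY_RE.match(name)
    if m is None or m.group("dollar"):
        raise ValueError(f"invalid metadata key: {name!r}")
    return MetadataKey(m.group("namespace"), m.group("entry"))
```

With option 1 every header match has to go through parse_header_name to decide which kind of match it is, while with option 2 the field itself already says so. One detail worth keeping in mind: "$" on its own is a legal character in an HTTP header name, but "[" and "]" are not, so a key written in this form cannot collide with a real header name.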

I'm looking forward to the community feedback. Please let me know what you think about this feature in general, your thoughts on and preferences between the 2 design ideas, and any other feedback you want to share. Thanks.

yangminzhu avatar Oct 28 '21 20:10 yangminzhu

Thanks for raising this issue @yangminzhu! It's very helpful to see how other systems are supporting this kind of capability. I think I'd personally prefer to avoid overloading header matching here with additional concepts.

This particular use case does highlight a limitation of the API as it exists today - a lack of an extension mechanism for Route matching. In v1alpha1 we had a generic extensionRef that could be used to extend Route matching, but we did not have any clear use cases or documentation for it, so we removed it before the v1alpha2 release.

I'm not sure what the best solution is here, but I think there are 2 different directions we can go here:

  1. Explore a built in JWT matcher + built in matches for other attributes as needed
  2. Explore a match extension mechanism that could support JWT matching and more

I think the best direction likely depends on the level of support + interest there is for these kinds of advanced matching capabilities. Interested in some feedback from other implementers.

/cc @hbagdi @jpeach @youngnick

robscott avatar Oct 29 '21 00:10 robscott

Thanks for the comment @robscott !

  1. Explore a built in JWT matcher + built in matches for other attributes as needed

I feel this might be a bit too restrictive because there could be various similar (but different) use cases in the future, like routing based on the principal of a peer certificate, or routing based on a special proprietary internal token (or some side-channel credential) that has a different format than the JWT.

If we build it for JWT only, I'm a bit worried that we might soon find we need another extension.

  2. Explore a match extension mechanism that could support JWT matching and more

I'm happy to learn about other people's experiences and use cases, and about what the generic match mechanism should look like.

Here are just my 2 cents: most use cases that I have seen just route based on a list of simple key-value string pairs, where the semantics of each key-value pair are up to the implementation and the actual scenario. For a common scenario like JWT claims, the key is the JWT claim name and the value is the claim value. This of course does not represent all use cases, but I feel it's good enough for a lot of common users. It would be really great if we could define a list of common keys that can be supported across implementations, for example request.auth.claims[Claim-Name] for JWT claims.

yangminzhu avatar Oct 29 '21 01:10 yangminzhu

I understand what you're saying, and agree that a more generic way to do route matching would be useful.

But I think that we need to be very careful with the idea of using map[string]string fields. One of the biggest problems that arose with using annotations on Ingress objects was that it's impossible to validate the passed config and that it's difficult to ensure that everyone implements the same things in the same ways.

One of the primary goals of the Gateway API is to ensure that Routes can easily move between Gateways as far as possible. Adding a more generic extension has to be done in a way that's consistent with the API conventions, simply because those conventions are the distillation of all of the painful lessons learned by the upstream Kubernetes project in its years of existence.

This is a case where the most generic solution (that is, map[string]string fields) has been pretty conclusively proven not to work well when there are multiple parties involved.

youngnick avatar Oct 29 '21 03:10 youngnick

I think I'd personally prefer to avoid overloading header matching here with additional concepts

Yup. There was a WG discussion where there was consensus to not mandate HTTP/2 pseudo headers, so I think that metadata as headers would reach a similar conclusion.

I like the idea of a metadata match field in HTTPRouteMatch that allows arbitrary key-value pairs. Maybe the WG can discuss this for the next API version.

jpeach avatar Oct 29 '21 05:10 jpeach

thanks for the comment everyone!

@youngnick sorry if I was not clear, I don't mean to use the literal map[string]string field but rather a structure similar to the HTTPHeaderMatch which is essentially a key, a value and a type field.

@jpeach that sounds good, we can discuss this further in the WG if that helps. (update: I added the topic to the WG meeting notes for discussion)
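
To illustrate the structure described in the reply to @youngnick above (a name, a value and a type field, like HTTPHeaderMatch), here is a rough Python sketch. MetadataMatch and rule_matches are hypothetical names, the metadata is assumed to have been flattened into a string-to-string mapping already, and treating a regular expression as a full match is just one possible choice.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Mapping


class MatchType(Enum):
    # Same two match types that HTTPHeaderMatch offers.
    EXACT = "Exact"
    REGULAR_EXPRESSION = "RegularExpression"


@dataclass(frozen=True)
class MetadataMatch:
    """Hypothetical analogue of HTTPHeaderMatch: a name, a value and a type."""

    name: str
    value: str
    type: MatchType = MatchType.EXACT

    def matches(self, metadata: Mapping[str, str]) -> bool:
        actual = metadata.get(self.name)
        if actual is None:
            return False  # a missing key never matches
        if self.type is MatchType.EXACT:
            return actual == self.value
        # Assumption: the regular expression must match the whole value.
        return re.fullmatch(self.value, actual) is not None


def rule_matches(matches: Iterable[MetadataMatch], metadata: Mapping[str, str]) -> bool:
    """All entries of one match block must hold (AND), as with header matches."""
    return all(m.matches(metadata) for m in matches)
```

Because the type field is a closed enum and name and value are plain strings, this shape can be validated up front, which is the main difference from a free-form map[string]string.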

yangminzhu avatar Oct 29 '21 20:10 yangminzhu

Have we considered creating specific matching in extended support for JWT? It is an Internet standard with RFC specification?

bowei avatar Oct 29 '21 22:10 bowei

Have we considered creating specific matching in extended support for JWT? It is an Internet standard with RFC specification?

@bowei JWT is standardized in RFC 7519, but I feel it's not future-proof to only support JWT in the gateway API.

The JWT claim set is essentially a nested key-value map that can usually be expressed as a JSON struct. It probably makes more sense to support the underlying key-value data structure as a general solution instead of just the JWT.

For example, the following is an example JWT claim set shown in JSON format:

{
   "iss": "example.com",
   "exp": 1300819380,
   "http://example.com/is_root": true,
   "user": {
      "name": "joe",
      "email": "[email protected]",
      "group": ["engineer", "full-time"]
    }
}

The core requirement for route matching is the ability to look up a property (field) in the key-value map and compare it to another value.

  • If we express the property look-up in a simple string field, it could be something like claims[user][name] and we compare it with another value (e.g. alice)

  • If we express the property look-up in a structured way, it could be something like:

    paths: ["user", "name"] # paths is a list of string, each item refers to a field in the map
    value: "alice"
    type: Equal
    

Either way, we do not need to mandate that it be JWT only. If we support matching a property in the key-value map structure, the JWT just happens to be a common use case of it, and other cases can be supported easily as long as they also use this key-value map structure.
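
As a rough sketch of the property look-up described above, the following Python shows both forms resolving to the same walk over the nested map. The function names are hypothetical, the bracket syntax is the assumed claims[user][name] form, and the sample data is a trimmed copy of the claim set shown earlier.

```python
import re
from typing import Any, List, Sequence

_MISSING = object()  # sentinel, so that a stored None is not confused with "absent"


def parse_path(expr: str) -> List[str]:
    """Turn the string form, e.g. "claims[user][name]", into ["user", "name"]."""
    return re.findall(r"\[([^\[\]]+)\]", expr)


def lookup(claims: Any, paths: Sequence[str]) -> Any:
    """Follow `paths` through nested dicts; return _MISSING if any step fails."""
    current = claims
    for key in paths:
        if not isinstance(current, dict) or key not in current:
            return _MISSING
        current = current[key]
    return current


def path_equals(claims: Any, paths: Sequence[str], value: Any) -> bool:
    """Sketch of `type: Equal`: compare the looked-up property with `value`."""
    found = lookup(claims, paths)
    return found is not _MISSING and found == value


# Trimmed copy of the example claim set above.
EXAMPLE_CLAIMS = {
    "iss": "example.com",
    "exp": 1300819380,
    "user": {"name": "joe", "group": ["engineer", "full-time"]},
}
```

Nothing in lookup is specific to JWT: any metadata source that can be presented as a nested map (certificate fields, a proprietary token, and so on) would go through the same code.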

yangminzhu avatar Oct 30 '21 02:10 yangminzhu

These kinds of route matching functions are not required in most cases, but they are indeed useful in some scenarios. In my opinion it's not appropriate to put them under the normal match.headers field; putting them together will make it harder for specific implementations to conform to the Gateway API. Maybe another sub-field can be introduced in the match field, so that implementations can implement the route matching features gradually.

BTW, in Apache APISIX Ingress Controller, we put these advanced route match conditions into the match.exprs field.

tokers avatar Oct 31 '21 06:10 tokers

The primary requirement here is to differentiate 'header' that user may send from 'metadata' - that is handled by trusted middle proxies or internally.

IMO, header name matching must remain close to O(1), i.e. any 'match.expr' or logic that requires evaluating an expression should not be part of header name matching. I don't think extending the 'matcher' or the algorithms for routing should be mixed with differentiating metadata from headers; it would be great to have a separate issue to discuss that.

Matching and routing on structured header value - for example X-Forwarded-For, X-Forwarded-Client-Cert, Cookie and many other headers are not opaque strings but have structure - would be extremely useful - but I think we should also discuss it in a separate thread.

The main question I think is if we want 'metadata' to be a separate section in the API, or to use a special prefix in header section ( or both - there is broad use of 'special' headers to pass metadata ).
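
A small Python sketch of the "separate section" variant, under the assumption that metadata is kept in its own map that only trusted filters write to. RequestContext and both helper names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class RequestContext:
    # Headers come from the client and cannot be trusted.
    headers: Dict[str, str] = field(default_factory=dict)
    # Metadata is written only by trusted filters (e.g. after JWT validation).
    metadata: Dict[str, str] = field(default_factory=dict)


def header_matches(ctx: RequestContext, name: str, value: str) -> bool:
    """Plain dict lookup by lower-cased name; no expression is evaluated."""
    return ctx.headers.get(name.lower()) == value


def metadata_matches(ctx: RequestContext, name: str, value: str) -> bool:
    """Same constant-time lookup, but against the trusted map only."""
    return ctx.metadata.get(name) == value
```

In this layout a client that sends a header named like a metadata key gains nothing, because metadata_matches never reads ctx.headers. With a shared prefix in the header section, the same guarantee depends on the proxy stripping or rejecting such headers first.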

costinm avatar Nov 01 '21 18:11 costinm

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Jan 30 '22 18:01 k8s-triage-robot

/lifecycle frozen

hbagdi avatar Jan 31 '22 02:01 hbagdi

While grooming we saw that this one was open for a long period of time without anyone with a strong use case to champion it. We're going to close this as we don't expect anyone's ready to drive it forward, but if you still want this feature and have a strong use case we will be happy to reconsider this and re-open.

shaneutt avatar Mar 08 '23 20:03 shaneutt