GEP: L7 Authorization Policy for Gateways and mesh workloads
The Gateway API currently lacks a standardized authorization policy, a critical requirement for production deployments, particularly as cloud infrastructure increasingly adopts zero-trust security models where granular access control is essential.
This proposal outlines an L7 Authorization Policy specifically designed for L7 Gateways and L7 workloads within a service mesh, explicitly excluding applicability to L4 Gateways or raw TCP-based workloads such as databases like MongoDB and Redis.
Here is the proposed CRD:
// GatewayApplyTo selects which gateways the policy applies to.
type GatewayApplyTo string

const (
	// AllGatewaysinNS applies the policy to all gateways in the same namespace.
	AllGatewaysinNS GatewayApplyTo = "SameNamespace"
	// TargetRefs applies the policy to the specified gateways in the targetRefs.
	TargetRefs GatewayApplyTo = "TargetRefs"
)

// AuthzPolicyAction is the action taken when a request matches the rules.
type AuthzPolicyAction string

const (
	// Allow a request only if it matches the rules. This is the default type.
	Allow AuthzPolicyAction = "ALLOW"
	// Deny a request if it matches any of the rules.
	Deny AuthzPolicyAction = "DENY"
	// Custom action allows an extension to handle the user request if
	// the matching rules evaluate to true.
	Custom AuthzPolicyAction = "CUSTOM"
)
type AuthzPolicySpec struct {
	HTTPRules []AuthPolicyHTTPRule `json:"httpRules,omitempty"`
	Action    *AuthzPolicyAction   `json:"action,omitempty"`
	// CustomProviders defines the extension providers for authorization policy.
	CustomProviders *AuthzPolicyCustomProviders `json:"customProviders,omitempty"`
	TargetRefs      []v1.LocalObjectReference   `json:"-"`
	// Mesh identifies the mesh workloads to which the policy is applied.
	Mesh *Mesh `json:"mesh,omitempty"`
	// Gateway identifies the gateways to which the policy is applied.
	Gateway *Gateway `json:"gateway,omitempty"`
}

type AuthPolicyHTTPRule struct {
	// From will have client identities
	From *AuthzPolicyFrom `json:"from,omitempty"`
	// To will have path, host, port, headers, method
	To *AuthzPolicyTo `json:"to,omitempty"`
	// any valid conditional CEL expression
	When *string `json:"when,omitempty"`
}

// WorkloadSelector defines the selector for the workloads to which the policy is applied.
type WorkloadSelector struct {
	MatchLabels map[string]string `json:"matchLabels,omitempty"`
}

// Mesh defines the mesh workloads to which the policy is applied.
type Mesh struct {
	ApplyTo  *MeshApplyTo      `json:"applyTo,omitempty"`
	Selector *WorkloadSelector `json:"selector,omitempty"`
}

// Gateway defines the gateway workloads to which the policy is applied.
type Gateway struct {
	ApplyTo    *GatewayApplyTo                                         `json:"applyTo,omitempty"`
	TargetRefs []v1alpha2.LocalPolicyTargetReferenceWithSectionName `json:"targetRefs,omitempty"`
}
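To make the ALLOW and DENY semantics described in the comments above concrete, here is a minimal sketch of how an implementation might evaluate a single policy. `Request`, `Rule`, and `Evaluate` are hypothetical, simplified stand-ins and are not part of the proposed CRD; the rule matches only on a client identity and an HTTP method, and the CUSTOM action is left out.

```go
package main

import "fmt"

// Request is a hypothetical, simplified view of an incoming HTTP request.
type Request struct {
	Principal string // client identity
	Method    string // HTTP method
}

// Rule is a hypothetical stand-in for AuthPolicyHTTPRule.
// An empty field matches any value.
type Rule struct {
	Principal string
	Method    string
}

// matches reports whether the request satisfies every non-empty field of the rule.
func (r Rule) matches(req Request) bool {
	if r.Principal != "" && r.Principal != req.Principal {
		return false
	}
	if r.Method != "" && r.Method != req.Method {
		return false
	}
	return true
}

// Evaluate reports whether a single policy permits the request.
// ALLOW permits only if some rule matches; DENY rejects if any rule matches.
func Evaluate(action string, rules []Rule, req Request) bool {
	matched := false
	for _, r := range rules {
		if r.matches(req) {
			matched = true
			break
		}
	}
	if action == "DENY" {
		return !matched
	}
	// "ALLOW" is the default action in the proposal.
	return matched
}

func main() {
	rules := []Rule{{Principal: "ID1", Method: "POST"}}
	fmt.Println(Evaluate("ALLOW", rules, Request{Principal: "ID1", Method: "POST"})) // true
	fmt.Println(Evaluate("ALLOW", rules, Request{Principal: "ID2", Method: "POST"})) // false
	fmt.Println(Evaluate("DENY", rules, Request{Principal: "ID1", Method: "POST"}))  // false
}
```

How several policies with different actions combine on the same target is not specified in the proposal, so the sketch deliberately evaluates one policy at a time.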
@robscott @LiorLieberman
Hi @aryan16.
I'm jumping in as someone who's also been involved with policies, and authorisation ones in particular.
First and foremost, thanks for this proposal. Auth is a much-requested feature with a non-trivial solution, I reckon.
I see in your proposal a clear pointer to GEP-1494, which I'd recommend checking out in case you haven't already. That GEP lays out some initial use cases for auth. It covers authentication and leaves the door open for authorisation as well. Even though it doesn't elaborate much on the API or the implementation yet, I believe it's a good starting point for anything that intends to become standard eventually.
Regarding the API you propose, I see a lot of Istio's AuthorizationPolicy in it. This might work fine for this particular implementation, but I'm not sure it would get enough support to become standard in Gateway API, IMHO.
The proposed attachment mechanism also deviates a bit from what we currently have in GEP-713. Things like WorkloadSelector, GatewayApplyTo and AuthPolicyHTTPRule would all probably overlap, to one extent or another, with the already spec'ed targeting options. On a positive note, it feels like you're identifying a need for enhancements to those existing targeting options, which IMO could be good points to discuss in the scope of https://github.com/kubernetes-sigs/gateway-api/discussions/2927, and then perhaps in some follow-up to https://github.com/kubernetes-sigs/gateway-api/pull/3609.
Thanks @guicassolato for all the details and the feedback.
Given that authorization and authentication within a service mesh target individual workloads, relying on workload-native APIs makes the most sense. Consequently, traditional attachment points like Kubernetes Services and Routes become less effective, though Routes might still play a role in specific scenarios. The existing trust in and adoption of Pod labels for workload identification provide a more aligned and robust approach for defining AuthZ/AuthN policies, IMO.
I agree that this API, particularly its rule structure, bears a resemblance to Istio's AuthorizationPolicy. This similarity stems from the power of Istio's single API to handle diverse conditions, offering a unified and user-friendly approach to authorization. However, a key difference lies in the scope: while Istio's API can become complex when applied to TCP workloads (due to the mixing of L7 and L4 rules), the proposed API's focus solely on L7 simplifies policy management for these scenarios.
And WorkloadSelector, GatewayApplyTo, and MeshApplyTo are required to make a clear separation between mesh and gateways, which the current Gateway API lacks and which is very confusing for an end user. I can add more details about that.
Thanks for this proposal @aryan16, but as @guicassolato says, I'd encourage you to read the provisional version of GEP-1494 and recast this in those terms. Also please remember that anything that gets implemented needs to be implementable in dataplanes other than Envoy, so it's important to ensure that we either don't use Envoy-specific constructs, or we define a way to make those Envoy-specific constructs more standard.
Can we also show a path for route attachments? Should route, gateway, workload selectors, and namespace all be part of the spec? Or should only a subset of attachment points be part of the spec, with the rest being vendor-specific (so the attachment points should be extensible as well)?
A key mistake in Istio's API that this copies is the ability to only match an entire Gateway. This leads to people recreating route matches (often incorrectly, which is a security vulnerability) when they really want to apply the policy to a route.
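As an illustration of how a hand-recreated route match can drift from the route itself, the sketch below compares HTTPRoute `PathPrefix` semantics (the prefix matches whole path elements, so `/admin` covers `/admin/users` but not `/administrator`) with two matchers a policy author might write by hand. The function names and the `/admin` paths are hypothetical and chosen only for this example; nothing here is taken from Istio or from the proposed API.

```go
package main

import (
	"fmt"
	"strings"
)

// elementPrefixMatch follows HTTPRoute PathPrefix semantics: the prefix must
// match whole path elements, and a trailing slash on the prefix is ignored.
func elementPrefixMatch(prefix, path string) bool {
	prefix = strings.TrimSuffix(prefix, "/")
	return path == prefix || strings.HasPrefix(path, prefix+"/")
}

// naivePrefixMatch is a plain string-prefix check, as a hand-written
// policy rule might use.
func naivePrefixMatch(prefix, path string) bool {
	return strings.HasPrefix(path, prefix)
}

// exactMatch is another way a policy author might restate the route match.
func exactMatch(want, path string) bool {
	return want == path
}

func main() {
	for _, p := range []string{"/admin", "/admin/users", "/administrator"} {
		fmt.Printf("%-15s route=%v naive=%v exact=%v\n", p,
			elementPrefixMatch("/admin", p),
			naivePrefixMatch("/admin", p),
			exactMatch("/admin", p))
	}
}
```

In this sketch the exact matcher misses `/admin/users`, which the route serves, while the naive prefix matcher also catches `/administrator`, which the route does not serve. Either mismatch leaves the policy covering a different set of requests than the route it was meant to protect.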
Couldn't Istio support attachment to a route (and potentially sectionName)?
The goal of this API is to support both Gateway and mesh use cases, essentially by targeting the AuthzPolicy at Gateways, Listeners, and workloads. We can think about extending it to support route attachments as well. But I don't think we should limit it to route attachments (we should also have a way to define the routing rules in the AuthzPolicy itself), because of the following scenarios:
- Users may have multiple routes with different sets of routing rules for the same svc (for example, Route1 for path1 with a header matcher for method:POST, and Route2 for path2 with query params for method:POST). If they need to add an ALLOW policy against (identities(ID1) and method(POST)), they have to somehow attach all the routes for that svc to an AuthzPolicy. And if they create new routes for the same svc in the future, they always need to update their AuthzPolicies, which IMO is not a good UX. So the crux of this argument is that users may need their AuthzPolicies to include a subset of routing rules common to all the Routes, and direct route attachment may make things complex.
- Routes target a K8s svc, whereas AuthZ is a pure workload policy. Multiple Services can point to the same pod. If a user applies the policy to a route that targets a svc, other services that are not part of that route may also be affected, as they may be selecting the same pod (the AuthZ config is configured for the workload, not the svc).
- Users may have an NS-wide or mesh-wide requirement to define an AuthzPolicy with some L7 routing attributes along with identities, and routes always target svcs. To achieve this for the NS/mesh, they would need to attach the policy to all the routes in the NS/mesh.
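The second scenario above (several Services selecting the same pod) can be sketched with plain label matching. This is a minimal illustration that assumes the usual `matchLabels` semantics, where every key/value pair in the selector must be present on the pod; the `selects` helper and all label values and Service names are hypothetical.

```go
package main

import "fmt"

// selects reports whether every key/value pair in selector is present in labels.
// In this sketch an empty selector matches every label set.
func selects(selector, labels map[string]string) bool {
	for k, v := range selector {
		if got, ok := labels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

func main() {
	// Labels on a single hypothetical pod.
	pod := map[string]string{"app": "orders", "version": "v2"}

	// Selectors of three hypothetical Services.
	services := map[string]map[string]string{
		"orders":        {"app": "orders"},
		"orders-canary": {"app": "orders", "version": "v2"},
		"billing":       {"app": "billing"},
	}

	// A workload-scoped policy enforced on the pod affects traffic arriving
	// through every Service that selects it, not only the Service a route targets.
	for name, sel := range services {
		if selects(sel, pod) {
			fmt.Printf("traffic via Service %q reaches the pod and is subject to its policy\n", name)
		}
	}
}
```

Here both `orders` and `orders-canary` select the pod, so a policy attached through a route that targets only `orders` would still be enforced for requests that arrive via `orders-canary`.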
I'm not saying we should only allow attaching to routes, just that it is a common and important use case. BTW, the service vs. workload distinction is only a sidecar mesh thing; it doesn't apply to Gateway (or the ambient architecture).
I agree that route attachment is important; I will update the spec to support that. But my point was that we shouldn't remove routing rules from the AuthzPolicy spec for other targets because of the issue you mentioned ("people recreating route matches (often incorrectly which is a security vuln) when they really want to apply the policy to a route").
And I believe Gateway API is expected to work with sidecar-based meshes as well, right?
Some prior art of authorisation policy kinds that support attaching to Gateway and HTTPRoute, just in case:
- Envoy Gateway's SecurityPolicy
- Kuadrant's AuthPolicy
/remove-lifecycle stale /lifecycle frozen
I just bumped into this and thought I'd pass it along: https://authzen-interop.net/. It could be interesting to support in Gateway API.