gateway
gateway copied to clipboard
api: Authorization API design
What type of PR is this?
api: Authorization API design
Related: https://github.com/envoyproxy/gateway/issues/2462 https://github.com/envoyproxy/gateway/issues/2250
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 67.12%. Comparing base (
29946b0) to head (3293ec2). Report is 153 commits behind head on main.
:exclamation: Current head 3293ec2 differs from pull request most recent head 387951b. Consider uploading reports for the commit 387951b to get more accurate results
Additional details and impacted files
@@ Coverage Diff @@
## main #2652 +/- ##
==========================================
+ Coverage 66.51% 67.12% +0.61%
==========================================
Files 161 164 +3
Lines 22673 23803 +1130
==========================================
+ Hits 15080 15978 +898
- Misses 6720 6906 +186
- Partials 873 919 +46
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
@zhaohuabing yep, I was thinking to take that issue as next. However, in http filter level ACL is going to be handled before JWT filter. So there must be 2 different RBAC filters because of that? Otherwise it is not possible to make RBAC filter based on JWT (or other auth) results. At least I do not like the way that ACL is also handled after authentication etc. That will expose all auth providers to everyone and they can try to hack things before their request is finally denied by ACL filter.
So my idea was to do two separated filters: acl (executed as first in filter chain) and authorisation(not sure about the name). Authorisation will be executed after auth plugins (like jwt + ext auth..etc). In theory we could do ACL things in same filter. However, it will make auth plugins and others open for everybody and acl idea is block these kind of requests?
btw I am not familiar with envoyproxy itself, so if my idea about filter chains are incorrect please enlight me.
@envoyproxy/gateway-maintainers before implementing anything, could others give opinion also about the format in https://github.com/envoyproxy/gateway/pull/2652#discussion_r1495014346 (usecases explained in previous post)
@arkodg @zhaohuabing updated the PR according the ideas.. but still not sure is everything like should.
@arkodg I did some tests with the api spec. The authorization commit can be seen in https://github.com/zetaab/gateway/commit/65885b551ef53ab1ff5cd24f032bdd9ecab4d762
-
other rbac stuff (like jwt authorization) can be implemented under clientselectors and https://github.com/zetaab/gateway/commit/65885b551ef53ab1ff5cd24f032bdd9ecab4d762#diff-39171fec9cb0b38b175c0b1d846990f95dd907518d1d7757857aaa7bca5ea528R109
-
there is perhaps need to define "permissions" as well later. https://github.com/zetaab/gateway/commit/65885b551ef53ab1ff5cd24f032bdd9ecab4d762#diff-39171fec9cb0b38b175c0b1d846990f95dd907518d1d7757857aaa7bca5ea528R129 but its not mandatory (yet) for ip allow/deny apis. So as I see it: this PR could proceed and I could start with implementing ip deny/allow part.
Then cases that I tested with this (ips are modified little bit, not real ips for me):
case 1:
authorization:
rules:
- clientSelector:
- clientCIDR:
- 44.94.0.0/16
- 44.65.0.0/16
action: Allow
result: both cidrs are allowed and other traffic outside of these are denied.
case 2:
authorization:
rules:
- clientSelector:
- clientCIDR:
- 44.94.0.0/16
- clientCIDR:
- 44.65.0.0/16
action: Allow
result: both cidrs are allowed and other traffic outside of these are denied.
case 3:
authorization:
rules:
- clientSelector:
- clientCIDR:
- 44.94.0.0/16
action: Allow
- clientSelector:
- clientCIDR:
- 44.65.0.0/16
action: Allow
result: none of the CIDRs are working. (difference to case 2 is that instead of having two objects of clientCIDR under same clientselector, this use separated clientselectors. Which is creating two RBAC filters (instead of one that was the case with 1 and 2) in envoy level with the example commit).
case 4:
authorization:
rules:
- clientSelector:
- clientCIDR:
- 44.94.0.0/16
- 44.65.0.0/16
action: Allow
- clientSelector:
- clientCIDR:
- 44.94.107.109/32
action: Log
- clientSelector:
- clientCIDR:
- 44.94.107.109/32
action: Deny
result: both cidrs are allowed and the specific ip is denied. Requests from that single ip is logged.
as we can see multiple allow rules (multiple envoy rbac filters, instead of single) might be interesting (case 3), not sure is there really use-case for that. Maybe there is if we combine other components like user information. In case of IPs it might be not that useful.
thanks for running the tests !
- does Case 4 imply "most specific match wins" ? (you can try to reverse order and test to ensure its not last match or first match that wins)
- also weird Case 3 not working is a little weird, I wouldn't block the API on it, because it should be solvable using other methods
this API is pretty close to the finish line, hoping the output of case 4 can be made into a doc string in the API to add more clarity for the user
case 5
authorization:
rules:
- clientSelector:
- clientCIDR:
- 44.94.107.109/32
action: Deny
result: single ip address is denied
case 6:
authorization:
rules:
- clientSelector:
- clientCIDR:
- 44.94.107.109/32
action: Deny
- clientSelector:
- clientCIDR:
- 44.94.0.0/16
- 44.65.0.0/16
action: Allow
result: single ip address is still denied (but actually that allow rule does not do anything, also other cidrs outside that is allowed)
case 7:
authorization:
rules:
- clientSelector:
- clientCIDR:
- 44.94.0.0/16
action: Deny
- clientSelector:
- clientCIDR:
- 44.94.107.109/32
- 44.65.0.0/16
action: Allow
result: 44.65.0.0/16 cidr works, but nothing works from 44.94.0.0/16 cidr. So deny is somehow overriding overlapping allow rules instead of "more specific match".
@arkodg I would like to move to coding phase in case of this API. This PR now contains more than 40 reviews, kind of saying: how much more is still needed? Could for instance some of the other maintainers give feedback about this PR and then we could somehow merge this soonish?
@arkodg I would like to move to coding phase in case of this API. This PR now contains more than 40 reviews, kind of saying: how much more is still needed? Could for instance some of the other maintainers give feedback about this PR and then we could somehow merge this soonish?
if we moved the clientSelector to a singular field, and defined first rule wins semantics i.e. first rule has higher precedence over next one, I'd have no other concerns.
This API and implementation, will unfortunately not be able to make it into v1.0.0 since we will be releasing v1.0.0 RC early next week, so we'll only be able to merge this after v1.0.0 is released.
@envoyproxy/gateway-maintainers @envoyproxy/gateway-reviewers please review, and share any feedback
@arkodg modified the field now, as 1.0.0 is released I could continue working on this?
thanks for kick starting this PR again @zetaab !
the open question left for me is the match semantics - I find the current definition is the API confusing, which is why I recommended for first match wins with a default Deny. I'll bring this up in the community meeting on Thursday, so others can weigh in as well
This came up in the community meeting today. I'm in support of first match wins with a default Deny design as it preserves an important level of flexibility as described here: https://github.com/envoyproxy/gateway/pull/2652#discussion_r1506954516
It also does still allow for the inverse where the auth is allow by default by placing a catch all Allow rule as the last rule with Deny rules preceding it. For example:
authorization:
rules:
- clientSelectors:
clientCIDRs:
- 123.123.0.0/24
action: Deny
- clientSelectors:
clientCIDRs:
- 0.0.0.0/0
action: Allow
This would allow us to
- deny from IPs falling in 123.123.0.0/24
- allow others (the built-in default-deny would never come in to play since the last rule allows everything)
hey @zetaab still working on this one ?
@arkodg well not really, I have tried to get it done but always something is missing. So I have kind of lost energy to do anything. If someone want to continue - feel free.
sorry you feel this way, thanks for all the work you've put into this APi, imo it's close to the finish line, I'll try and take your work forward in a follow PR
I'm considering having two separate APIs—one for ACL and one for authorization.
Rationale:
- Typically, we don't need fine-grained access control(method, header, etc.) based on IP addresses; simply allowing or denying access is sufficient.
- It seems awkward to combine IP addresses and client identities (JWT tokens, client certs) within the same API.
- They should be enforced in different phases, as @zetaab pointed out. So having them within the same structure may confuse users.
Apologies for the churn. This takes too long already :-)
IPs could be useful together with jwts. For instance, if I have /admin path. I want allow it only for admins which are coming from cidr x
@zhaohuabing I don't think it's awkward. one use case is in our company where we use mTLS & IP restriction or key auth with IP restriction.
This is to archive high-security implementation in case client cert / key auth / JWT is compromised, the attacker needs to access the client infrastructure instead of using their local network.
@arkodg @zhaohuabing is there something that we could now do to get progress to this issue? I would really like to see this authorization feature. IMO it should be started from the ip addresses, then add more features like jwt
@arkodg @zhaohuabing please review. I have updated the PR according the spec
Examples
allow all subjects to GET method
rules:
- permissions: [GET]
action: "Allow"
allow 1.0.0.0/8 and 91.1.1.1/8 to all permissions (will deny others)
rules:
- subjects:
- clientCIDR: 1.0.0.0/8
- clientCIDR: 91.1.1.1/8
action: "Allow"
block all requests
rules:
- action: "Deny"
I added clientCIDR only as an example (and I am thinking to start implementing that after api spec is accepted). So we need to add more subject parameters later.
@arkodg your comments are not down the line what was talked earlier in comments. So I do not know what to do here. If I do changes that you asked, is @zhaohuabing going to accept those.
Anyway as I see it, the biggest issue with the spec is that we should try implementing cases and after that we can see does it really work and support all the needed use cases. So I would like to get started with something. Should we do this in same PR? Like implement cidr support? It might mean breaking api spec if we merge this and try implementing items in another PR..
hey @zetaab completely understand your point, I'll try and prioritize this issue in the community meeting this week to see if we can make more progress on it, and will report back here
Should we align the Authorization API design with the Envoy RBAC filter configuration(move Action out of Rule to a higher level in the structure) for ease of implementation?
For example, this EG Authorization config:
authorization:
action: ALLOW
rules:
- name: "service-admin"
subjects:
- clientCert: "cluster.local/ns/default/sa/admin"
- clientCert: "cluster.local/ns/default/sa/superuser"
- name: "product-viewer"
permissions:
- "GET"
can be translated to the below Envoy Gateway RBAC filter:
- name: envoy.filters.http.rbac
config:
rules:
action: ALLOW
policies:
"service-admin":
permissions:
- any: true
principals:
- authenticated:
principal_name:
exact: "cluster.local/ns/default/sa/admin"
- authenticated:
principal_name:
exact: "cluster.local/ns/default/sa/superuser"
"product-viewer":
permissions:
- header:
name: ":method"
string_match:
exact: "GET"
principals:
- any: true
We may also need to add another layer on top of the above API, so we can have a more flexible authz logic like "allowing authenticated users access but excluding all users from 10.75.1.0/24"
EG Authorization config:
authorization:
- action: ALLOW
pocies:
- name: "service-admin"
subjects:
- clientCert: "cluster.local/ns/default/sa/admin"
- clientCert: "cluster.local/ns/default/sa/superuser"
- name: "product-viewer"
permissions:
- "GET"
- action: DENY
policies:
- name: "all-users-in-a-specific-cidr"
subjects:
- cidr: "10.75.1.0/24"
The above configuration can be implemented with multiple RBAC filters:
- name: envoy.filters.http.rbac
config:
rules:
action: DENY
policies:
"deny-a-specific-cidr":
permissions:
- any: true
principals:
- remote_ip: "1.2.3.0/24"
- name: envoy.filters.http.rbac
config:
rules:
action: ALLOW
policies:
"service-admin":
permissions:
- any: true
principals:
- authenticated:
principal_name:
exact: "cluster.local/ns/default/sa/admin"
- authenticated:
principal_name:
exact: "cluster.local/ns/default/sa/superuser"
"product-viewer":
permissions:
- header:
name: ":method"
string_match:
exact: "GET"
principals:
- any: true
To conclude, the Authorization API will look like:
type Authorization struct {
Rules []Rule `json:"rules,omitempty"`
}
type Rule struct {
Action RuleActionType `json:"action"`
Policies []Policy `json:"rules,omitempty"`
}
type Policy struct {
Subjects []Subject `json:"subjects,omitempty"`
Permissions []string `json:"permissions,omitempty"`
}
Does this make sense? @arkodg @zetaab
Should we align the Authorization API design with the Envoy RBAC filter configuration(move
Actionout ofRuleto a higher level in the structure) for ease of implementation?For example, this EG Authorization config:
authorization: action: ALLOW rules: - name: "service-admin" subjects: - clientCert: "cluster.local/ns/default/sa/admin" - clientCert: "cluster.local/ns/default/sa/superuser" - name: "product-viewer" permissions: - "GET"can be translated to the below Envoy Gateway RBAC filter:
- name: envoy.filters.http.rbac config: rules: action: ALLOW policies: "service-admin": permissions: - any: true principals: - authenticated: principal_name: exact: "cluster.local/ns/default/sa/admin" - authenticated: principal_name: exact: "cluster.local/ns/default/sa/superuser" "product-viewer": permissions: - header: name: ":method" string_match: exact: "GET" principals: - any: trueWe may also need to add another layer on top of the above API, so we can have a more flexible authz logic like "allowing authenticated users access but excluding all users from 10.75.1.0/24"
EG Authorization config:
authorization: - action: ALLOW pocies: - name: "service-admin" subjects: - clientCert: "cluster.local/ns/default/sa/admin" - clientCert: "cluster.local/ns/default/sa/superuser" - name: "product-viewer" permissions: - "GET" - action: DENY policies: - name: "all-users-in-a-specific-cidr" subjects: - cidr: "10.75.1.0/24"The above configuration can be implemented with multiple RBAC filters:
- name: envoy.filters.http.rbac config: rules: action: DENY policies: "deny-a-specific-cidr": permissions: - any: true principals: - remote_ip: "1.2.3.0/24" - name: envoy.filters.http.rbac config: rules: action: ALLOW policies: "service-admin": permissions: - any: true principals: - authenticated: principal_name: exact: "cluster.local/ns/default/sa/admin" - authenticated: principal_name: exact: "cluster.local/ns/default/sa/superuser" "product-viewer": permissions: - header: name: ":method" string_match: exact: "GET" principals: - any: trueTo conclude, the
AuthorizationAPI will look like:type Authorization struct { Rules []Rule `json:"rules,omitempty"` } type Rule struct { Action RuleActionType `json:"action"` Policies []Policy `json:"rules,omitempty"` } type Policy struct { Subjects []Subject `json:"subjects,omitempty"` Permissions []string `json:"permissions,omitempty"` }Does this make sense? @arkodg @zetaab
prefer if we take a user/usability approach first before measuring feasibility (mapping to envoy) for this API
that will look like
authorization:
rules:
- action: Allow
policies:
- subjects:
- clientCIDR: 1.1.1.1
permissions: ["GET"]
I am fine with that. At least it should support many use cases
authorization:
rules:
- action: Allow
subjects:
- clientCIDR: 1.1.1.1
permissions: ["GET"]
the current PR model is not that "heavy", but there might be an issue that it does not support all the use cases
Following our discussion in this week's EG meeting, I did some rearch on prior art of Gateway implementations. Here are my findings:
Istio Authorization Policy
This is the most relevant one. Istio Authz policy offers fine-grained authorization control based on host, path, and method. It supports source(subject) from cidr, workload identity, and user identity.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
name: httpbin
namespace: foo
spec:
action: ALLOW
rules:
- from:
- source:
principals: ["cluster.local/ns/default/sa/sleep"]
- source:
namespaces: ["test"]
to:
- operation:
methods: ["GET"]
paths: ["/info*"]
- operation:
methods: ["POST"]
paths: ["/data"]
when:
- key: request.auth.claims[iss]
values: ["https://accounts.google.com"]
Since Istio Authz handles both the ingress and mesh traffic, from can be a principals, ipBlocks, namespaces, jwt claims. to is a list of operations(permission on what resources), and when is a list of extra condtions of requests needed to be matched.
And there is a important difference between Istio Authz Policy and EG Authz Policy: Istio can have multiple Authz policies on a specific service, but EG only supports one SecurityPolicy per xRoute, so EG Authz API structure needs to look like a list of Istio Authz policies.
APISIX supports authz on an HTTPRoure
APISIX's Authz concept is smilar to what we want to achieve in EG: authz on a sepcif HTTPRoute. It's less flexible than Istio Authz policy though.
apiVersion: apisix.apache.org/v2
kind: ApisixRoute
metadata:
name: httpserver-route
spec:
http:
- name: rule1
match:
hosts:
- httpbin.org
paths:
- /*
backends:
- serviceName: httpbin
servicePort: 80
plugins:
- name: consumer-restriction
enable: false
config:
allowed_by_methods:
- user: "default_jack1"
methods:
- "POST"
- "GET"
- user: "default_jack2"
methods:
- "GET"
Ambassador supports IP deny/allow list at the global level, without full-fleged authz policy.
apiVersion: getambassador.io/v2
kind: Module
metadata:
name: ambassador
spec:
config:
ip_deny:
- peer: 99.99.99.0
- remote: 10.16.0.0/16
Contour supports IP deny/allow on route level, without full-fleged authz policy.
apiVersion: projectcontour.io/v1
kind: HTTPProxy
metadata:
name: basic
spec:
virtualhost:
fqdn: foo-basic.bar.com
ipAllowPolicy:
# traffic is allowed if it came from localhost (i.e. co-located reverse proxy)
- cidr: 127.0.0.1/32
source: Peer
routes:
- conditions:
- prefix: /
services:
- name: s1
port: 80
# route-level ip filters override the virtualhost-level filters
ipAllowPolicy:
# traffic is allowed if it came from localhost (i.e. co-located reverse proxy)
- cidr: 127.0.0.1/32
source: Peer
# and the request originated from an IP in this range
- cidr: 99.99.0.0/16
source: Remote
Other Gateway Implementations
Other Gateway implementations normally just delegate the Authz to an external process, such as an OPA.
Nginx: https://github.com/summerwind/opa-nginx-rbac Kong: https://docs.konghq.com/hub/kong-inc/opa Gloo: https://docs.solo.io/gloo-edge/latest/guides/security/auth/extauth/opa
Of all the implementations studied, Istio Authz and APISIX Authz APIs align most closely with EG's requirements.
I've updated this PR to reflect the findings. While the current API is flexible, it may be too complex. Below is a proposal for a less complex variation that I prefer(It's just the draft for the structure, we still need to work on naming):
// Authorization defines the authorization configuration.
// The evaluation is determined by the following rules:
// 1. If a request matches any one of the Deny policy, the request gets rejected with 401
// 2. Then, if a request matches any one of the Allow rules, the request is allowed
// 3. If a request doesn't match any one of the Allow rules, the request is rejected with 401
type Authorization struct {
Deny []Policy `json:"deny,omitempty"`
Allow []Policy `json:"allow,omitempty"`
Audit []Policy `json:"log,omitempty"`
}
type Policy struct {
Subject Subject `json:"subjects,omitempty"`
Permissions []string `json:"permissions,omitempty"`
}
type Subject struct {
ClientCIDR []string `json:"clientCIDR,omitempty"`
}
@arkodg @zetaab @guydc
type Authorization struct {
Deny []Policy `json:"deny,omitempty"`
Allow []Policy `json:"allow,omitempty"`
Log []Policy `json:"log,omitempty"`
}
in which order these are going to be applied? it might be issue? or is there really need to have combination of different action types
type Authorization struct { Deny []Policy `json:"deny,omitempty"` Allow []Policy `json:"allow,omitempty"` Log []Policy `json:"log,omitempty"` }in which order these are going to be applied? it might be issue? or is there really need to have combination of different action types
The evaluation is determined by the following rules:
- If a request matches any one of the Deny policy, the request gets rejected with 401
- Then, if a request matches any one of the Allow rules, the request is allowed
- If a request doesn't match any one of the Allow rules, the request is rejected with 401
hmm it could work in that case, if its possible to implement like that
hmm it could work in that case, if its possible to implement like that
@arkodg mentioned a use case in today's meeting: allowing a small hole in a large CIDR block. This won't work for that because Deny goes first, so we need to use the other option in the PR code.