gateway icon indicating copy to clipboard operation
gateway copied to clipboard

api: Authorization API design

Open zetaab opened this issue 1 year ago • 20 comments

What type of PR is this?

api: Authorization API design

Related: https://github.com/envoyproxy/gateway/issues/2462 https://github.com/envoyproxy/gateway/issues/2250

zetaab avatar Feb 19 '24 06:02 zetaab

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 67.12%. Comparing base (29946b0) to head (3293ec2). Report is 153 commits behind head on main.

:exclamation: Current head 3293ec2 differs from pull request most recent head 387951b. Consider uploading reports for the commit 387951b to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2652      +/-   ##
==========================================
+ Coverage   66.51%   67.12%   +0.61%     
==========================================
  Files         161      164       +3     
  Lines       22673    23803    +1130     
==========================================
+ Hits        15080    15978     +898     
- Misses       6720     6906     +186     
- Partials      873      919      +46     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Feb 19 '24 06:02 codecov[bot]

@zhaohuabing yep, I was thinking to take that issue as next. However, in http filter level ACL is going to be handled before JWT filter. So there must be 2 different RBAC filters because of that? Otherwise it is not possible to make RBAC filter based on JWT (or other auth) results. At least I do not like the way that ACL is also handled after authentication etc. That will expose all auth providers to everyone and they can try to hack things before their request is finally denied by ACL filter.

So my idea was to do two separated filters: acl (executed as first in filter chain) and authorisation(not sure about the name). Authorisation will be executed after auth plugins (like jwt + ext auth..etc). In theory we could do ACL things in same filter. However, it will make auth plugins and others open for everybody and acl idea is block these kind of requests?

btw I am not familiar with envoyproxy itself, so if my idea about filter chains are incorrect please enlight me.

zetaab avatar Feb 19 '24 15:02 zetaab

@envoyproxy/gateway-maintainers before implementing anything, could others give opinion also about the format in https://github.com/envoyproxy/gateway/pull/2652#discussion_r1495014346 (usecases explained in previous post)

zetaab avatar Feb 20 '24 06:02 zetaab

@arkodg @zhaohuabing updated the PR according the ideas.. but still not sure is everything like should.

zetaab avatar Feb 20 '24 19:02 zetaab

@arkodg I did some tests with the api spec. The authorization commit can be seen in https://github.com/zetaab/gateway/commit/65885b551ef53ab1ff5cd24f032bdd9ecab4d762

  1. other rbac stuff (like jwt authorization) can be implemented under clientselectors and https://github.com/zetaab/gateway/commit/65885b551ef53ab1ff5cd24f032bdd9ecab4d762#diff-39171fec9cb0b38b175c0b1d846990f95dd907518d1d7757857aaa7bca5ea528R109

  2. there is perhaps need to define "permissions" as well later. https://github.com/zetaab/gateway/commit/65885b551ef53ab1ff5cd24f032bdd9ecab4d762#diff-39171fec9cb0b38b175c0b1d846990f95dd907518d1d7757857aaa7bca5ea528R129 but its not mandatory (yet) for ip allow/deny apis. So as I see it: this PR could proceed and I could start with implementing ip deny/allow part.

Then cases that I tested with this (ips are modified little bit, not real ips for me):

case 1:

  authorization:
    rules:
    - clientSelector:
        - clientCIDR:
          - 44.94.0.0/16
          - 44.65.0.0/16
      action: Allow

result: both cidrs are allowed and other traffic outside of these are denied.

case 2:

  authorization:
    rules:
    - clientSelector:
        - clientCIDR:
          - 44.94.0.0/16
        - clientCIDR:
          - 44.65.0.0/16
      action: Allow

result: both cidrs are allowed and other traffic outside of these are denied.

case 3:

  authorization:
    rules:
    - clientSelector:
        - clientCIDR:
          - 44.94.0.0/16
      action: Allow
    - clientSelector:
        - clientCIDR:
          - 44.65.0.0/16
      action: Allow

result: none of the CIDRs are working. (difference to case 2 is that instead of having two objects of clientCIDR under same clientselector, this use separated clientselectors. Which is creating two RBAC filters (instead of one that was the case with 1 and 2) in envoy level with the example commit).

case 4:

  authorization:
    rules:
    - clientSelector:
        - clientCIDR:
          - 44.94.0.0/16
          - 44.65.0.0/16
      action: Allow
    - clientSelector:
        - clientCIDR:
          - 44.94.107.109/32
      action: Log
    - clientSelector:
        - clientCIDR:
          - 44.94.107.109/32
      action: Deny

result: both cidrs are allowed and the specific ip is denied. Requests from that single ip is logged.

as we can see multiple allow rules (multiple envoy rbac filters, instead of single) might be interesting (case 3), not sure is there really use-case for that. Maybe there is if we combine other components like user information. In case of IPs it might be not that useful.

zetaab avatar Feb 24 '24 16:02 zetaab

thanks for running the tests !

  • does Case 4 imply "most specific match wins" ? (you can try to reverse order and test to ensure its not last match or first match that wins)
  • also weird Case 3 not working is a little weird, I wouldn't block the API on it, because it should be solvable using other methods

this API is pretty close to the finish line, hoping the output of case 4 can be made into a doc string in the API to add more clarity for the user

arkodg avatar Feb 27 '24 00:02 arkodg

case 5

  authorization:
    rules:
    - clientSelector:
        - clientCIDR:
          - 44.94.107.109/32
      action: Deny

result: single ip address is denied

case 6:

  authorization:
    rules:
    - clientSelector:
        - clientCIDR:
          - 44.94.107.109/32
      action: Deny
    - clientSelector:
        - clientCIDR:
          - 44.94.0.0/16
          - 44.65.0.0/16
      action: Allow

result: single ip address is still denied (but actually that allow rule does not do anything, also other cidrs outside that is allowed)

case 7:

  authorization:
    rules:
    - clientSelector:
        - clientCIDR:
          - 44.94.0.0/16
      action: Deny
    - clientSelector:
        - clientCIDR:
          - 44.94.107.109/32
          - 44.65.0.0/16
      action: Allow

result: 44.65.0.0/16 cidr works, but nothing works from 44.94.0.0/16 cidr. So deny is somehow overriding overlapping allow rules instead of "more specific match".

zetaab avatar Feb 27 '24 06:02 zetaab

@arkodg I would like to move to coding phase in case of this API. This PR now contains more than 40 reviews, kind of saying: how much more is still needed? Could for instance some of the other maintainers give feedback about this PR and then we could somehow merge this soonish?

zetaab avatar Mar 01 '24 13:03 zetaab

@arkodg I would like to move to coding phase in case of this API. This PR now contains more than 40 reviews, kind of saying: how much more is still needed? Could for instance some of the other maintainers give feedback about this PR and then we could somehow merge this soonish?

if we moved the clientSelector to a singular field, and defined first rule wins semantics i.e. first rule has higher precedence over next one, I'd have no other concerns.

This API and implementation, will unfortunately not be able to make it into v1.0.0 since we will be releasing v1.0.0 RC early next week, so we'll only be able to merge this after v1.0.0 is released.

@envoyproxy/gateway-maintainers @envoyproxy/gateway-reviewers please review, and share any feedback

arkodg avatar Mar 02 '24 01:03 arkodg

@arkodg modified the field now, as 1.0.0 is released I could continue working on this?

zetaab avatar Mar 13 '24 19:03 zetaab

thanks for kick starting this PR again @zetaab ! the open question left for me is the match semantics - I find the current definition is the API confusing, which is why I recommended for first match wins with a default Deny. I'll bring this up in the community meeting on Thursday, so others can weigh in as well

arkodg avatar Mar 13 '24 21:03 arkodg

This came up in the community meeting today. I'm in support of first match wins with a default Deny design as it preserves an important level of flexibility as described here: https://github.com/envoyproxy/gateway/pull/2652#discussion_r1506954516

It also does still allow for the inverse where the auth is allow by default by placing a catch all Allow rule as the last rule with Deny rules preceding it. For example:

authorization:
  rules:
  - clientSelectors:
       clientCIDRs:
        - 123.123.0.0/24
     action: Deny
  - clientSelectors:
       clientCIDRs:
        - 0.0.0.0/0
     action: Allow

This would allow us to

  • deny from IPs falling in 123.123.0.0/24
  • allow others (the built-in default-deny would never come in to play since the last rule allows everything)

davidalger avatar Mar 14 '24 17:03 davidalger

hey @zetaab still working on this one ?

arkodg avatar Apr 05 '24 10:04 arkodg

@arkodg well not really, I have tried to get it done but always something is missing. So I have kind of lost energy to do anything. If someone want to continue - feel free.

zetaab avatar Apr 05 '24 12:04 zetaab

sorry you feel this way, thanks for all the work you've put into this APi, imo it's close to the finish line, I'll try and take your work forward in a follow PR

arkodg avatar Apr 05 '24 12:04 arkodg

I'm considering having two separate APIs—one for ACL and one for authorization.

Rationale:

  • Typically, we don't need fine-grained access control(method, header, etc.) based on IP addresses; simply allowing or denying access is sufficient.
  • It seems awkward to combine IP addresses and client identities (JWT tokens, client certs) within the same API.
  • They should be enforced in different phases, as @zetaab pointed out. So having them within the same structure may confuse users.

Apologies for the churn. This takes too long already :-)

zhaohuabing avatar Apr 12 '24 07:04 zhaohuabing

IPs could be useful together with jwts. For instance, if I have /admin path. I want allow it only for admins which are coming from cidr x

zetaab avatar Apr 12 '24 09:04 zetaab

@zhaohuabing I don't think it's awkward. one use case is in our company where we use mTLS & IP restriction or key auth with IP restriction.

This is to archive high-security implementation in case client cert / key auth / JWT is compromised, the attacker needs to access the client infrastructure instead of using their local network.

zufardhiyaulhaq avatar Apr 20 '24 12:04 zufardhiyaulhaq

@arkodg @zhaohuabing is there something that we could now do to get progress to this issue? I would really like to see this authorization feature. IMO it should be started from the ip addresses, then add more features like jwt

zetaab avatar Apr 26 '24 21:04 zetaab

@arkodg @zhaohuabing please review. I have updated the PR according the spec

Examples

allow all subjects to GET method

rules:
- permissions: [GET]
  action: "Allow"

allow 1.0.0.0/8 and 91.1.1.1/8 to all permissions (will deny others)

rules:
- subjects:
  - clientCIDR: 1.0.0.0/8
  - clientCIDR: 91.1.1.1/8
  action: "Allow"

block all requests

rules:
- action: "Deny"

I added clientCIDR only as an example (and I am thinking to start implementing that after api spec is accepted). So we need to add more subject parameters later.

zetaab avatar May 02 '24 15:05 zetaab

@arkodg your comments are not down the line what was talked earlier in comments. So I do not know what to do here. If I do changes that you asked, is @zhaohuabing going to accept those.

Anyway as I see it, the biggest issue with the spec is that we should try implementing cases and after that we can see does it really work and support all the needed use cases. So I would like to get started with something. Should we do this in same PR? Like implement cidr support? It might mean breaking api spec if we merge this and try implementing items in another PR..

zetaab avatar May 06 '24 09:05 zetaab

hey @zetaab completely understand your point, I'll try and prioritize this issue in the community meeting this week to see if we can make more progress on it, and will report back here

arkodg avatar May 07 '24 19:05 arkodg

Should we align the Authorization API design with the Envoy RBAC filter configuration(move Action out of Rule to a higher level in the structure) for ease of implementation?

For example, this EG Authorization config:

  authorization:
    action: ALLOW
    rules: 
      - name: "service-admin"
        subjects:
          - clientCert: "cluster.local/ns/default/sa/admin"
          - clientCert: "cluster.local/ns/default/sa/superuser"
      - name: "product-viewer"
        permissions:
          - "GET"

can be translated to the below Envoy Gateway RBAC filter:

  - name: envoy.filters.http.rbac
    config:
      rules:
        action: ALLOW
        policies:
          "service-admin":
            permissions:
              - any: true
            principals:
              - authenticated:
                  principal_name:
                    exact: "cluster.local/ns/default/sa/admin"
              - authenticated:
                  principal_name:
                    exact: "cluster.local/ns/default/sa/superuser"

          "product-viewer":
            permissions:
                - header:
                    name: ":method"
                    string_match:
                      exact: "GET"
            principals:
              - any: true

We may also need to add another layer on top of the above API, so we can have a more flexible authz logic like "allowing authenticated users access but excluding all users from 10.75.1.0/24"

EG Authorization config:

  authorization:
    - action: ALLOW
      pocies: 
        - name: "service-admin"
          subjects:
            - clientCert: "cluster.local/ns/default/sa/admin"
            - clientCert: "cluster.local/ns/default/sa/superuser"
        - name: "product-viewer"
          permissions:
            - "GET"
    - action: DENY
      policies: 
        - name: "all-users-in-a-specific-cidr"
          subjects:
            - cidr: "10.75.1.0/24"

The above configuration can be implemented with multiple RBAC filters:

  - name: envoy.filters.http.rbac
    config:
      rules:
        action: DENY
        policies:
          "deny-a-specific-cidr":
            permissions:
              - any: true
            principals:
              - remote_ip: "1.2.3.0/24"  
            
   - name: envoy.filters.http.rbac
    config:
      rules:
        action: ALLOW
        policies:
          "service-admin":
            permissions:
              - any: true
            principals:
              - authenticated:
                  principal_name:
                    exact: "cluster.local/ns/default/sa/admin"
              - authenticated:
                  principal_name:
                    exact: "cluster.local/ns/default/sa/superuser"

          "product-viewer":
            permissions:
                - header:
                    name: ":method"
                    string_match:
                      exact: "GET"
            principals:
              - any: true

To conclude, the Authorization API will look like:

type Authorization struct {
  Rules []Rule `json:"rules,omitempty"`
}

type Rule struct {
  Action RuleActionType `json:"action"`

  Policies []Policy `json:"rules,omitempty"`
}

type Policy struct {
  Subjects []Subject `json:"subjects,omitempty"`

  Permissions []string `json:"permissions,omitempty"`
}

Does this make sense? @arkodg @zetaab

zhaohuabing avatar May 07 '24 20:05 zhaohuabing

Should we align the Authorization API design with the Envoy RBAC filter configuration(move Action out of Rule to a higher level in the structure) for ease of implementation?

For example, this EG Authorization config:

  authorization:
    action: ALLOW
    rules: 
      - name: "service-admin"
        subjects:
          - clientCert: "cluster.local/ns/default/sa/admin"
          - clientCert: "cluster.local/ns/default/sa/superuser"
      - name: "product-viewer"
        permissions:
          - "GET"

can be translated to the below Envoy Gateway RBAC filter:

  - name: envoy.filters.http.rbac
    config:
      rules:
        action: ALLOW
        policies:
          "service-admin":
            permissions:
              - any: true
            principals:
              - authenticated:
                  principal_name:
                    exact: "cluster.local/ns/default/sa/admin"
              - authenticated:
                  principal_name:
                    exact: "cluster.local/ns/default/sa/superuser"

          "product-viewer":
            permissions:
                - header:
                    name: ":method"
                    string_match:
                      exact: "GET"
            principals:
              - any: true

We may also need to add another layer on top of the above API, so we can have a more flexible authz logic like "allowing authenticated users access but excluding all users from 10.75.1.0/24"

EG Authorization config:

  authorization:
    - action: ALLOW
      pocies: 
        - name: "service-admin"
          subjects:
            - clientCert: "cluster.local/ns/default/sa/admin"
            - clientCert: "cluster.local/ns/default/sa/superuser"
        - name: "product-viewer"
          permissions:
            - "GET"
    - action: DENY
      policies: 
        - name: "all-users-in-a-specific-cidr"
          subjects:
            - cidr: "10.75.1.0/24"

The above configuration can be implemented with multiple RBAC filters:

  - name: envoy.filters.http.rbac
    config:
      rules:
        action: DENY
        policies:
          "deny-a-specific-cidr":
            permissions:
              - any: true
            principals:
              - remote_ip: "1.2.3.0/24"  
            
   - name: envoy.filters.http.rbac
    config:
      rules:
        action: ALLOW
        policies:
          "service-admin":
            permissions:
              - any: true
            principals:
              - authenticated:
                  principal_name:
                    exact: "cluster.local/ns/default/sa/admin"
              - authenticated:
                  principal_name:
                    exact: "cluster.local/ns/default/sa/superuser"

          "product-viewer":
            permissions:
                - header:
                    name: ":method"
                    string_match:
                      exact: "GET"
            principals:
              - any: true

To conclude, the Authorization API will look like:

type Authorization struct {
  Rules []Rule `json:"rules,omitempty"`
}

type Rule struct {
  Action RuleActionType `json:"action"`

  Policies []Policy `json:"rules,omitempty"`
}

type Policy struct {
  Subjects []Subject `json:"subjects,omitempty"`

  Permissions []string `json:"permissions,omitempty"`
}

Does this make sense? @arkodg @zetaab

prefer if we take a user/usability approach first before measuring feasibility (mapping to envoy) for this API

arkodg avatar May 07 '24 21:05 arkodg

that will look like

authorization:
  rules:
  - action: Allow
    policies:
    - subjects:
      - clientCIDR: 1.1.1.1
      permissions: ["GET"]

I am fine with that. At least it should support many use cases

authorization:
  rules:
  - action: Allow
    subjects:
    - clientCIDR: 1.1.1.1
    permissions: ["GET"]

the current PR model is not that "heavy", but there might be an issue that it does not support all the use cases

zetaab avatar May 08 '24 05:05 zetaab

Following our discussion in this week's EG meeting, I did some rearch on prior art of Gateway implementations. Here are my findings:

Istio Authorization Policy

This is the most relevant one. Istio Authz policy offers fine-grained authorization control based on host, path, and method. It supports source(subject) from cidr, workload identity, and user identity.

apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: httpbin
  namespace: foo
spec:
  action: ALLOW
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/default/sa/sleep"]
    - source:
        namespaces: ["test"]
    to:
    - operation:
        methods: ["GET"]
        paths: ["/info*"]
    - operation:
        methods: ["POST"]
        paths: ["/data"]
    when:
    - key: request.auth.claims[iss]
      values: ["https://accounts.google.com"]

Since Istio Authz handles both the ingress and mesh traffic, from can be a principals, ipBlocks, namespaces, jwt claims. to is a list of operations(permission on what resources), and when is a list of extra condtions of requests needed to be matched.

And there is a important difference between Istio Authz Policy and EG Authz Policy: Istio can have multiple Authz policies on a specific service, but EG only supports one SecurityPolicy per xRoute, so EG Authz API structure needs to look like a list of Istio Authz policies.

APISIX supports authz on an HTTPRoure

APISIX's Authz concept is smilar to what we want to achieve in EG: authz on a sepcif HTTPRoute. It's less flexible than Istio Authz policy though.

apiVersion: apisix.apache.org/v2
kind: ApisixRoute
metadata:
 name: httpserver-route
spec:
 http:
 - name: rule1
   match:
     hosts:
     - httpbin.org
     paths:
       - /*
   backends:
   - serviceName: httpbin
     servicePort: 80
   plugins:
   - name: consumer-restriction
     enable: false
     config:
       allowed_by_methods:
       - user: "default_jack1"
         methods:
         - "POST"
         - "GET"
       - user: "default_jack2"
         methods:
         - "GET"

Ambassador supports IP deny/allow list at the global level, without full-fleged authz policy.

apiVersion: getambassador.io/v2
kind: Module
metadata:
  name: ambassador
spec:
  config:
    ip_deny:
    - peer: 99.99.99.0
    - remote: 10.16.0.0/16

Contour supports IP deny/allow on route level, without full-fleged authz policy.

apiVersion: projectcontour.io/v1
kind: HTTPProxy
metadata:
  name: basic
spec:
  virtualhost:
    fqdn: foo-basic.bar.com
    ipAllowPolicy:
      # traffic is allowed if it came from localhost (i.e. co-located reverse proxy)
      - cidr: 127.0.0.1/32
        source: Peer
  routes:
    - conditions:
      - prefix: /
      services:
        - name: s1
          port: 80
      # route-level ip filters override the virtualhost-level filters
      ipAllowPolicy:
      # traffic is allowed if it came from localhost (i.e. co-located reverse proxy)
      - cidr: 127.0.0.1/32
        source: Peer
      # and the request originated from an IP in this range 
      - cidr: 99.99.0.0/16
        source: Remote

Other Gateway Implementations

Other Gateway implementations normally just delegate the Authz to an external process, such as an OPA.

Nginx: https://github.com/summerwind/opa-nginx-rbac Kong: https://docs.konghq.com/hub/kong-inc/opa Gloo: https://docs.solo.io/gloo-edge/latest/guides/security/auth/extauth/opa

Of all the implementations studied, Istio Authz and APISIX Authz APIs align most closely with EG's requirements.

I've updated this PR to reflect the findings. While the current API is flexible, it may be too complex. Below is a proposal for a less complex variation that I prefer(It's just the draft for the structure, we still need to work on naming):

// Authorization defines the authorization configuration.
// The evaluation is determined by the following rules:
// 1. If a request matches any one of the Deny policy, the request gets rejected with 401
// 2. Then, if a request matches any one of the Allow rules, the request is allowed
// 3. If a request doesn't match any one of the Allow rules, the request is rejected with 401
type Authorization struct {
	Deny  []Policy `json:"deny,omitempty"`
	Allow []Policy `json:"allow,omitempty"`
	Audit   []Policy `json:"log,omitempty"`
}


type Policy struct {
	Subject Subject `json:"subjects,omitempty"`
	
	Permissions []string `json:"permissions,omitempty"`
}

type Subject struct {
	ClientCIDR []string `json:"clientCIDR,omitempty"`
}

@arkodg @zetaab @guydc

zhaohuabing avatar May 08 '24 21:05 zhaohuabing

type Authorization struct {
	Deny  []Policy `json:"deny,omitempty"`
	Allow []Policy `json:"allow,omitempty"`
	Log   []Policy `json:"log,omitempty"`
}

in which order these are going to be applied? it might be issue? or is there really need to have combination of different action types

zetaab avatar May 09 '24 11:05 zetaab

type Authorization struct {
	Deny  []Policy `json:"deny,omitempty"`
	Allow []Policy `json:"allow,omitempty"`
	Log   []Policy `json:"log,omitempty"`
}

in which order these are going to be applied? it might be issue? or is there really need to have combination of different action types

The evaluation is determined by the following rules:

  1. If a request matches any one of the Deny policy, the request gets rejected with 401
  2. Then, if a request matches any one of the Allow rules, the request is allowed
  3. If a request doesn't match any one of the Allow rules, the request is rejected with 401

zhaohuabing avatar May 09 '24 16:05 zhaohuabing

hmm it could work in that case, if its possible to implement like that

zetaab avatar May 09 '24 17:05 zetaab

hmm it could work in that case, if its possible to implement like that

@arkodg mentioned a use case in today's meeting: allowing a small hole in a large CIDR block. This won't work for that because Deny goes first, so we need to use the other option in the PR code.

zhaohuabing avatar May 09 '24 18:05 zhaohuabing