argo-cd
Cluster names cannot be used via RBAC
If you are trying to resolve an environment-specific issue or have a one-off question about an edge case that does not require a feature, please consider asking in the Argo CD Slack channel.
Relevant Slack thread for this issue, with @jannfis
Checklist:
- [x] I've searched in the docs and FAQ for my answer: https://bit.ly/argocd-faq.
- [x] I've included steps to reproduce the bug.
- [x] I've pasted the output of `argocd version`.
Describe the bug
The RBAC policy for clusters still requires you to use a URL, even though names are now the convention.
To Reproduce
1. Create a policy like the one below:
   p, testaccount, clusters, get, in-cluster, allow
2. Check whether testaccount is able to see the cluster - it will not be able to.
3. Change the policy to the URL syntax:
   p, testaccount, clusters, get, https://kubernetes.default.svc, allow
4. Validate that testaccount is now able to see the cluster.
Expected behavior
The ability to use names and URLs interchangeably for cluster references in RBAC.
Screenshots
Would just be redundant.
Version
argocd: v1.7.6+b04c25e
BuildDate: 2020-09-19T00:50:44Z
GitCommit: b04c25eca8f1660359e325acd4be5338719e59a0
GitTreeState: clean
GoVersion: go1.14.1
Compiler: gc
Platform: linux/amd64
argocd-server: v1.7.8+ef5010c
BuildDate: 2020-10-15T22:34:12Z
GitCommit: ef5010c3a0b5e027fd642732d03c5b0391b1e574
GitTreeState: clean
GoVersion: go1.14.1
Compiler: gc
Platform: linux/amd64
Ksonnet Version: v0.13.1
Kustomize Version: {Version:kustomize/v3.6.1 GitCommit:c97fa946d576eb6ed559f17f2ac43b3b5a8d5dbd BuildDate:2020-05-27T20:47:35Z GoOs:linux GoArch:amd64}
Helm Version: version.BuildInfo{Version:"v3.2.0", GitCommit:"e11b7ce3b12db2941e90399e874513fbd24bcb71", GitTreeState:"clean", GoVersion:"go1.13.10"}
Kubectl Version: v1.17.8
I am taking a look at this; I'm pretty familiar with cluster names and URLs.
I don't know what kind of rules a user will define for cluster access: they could define all rules with cluster URLs, all with cluster names, or a mix. So my initial idea was to enforce against both the URL and the name, and if either one gives a deny, the action is not allowed (no matter what the other result is). But there are scenarios where that doesn't work, like this one:
- we have a cluster named minikube with URL http://192.168.0.1
- the only rule for user alice is:
p, alice, clusters, get, minikube, allow
and this rule should allow the user to see the minikube cluster. When I enforce on the cluster URL I get a deny, and when I enforce on the cluster name I get an allow, so with my logic I would deny this, which contradicts the rule's intention.
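The contradiction can be sketched with a tiny stand-in for the enforcer. This is not Argo CD's actual code: `rule` and `enforce` are simplified, hypothetical helpers that only match an exact object or the `*` wildcard, whereas the real implementation goes through Casbin.

```go
package main

import "fmt"

// rule is a simplified, hypothetical policy row: an object pattern
// plus its effect (allow or deny).
type rule struct {
	obj   string
	allow bool
}

// enforce returns the effect of the first rule whose pattern matches obj
// (exact match or "*"); no match means the default deny.
func enforce(rules []rule, obj string) bool {
	for _, r := range rules {
		if r.obj == obj || r.obj == "*" {
			return r.allow
		}
	}
	return false
}

func main() {
	// alice's only rule: p, alice, clusters, get, minikube, allow
	rules := []rule{{"minikube", true}}

	byName := enforce(rules, "minikube")          // allow
	byURL := enforce(rules, "http://192.168.0.1") // deny (no matching rule)

	// "deny wins": require both checks to pass - this locks alice out
	// even though she was explicitly granted access by name.
	fmt.Println("deny-wins:", byName && byURL)
	// "allow wins": either check passing is enough - works for this
	// particular rule set.
	fmt.Println("allow-wins:", byName || byURL)
}
```

Under a strict "deny wins" combination the user is denied despite an explicit allow, while "allow wins" happens to produce the intended result for this rule set.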
So I could go with the idea that if I find an allow, then I allow the action. But then I can have policies like the ones below, which will not work correctly:
p, alice, clusters, get, *, allow
p, alice, clusters, get, minikube, deny
On the above, alice should not see the minikube cluster, but she will be able to because I found an allow rule.
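The failure mode with wildcard policies can be sketched the same way. Again, this is a simplified, hypothetical evaluator and not the real Casbin-backed one; it only illustrates why "first allow wins" ignores an explicit deny:

```go
package main

import "fmt"

// rule is a simplified, hypothetical policy row: an object pattern
// plus its effect (allow or deny).
type rule struct {
	obj   string
	allow bool
}

// enforceAllowWins permits the action if ANY matching rule allows it.
// Deny rows are present in the policy but never consulted.
func enforceAllowWins(rules []rule, obj string) bool {
	for _, r := range rules {
		if (r.obj == obj || r.obj == "*") && r.allow {
			return true
		}
	}
	return false
}

func main() {
	// p, alice, clusters, get, *, allow
	// p, alice, clusters, get, minikube, deny
	rules := []rule{{"*", true}, {"minikube", false}}

	// The wildcard allow matches, so the explicit deny for minikube is
	// never honored: alice can still see the minikube cluster.
	fmt.Println(enforceAllowWins(rules, "minikube"))
}
```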
I guess more specific rules should somehow carry more weight. I see Casbin has an API to check for specific rules (https://casbin.org/docs/en/management-api#hasnamedpolicy), but I am not sure this is the right direction; it seems it would get complex to build something that extends enforcement.
Any other ideas?
We actually build all of our OIDC group scopes so that we allow clusters by group. When the regex implementation is decided on as a replacement for glob, that should address your issue inherently. Until then we are just being verbosely specific. It will be easier with just names too.
This is related to the action we are trying to validate, not to the policies. The get cluster action can be validated against the cluster URL or against the cluster name, and the two can have different results. The question is which one we choose as correct, and I can't find a good balance so far. I don't see how introducing regex is going to help; that would change the policies, not the action.
> So I could go with the idea that if I find an allow, then I will allow the action. But then I can have policies like below which will not work ok:
> p, alice, clusters, get, *, allow
> p, alice, clusters, get, minikube, deny
> On the above, alice should not see the minikube cluster, but she will be able to because I found an allow rule.
I think this is due to the Casbin model we use. Currently we are using `allow-and-deny`, but for a policy such as the above, we should migrate to a `deny-override` model.
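For reference, the policy-effect expressions differ only in their matcher. The snippets below use standard Casbin model syntax (a real model file contains a single [policy_effect] section; the three variants are shown side by side only for comparison):

```
# allow-override: any matching allow row permits the action,
# deny rows are never consulted
[policy_effect]
e = some(where (p.eft == allow))

# allow-and-deny: needs at least one matching allow and no matching deny
[policy_effect]
e = some(where (p.eft == allow)) && !some(where (p.eft == deny))

# deny-override: everything is allowed unless a deny row matches
[policy_effect]
e = !some(where (p.eft == deny))
```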
@jannfis I still don't think it is related to the policies. For example, the line below checks the get action for clusters against q.Server (which is the cluster URL). I wanted to also check against q.Name, but then I would have two results that might contradict each other, and I can't find a good logic to combine them: https://github.com/argoproj/argo-cd/blob/master/server/cluster/cluster.go#L107
I think it would be a bad policy to deny `https://kubernetes.default.svc` but allow `in-cluster` (or vice versa).
So can we safely assume the following?
errSrv := s.enf.EnforceErr(ctx.Value("claims"), rbacpolicy.ResourceClusters, rbacpolicy.ActionGet, q.Server)
errNm := s.enf.EnforceErr(ctx.Value("claims"), rbacpolicy.ResourceClusters, rbacpolicy.ActionGet, q.Name)
if errSrv != nil && errNm != nil {
return nil, errSrv
}
// we had at least one match, either on server or name
...
In such a case, if you want to deny only the production cluster and allow access to all the others, what would the policy be?
The policy below gives an allow on the URL and a deny on the name, so the final result would be allow, but I think the user wanted no access to production (no matter whether the URL or the name was used).
p, alice, clusters, get, *, allow
p, alice, clusters, get, production, deny
So I think we will end up using both the URL and the name to deny.
`deny` rules won't override an existing `allow` rule with our current RBAC model.
This is what I was trying to say in https://github.com/argoproj/argo-cd/issues/4712#issuecomment-724176472
I made a unit test specific to this case, using the current RBAC logic.
The policy is:
p, mike, clusters, get, *, allow
p, mike, clusters, get, https://192.168.0.1, deny
And I will query for the cluster:
clusterapi.ClusterQuery{
Name: "minikube",
Server: "https://192.168.0.1",
}
The enforcement will fail. You can see the test here: https://github.com/lcostea/argo-cd/blob/lcostea/rbac_cluster_names/server/cluster/cluster_test.go#L149
So the `require.NoError(t, err)` does fail?
That is weird. I just tested the following policy with two apps in the cluster, `guestbook` and `guestbook2` (`policy.default` is set to empty):
p, role:user, applications, get, *, deny
p, role:user, applications, get, guestbook, allow
g, test, role:user
Logged in as local user `test`:
$ ./dist/argocd app list
NAME CLUSTER NAMESPACE PROJECT STATUS HEALTH SYNCPOLICY CONDITIONS REPO PATH TARGET
$ ./dist/argocd app get guestbook
FATA[0000] rpc error: code = PermissionDenied desc = permission denied: applications, get, default/guestbook, sub: test, iat: 2020-11-09T20:22:29+01:00
Likewise, when I change the policy to
p, role:user, applications, get, *, allow
p, role:user, applications, get, guestbook, deny
g, test, role:user
the outcome is:
./dist/argocd app list
NAME CLUSTER NAMESPACE PROJECT STATUS HEALTH SYNCPOLICY CONDITIONS REPO PATH TARGET
guestbook https://kubernetes.default.svc guestbook default OutOfSync Missing <none> <none> https://github.com/argoproj/argocd-example-apps guestbook HEAD
guestbook2 https://kubernetes.default.svc guestbook2 gpg OutOfSync Missing <none> <none> https://github.com/argoproj/argocd-example-apps guestbook HEAD
$ ./dist/argocd app get guestbook
Name: guestbook
Project: default
Server: https://kubernetes.default.svc
....
Yes, it fails. But if you say the above scenario with applications is correct, I will check the code again; maybe I am missing some initialisations in RBAC. Thanks for the help.
Sorry, my fault. I used the wrong format for the policy. Please ignore everything I wrote about allow/deny.
Just ran into this since I couldn't use cluster names in declaratively defined AppProjects (though I can on Applications).
Almost the same here - I want to allow OIDC groups to be able to see just the clusters they can use, but I don't have the URL at this point (and with AWS load balancers, the cluster name is not part of the URL, so regex can't help). Referencing clusters by name in RBAC policies would really come in handy :) Any news on this?
Can confirm this is present on `v2.7.6`. It only applies when restricting access to `clusters` in the RBAC policy. It does not apply to `projects` and `applications`.
E.g.:
p, role:my-team, clusters, get, my-tenant-cluster, allow ❌
p, role:my-team, projects, get, my-project, allow ✅
p, role:my-team, applications, get, my-project/*, allow ✅
Can also confirm that using cluster names as the destination in `Application` and `AppProject` manifests does work.
It happens on `v2.8.2+dbdfc71` too; we had to pass the server URL instead of the name to restrict cluster view!
> Just ran into this since I couldn't use cluster names in declaratively defined AppProjects (though I can on Applications).

> Almost the same here - I want to allow OIDC groups to be able to see just the clusters they can use, but I don't have the URL at this point (and with AWS load balancers, the cluster name is not part of the URL, so regex can't help). Referencing clusters by name in RBAC policies would really come in handy :) Any news on this?
Also an issue with AWS EKS, as EKS provides you with the URL for the k8s API.
From an intuition standpoint, I wasn't expecting the URL to be what it is evaluated against and was slightly confused at first, though trying the URL was my next instinct anyway.
This URL is randomly generated, so it's not really possible to know what it might be in advance, nor is it really possible to regex/wildcard off something generated.
Simply wrapping a CNAME around that generated URL won't work either, as the CN won't match and certificate verification fails, though I'm wondering if that's overridable somehow. I only noticed this might be possible looking at https://github.com/oboukili/terraform-provider-argocd/blob/master/docs/resources/cluster.md#nested-schema-for-configtls_client_config
If that doesn't work then I'd have to run a proxy or something. And this is just my case, which is perhaps a subset of others who wish to evaluate by cluster name for whatever other reasons.
Edit: I was able to make this work; I edited the cluster secret to change the URL to my CNAME and added `serverName` to the `tlsClientConfig`. That helps my situation a bit but doesn't solve the original issue :)
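For anyone attempting the same workaround, a declarative cluster secret along these lines should capture it. The field layout follows Argo CD's documented cluster secret format, but every name and value below is a placeholder:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: my-cluster-secret        # placeholder
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: my-cluster               # cluster name shown in the UI/CLI
  # CNAME you control, pointing at the generated EKS endpoint
  server: https://k8s.my-cluster.example.com
  config: |
    {
      "bearerToken": "<auth-token>",
      "tlsClientConfig": {
        "insecure": false,
        "serverName": "<original-generated-eks-hostname>",
        "caData": "<base64-encoded-ca-certificate>"
      }
    }
```

Setting `serverName` makes TLS verification check the certificate against the original endpoint hostname rather than the CNAME used to connect.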
Can we have this check against the name instead, or some method of checking both? Of course I understand that just switching to names would have backwards-compatibility issues, and that doing an OR on either could surprise users unaware of the change and result in a looser role than intended. But in a similar vein, I was able to add `preferred_username` under `data.scopes` in the policy ConfigMap to let me authorize against usernames when using SSO with Dex+GitLab. Perhaps an option added there to choose which to use for clusters might work? Something like `data.useClusterName: [boolean]`?
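To make the analogy concrete: the `scopes` override mentioned above is a real key in the `argocd-rbac-cm` ConfigMap, while `useClusterName` is only the hypothetical option proposed in this comment and does not exist in Argo CD:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-rbac-cm
  namespace: argocd
data:
  # existing option: which token claims RBAC subjects are matched against
  scopes: '[preferred_username, groups]'
  # hypothetical option sketched in this comment - NOT implemented
  # useClusterName: "true"
```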