Allow multiple clusters pointing to the same URL
See original discussion highlighting the issue below
Summary
It is currently impossible to define multiple clusters pointing to the same URL, even if they have different names.
Motivation
As in the discussion below, I was trying to set up multiple clusters pointing to the same URL but managing different namespaces. This did not work because ArgoCD internally refers to clusters by their URL. As a result, ArgoCD took one of the clusters, put it into its cache, and used it for subsequent operations even when the resources were on clusters with different names.
Example: two clusters are configured:
- dev, configured to manage namespace dev with URL http://kubernetes.default.svc
- staging, configured to manage namespace staging with URL http://kubernetes.default.svc

And two applications, one targeting dev:
destination:
  namespace: dev
  name: dev
And one targeting staging:
destination:
  namespace: staging
  name: staging
When the first reconciliation of dev happens, ArgoCD chooses one cluster (seemingly at random) and puts it into the cluster cache. If it chooses staging, the operation will fail with
Failed to load live state: Namespace "dev" for AppProject "x" is not managed
since the staging cluster does not manage the dev namespace.
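For reference, here is a sketch of how these two clusters could be declared using the standard argocd.argoproj.io/secret-type: cluster Secret format (the secret names are illustrative; the namespaces field is what scopes each cluster to its namespace):

apiVersion: v1
kind: Secret
metadata:
  name: cluster-dev
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: dev
  server: https://kubernetes.default.svc   # same URL as the staging cluster below
  namespaces: dev                          # restrict this cluster to the dev namespace
---
apiVersion: v1
kind: Secret
metadata:
  name: cluster-staging
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: staging
  server: https://kubernetes.default.svc   # identical server, so ArgoCD conflates the two
  namespaces: staging

Because both secrets share the same server value, ArgoCD's URL-keyed cluster cache treats them as one cluster, which is exactly the failure described above.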
Proposal
Instead of using the server URL to refer to a cluster, ArgoCD should use the pair of name and server URL. This would make it possible to differentiate between clusters that use the same server URL.
Discussed in https://github.com/argoproj/argo-cd/discussions/9388
Originally posted by mFranz82 on May 12, 2022:
I am working in a Rancher environment where a DEV team belongs to a Rancher project with corresponding rights within the cluster. As ArgoCD does not provide the option to specify service accounts at the project or application level, we thought we could wrap a cluster around each project, providing a cluster-scoped service account. Something like a virtual cluster per team pointing to the same k8s API:
dev-cluster-team-a (api url) > project > application > Sync actions using SA argcd-manager-team-a
dev-cluster-team-b (api url) > project > application > Sync actions using SA argcd-manager-team-b
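Declaratively, that plan amounts to two cluster secrets pointing at the same API URL (a sketch; the secret names, server URL, and tokens are placeholders):

apiVersion: v1
kind: Secret
metadata:
  name: dev-cluster-team-a
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: dev-cluster-team-a
  server: https://dev.example.com:6443   # shared API URL (placeholder)
  config: |
    {"bearerToken": "<token of SA argcd-manager-team-a>"}
---
apiVersion: v1
kind: Secret
metadata:
  name: dev-cluster-team-b
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: dev-cluster-team-b
  server: https://dev.example.com:6443   # same URL again, which is what breaks
  config: |
    {"bearerToken": "<token of SA argcd-manager-team-b>"}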
When starting with the implementation we quickly realised that the API URL is used by ArgoCD to identify the cluster, which of course won't work in our setup. We simply cannot create an ArgoCD cluster pointing to the same API twice.
Do you think this is intentional? Are there any design considerations that I missed?
Update:
We found a simple solution:
Simply create a Service (ExternalName) per cluster pointing to the same API.
#14255 should cover your use case (i.e. impersonation will be the path that Argo will use in the near future).
I think this is a duplicate of #10897.
Correct me if I'm wrong, but it looks like #14255 would cover the service account use case; it still does not differentiate between clusters with the same URL, and we have use cases for that as well. We would like to be able to have multiple "virtual" clusters with different properties pointing to the same real cluster, as mentioned by @agaudreault-jive in the linked discussion.
I think the main issue is that the current setup is somewhat unintuitive: it is a fair assumption that if clusters have a name set, they should be differentiated by their name and not by their URL. ArgoCD also allows this configuration without any warnings, and it breaks in unexpected ways. As described in the issue, operations will be performed on random clusters, and the cluster settings will be completely broken since they also use only the URL.
While the ideal solution IMO would be to support using the name, if that is not possible I think ArgoCD should at least prevent using the same URL to avoid unexpected behaviours.
It is the same subject as https://github.com/argoproj/argo-cd/pull/10897, but I couldn't find an issue highlighting this problem, so I figured the discussion would be easier to track as an issue rather than under a PR with a specific implementation.
I was reading https://github.com/argoproj/argo-cd/pull/14255 and while impersonation would be a great feature, I also think it is slightly unrelated. I see this issue as a change to how the control plane credential is used to maintain a cache, while impersonation is about how an app is synced.
I don't think the server URL can be used as a primary key anymore now that we can provide namespaces and clusterResources on a cluster. I think name+server should be used instead, but this change probably extends to the gitops engine.
If impersonation is available, this issue becomes somewhat unnecessary for permissions. However, it must include validation to restrict multiple clusters with the same URL, and documentation can be written for the workaround of creating a different CNAME or a k8s Service for the local cluster.
A few options I am thinking of:
Option 1: Add validation without impersonation
- Fixes the current unpredictable behavior
- No way to have separate permissions

Option 2: Add name+server support
- Transparent to users
- Fixes the current unpredictable behavior
- Works with or without impersonation
- More work than simple validation

Option 3: Add validation with impersonation
- Fixes the current unpredictable behavior
- Users can have separate permissions with impersonation
I would like to have this implemented. I think it would be useful to provide a way to help shard a large number of apps and to be able to use multiple ArgoCD controllers for the same cluster.
I think I'd prefer if I could still impersonate at the cluster config level. A config resembling kubeconfig would go a long way in matching user expectations. That means referring to the same cluster using multiple different contexts, defining a namespace per context, and defining a user per context (e.g. exec, impersonate, whatever kubeconfig supports).
IMHO reinventing a config format that the kubernetes client already provides with kubeconfig is doing existing k8s users a disservice, causing unnecessary friction between the various tools.
Please start standardizing on the kubeconfig format for cluster connectivity.
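For illustration, a kubeconfig along these lines already expresses "same cluster, different context" (a sketch; the cluster name, users, and credential values are invented for the example):

apiVersion: v1
kind: Config
clusters:
  - name: shared-cluster
    cluster:
      server: https://kubernetes.default.svc
      certificate-authority-data: <base64 CA>   # placeholder
contexts:
  - name: team-a
    context:
      cluster: shared-cluster   # both contexts point at the same cluster entry
      namespace: team-a
      user: team-a-sa
  - name: team-b
    context:
      cluster: shared-cluster
      namespace: team-b
      user: team-b-impersonated
current-context: team-a
users:
  - name: team-a-sa
    user:
      token: <team-a token>     # placeholder credential
  - name: team-b-impersonated
    user:
      token: <base credential>  # placeholder; impersonation still needs a base identity
      # kubeconfig's impersonation field
      act-as: system:serviceaccount:team-b:argocd-manager-team-b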
How about creating several CNAME records for the API Server and using different hostnames for the Applications in different namespaces? We plan to use this as a workaround to shard the applications on one cluster.
@alexymantha Can you share the ExternalName service you created? We've done this and still couldn't get it working.
I tried messing around with ExternalName and used this:
---
apiVersion: v1
kind: Service
metadata:
  name: external-dns-in-cluster
  namespace: argocd
spec:
  type: ExternalName
  externalName: kubernetes.default.svc.cluster.local
  ports:
    - port: 443
---
apiVersion: v1
kind: Secret
metadata:
  name: cluster-external-dns
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: external-dns-in-cluster
  server: https://external-dns-in-cluster.argocd.svc
But the app fails certificate validation because the Subject Alternative Name (SAN) doesn't contain my DNS CNAME alias. So a DNS CNAME is out of the question unless you can control the cert SANs.
My error message (I'm running on AWS EKS):
Get "https://external-dns-in-cluster.argocd.svc/version?timeout=32s": tls: failed to verify certificate: x509: certificate is valid for <redacted>.us-west-2.eks.amazonaws.com, ip-<redacted>.us-west-2.compute.internal, kubernetes, kubernetes.default, kubernetes.default.svc, kubernetes.default.svc.cluster.local, not external-dns-in-cluster.argocd.svc
I doubt the solution is a simple ExternalName (or any other DNS CNAME), as the above SAN issue would still be present.
BTW a different workaround mentioned in #2288 involving a URL query parameter worked for me:
https://kubernetes.default.svc?__scope=external-dns-in-cluster
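Based on that workaround, the cluster secret from the earlier attempt would only need its server field changed (a sketch; the __scope value is an arbitrary discriminator that makes the server string unique while still resolving to the real API endpoint):

apiVersion: v1
kind: Secret
metadata:
  name: cluster-external-dns
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: external-dns-in-cluster
  # the query parameter distinguishes this "virtual" cluster without changing
  # the host, so the API server's certificate SANs still match
  server: https://kubernetes.default.svc?__scope=external-dns-in-cluster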
Hi @reegnz, did you also encounter the "Failed to load live state: Namespace "y" for AppProject "x" is not managed" error? I'm doing the same thing as you with EKS, using a URL query parameter, but I'm getting that error and going crazy trying to figure out what I'm missing, as everything else looks fine 😰
Nope, I gave up once I ran into the SAN issue, as there's no way to work around that one.