argo-cd
trouble using --aws-role-arn option when adding EKS cluster with argocd CLI
I am trying to use the --aws-role-arn option when adding an EKS cluster to ArgoCD, as described in https://github.com/argoproj/argo-cd/issues/1304. I have not been able to get it to work, the error messages are difficult to interpret, and I am not sure how to debug.
- I have ArgoCD running in one AWS account and my EKS cluster is in another AWS account.
- I have set up the acme-production-deploy-role so that it can be assumed both by the AWS role that I am using to run argocd cluster add ... and by the EC2 instances in my ArgoCD cluster (I am confused about which IAM identity is used to assume the role, so I tried to allow both to work).
- Here is what I see when I try to add the cluster (I have redacted the AWS account numbers and the EKS ID, but confirmed that I used the correct values for these):
$ argocd cluster add acme-production --aws-cluster-name arn:aws:eks:us-west-2:<account-number>:cluster/acme-production --aws-role-arn arn:aws:iam::<account-number>:role/acme-production-deploy-role
FATA[0000] rpc error: code = Unknown desc = REST config invalid: Get https://<eks-cluster-id>.yl4.us-west-2.eks.amazonaws.com/version?timeout=32s: getting credentials: exec: exit status 1
Note that I am able to successfully add the cluster using argocd cluster add https://<eks-cluster-id>.yl4.us-west-2.eks.amazonaws.com
Thanks
I'm not sure of the minimal permissions necessary, but I was able to get argocd to add an external cluster (cross account) by doing the following:
In account B (external cluster), created a role argocd-test with AdministratorAccess policy attached, and created a trust relationship between it and account A (running argocd in an eks cluster).
In the external cluster, run 'kubectl edit -n kube-system configmap/aws-auth' to edit the IAM role to RBAC mappings; under mapRoles I added:
- rolearn: arn:aws:iam::accountB_number:role/argocd-test
  username: arn:aws:iam::accountB_number:role/argocd-test
  groups:
    - system:bootstrappers
    - system:nodes
Note that the username has to be the same as the rolearn; if not, when trying to add the external cluster to argocd, you will get a 'the server has asked for the client to provide credentials' error.
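The rolearn/username constraint above can be sanity-checked mechanically before applying the ConfigMap. A minimal sketch (the helper name and messages are mine, not part of any tool):

```python
def check_aws_auth_entry(entry):
    """Return a list of problems with an aws-auth mapRoles entry (empty = OK)."""
    problems = []
    if entry.get("rolearn") != entry.get("username"):
        # Mismatched values tend to surface as the opaque error
        # "the server has asked for the client to provide credentials".
        problems.append("username should equal rolearn")
    if not entry.get("groups"):
        problems.append("at least one RBAC group is required")
    return problems
```

Running it over each parsed mapRoles entry before `kubectl apply` catches the mismatch early instead of at cluster-add time.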
In account A, which is running argocd, I attached a policy allowing assumption of the argocd-test role in account B to the IAM role on the EC2 nodes running argocd. (It may be possible to map the role to a service account used by the argocd pods; I didn't test whether argocd is using a new enough aws-sdk for that support yet.)
{
  "Version": "2012-10-17",
  "Statement": {
    "Effect": "Allow",
    "Action": "sts:AssumeRole",
    "Resource": "arn:aws:iam::accountB_number:role/argocd-test"
  }
}
In my kubeconfig, the external cluster in account B is registered as 'arn:aws:eks:us-east-1:accountB_number:cluster/kube_remote'. With argocd logged in to the cluster in account A, the following command to add the external cluster in the other account was accepted:
argocd cluster add arn:aws:eks:us-east-1:accountB_number:cluster/kube_remote --aws-role-arn arn:aws:iam::accountB_number:role/argocd-test --aws-cluster-name kube_remote
Note that both --aws-role-arn and --aws-cluster-name are needed to make argocd use the aws-iam-authenticator to assume the role in account B. Also, --aws-cluster-name has to be just the name of the external cluster - not the full arn.
I am using Kube 1.14 in EKS, and argocd 1.2.2.
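To make the note above concrete: my understanding is that when both flags are set, the generated cluster config makes Argo CD invoke aws-iam-authenticator as `aws-iam-authenticator token -i <cluster-name> -r <role-arn>`. A sketch of that mapping (the helper is mine, not Argo CD code, so treat the exact invocation as an assumption):

```python
def iam_authenticator_args(cluster_name, role_arn=None):
    """Build the aws-iam-authenticator command line that awsAuthConfig implies."""
    args = ["aws-iam-authenticator", "token", "-i", cluster_name]
    if role_arn:
        # --aws-role-arn maps to the -r flag; without it the caller's own
        # identity (e.g. the node instance profile) is used directly.
        args += ["-r", role_arn]
    return args
```

This is why --aws-cluster-name must be the bare cluster name: it is passed to `-i`, which expects the EKS cluster name, not the ARN.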
I have the same issue, but I'm using terraform to create the secret just after the cluster has been created, and I can't get it working.
I followed this structure: https://github.com/argoproj/argo-cd/blob/master/docs/operator-manual/declarative-setup.md#clusters
resource "kubernetes_secret" "dev-green-argocd" {
  # this secret gets deployed on CI cluster
  # to allow ArgoCD to access the new cluster
  provider = kubernetes.ci

  metadata {
    labels = {
      "argocd.argoproj.io/secret-type" = module.cluster_green.eks_cluster.name
    }
    name      = module.cluster_green.eks_cluster.name
    namespace = "argocd"
  }

  data = {
    server = module.cluster_green.eks_cluster.endpoint
    name   = module.cluster_green.eks_cluster.name
    config = <<CONFIG
{
  "awsAuthConfig": {
    "clusterName": "${module.cluster_green.eks_cluster.name}",
    "roleARN": "${module.base.admin_role.arn}"
  },
  "tlsClientConfig": {
    "insecure": false,
    "caData": "${module.cluster_green.eks_cluster.certificate_authority.0.data}"
  }
}
CONFIG
  }
}
but I can't find any logs related to detecting this new secret, cluster name, or cluster endpoint on the repo-server, server, or application-controller.
I was also trying to work out how ArgoCD uses aws-iam-authenticator and which component sends the requests to the other AWS clusters/accounts. I've inferred from some tickets and conversations that the server component is the one in charge of sending the requests, so this component should have some sort of AWS SDK setup somewhere:
- aws/config files
- environment variables
- IAM role attached to the pods
But this is not documented and I'm struggling to automate this process. I'd happily document the required IAM configuration (creating a PR) if I could understand how all this works.
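One thing worth checking before digging into the AWS side: Argo CD only watches Secrets whose argocd.argoproj.io/secret-type label has the literal value cluster, while the Terraform above sets that label to the cluster name instead. A minimal check (the helper name is mine):

```python
def is_cluster_secret(secret):
    """True if Argo CD will treat this Secret as a cluster definition.

    Argo CD discovers clusters by the label
    argocd.argoproj.io/secret-type: cluster - any other value is ignored,
    which would explain seeing no detection logs at all.
    """
    labels = secret.get("metadata", {}).get("labels", {})
    return labels.get("argocd.argoproj.io/secret-type") == "cluster"
```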
when I try to add the cluster on the CLI, I am seeing the following in the argocd-server logs:
argocd-server-54cd8c56d9-lw7ff argocd-server 2020-06-12T01:43:54.914385589Z An error occurred (AccessDenied) when calling the AssumeRole operation: User: arn:aws:sts::1111:assumed-role/argocd-prod-workers-role/i-2222 is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::1111:role/aws20200611003524576400000005
I need to know how to set up a role that can be assumed by a role!
Ok, so if you are doing this in the same account, here is what I did. I created a role to put in my target cluster's aws-auth ConfigMap, like so:
- "groups":
    - "system:bootstrappers"
    - "system:nodes"
  "rolearn": "arn:aws:iam::my account number:role/ArgoCDTest"
  "username": "arn:aws:iam::my account number:role/ArgoCDTest"
This role trusts the root of the account, e.g.:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::my account number:root"
      },
      "Action": "sts:AssumeRole",
      "Condition": {}
    }
  ]
}
The worker-node role of the argocd cluster needs to have sts:AssumeRole permissions on this new role.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": "sts:*",
      "Resource": "arn:aws:iam::my account number:role/ArgoCDTest"
    }
  ]
}
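The two policy documents above pair up as follows: the trust policy lives on the new role, and the permission policy lives on the worker-node role. A sketch that generates both (helper names and the placeholder account ID are mine; this version scopes the action to sts:AssumeRole rather than sts:* as in the snippet above, which is the least privilege actually needed):

```python
import json

def trust_policy(account_id):
    """Trust policy on the new role: any IAM principal in the account may assume it."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
            "Action": "sts:AssumeRole",
            "Condition": {},
        }],
    }

def node_role_policy(account_id, role_name):
    """Permission policy on the worker-node role: allow assuming the new role."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Resource": f"arn:aws:iam::{account_id}:role/{role_name}",
        }],
    }

# Serialize for `aws iam create-role` / `put-role-policy`:
print(json.dumps(trust_policy("123456789012"), indent=2))
```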
This has now gotten me past the AWS error above, but now I have a new error:
FATA[0006] rpc error: code = Unknown desc = REST config invalid: the server has asked for the client to provide credentials
Ok, it looks like this error was a 'user issue', i.e. my fault. ;) I was trying to add the cluster like so:
argocd cluster add arn:aws:eks:ap-southeast-2:my account number:cluster/my-cluster-name --aws-role-arn "arn:aws:iam::my account number:role/ArgoCDTest" --aws-cluster-name kube_remote
The --aws-cluster-name was incorrect; it should be 'my-cluster-name', not 'kube_remote'. Once I discovered this:
$ argocd cluster add arn:aws:eks:ap-southeast-2:my account number:cluster/my-cluster-name --aws-cluster-name my-cluster-name --aws-role-arn "arn:aws:iam::my account number:role/ArgoCDTest"
Cluster 'https://long and anonymous string.yl4.ap-southeast-2.eks.amazonaws.com' added
and in the argocd logs:
argocd-server-54cd8c56d9-lw7ff argocd-server 2020-06-12T02:45:02.465689783Z time="2020-06-12T02:45:02Z" level=info msg="Alloc=10468 TotalAlloc=5311811 Sys=72272 NumGC=1253 Goroutines=153"
argocd-server-54cd8c56d9-79tjn argocd-server 2020-06-12T02:45:29.49700286Z time="2020-06-12T02:45:29Z" level=info msg="Starting configmap/secret informers"
argocd-server-54cd8c56d9-79tjn argocd-server 2020-06-12T02:45:29.497064166Z time="2020-06-12T02:45:29Z" level=info msg="configmap informer cancelled"
argocd-server-54cd8c56d9-79tjn argocd-server 2020-06-12T02:45:29.497074411Z time="2020-06-12T02:45:29Z" level=info msg="secrets informer cancelled"
argocd-server-54cd8c56d9-lw7ff argocd-server 2020-06-12T02:45:29.498149611Z time="2020-06-12T02:45:29Z" level=info msg="Notifying 1 settings subscribers: [0xc0008449c0]"
argocd-server-54cd8c56d9-79tjn argocd-server 2020-06-12T02:45:29.597015242Z time="2020-06-12T02:45:29Z" level=info msg="Configmap/secret informer synced"
argocd-server-54cd8c56d9-79tjn argocd-server 2020-06-12T02:45:29.597607321Z time="2020-06-12T02:45:29Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=Create grpc.service=cluster.ClusterService grpc.start_time="2020-06-12T02:45:26Z" grpc.time_ms=3573.327 span.kind=server system=grpc
But in the UI it still shows as failed, and I can see no other sign of why it failed.
Ok, I have it working now... I updated my 1.6 install to 'latest'. It works in master/latest.
Another thing of note: I tried not using the IAM role of the nodes but instead the OIDC features in EKS:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::1111:oidc-provider/oidc.eks.ap-south-1.amazonaws.com/id/1111"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.ap-south-1.amazonaws.com/id/1111:sub": "system:serviceaccount:argocd:argocd-server"
        }
      }
    }
  ]
}
I still received the same error:
An error occurred (AccessDenied) when calling the AssumeRole operation: User: arn:aws:sts::1111:assumed-role/operational-1111/i-090dcd405f3f70755 is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::1111:role/argocd-manager-k8s00-dev-aws
Maybe this can be supported?
@jurgenweber any luck in getting this to work with a ServiceAccount?
I am struggling with the same thing: EKS with ArgoCD in one cluster, wanting to add an external EKS cluster. I have set up a role in IAM with an attached trust policy that grants trust via OIDC to the two service accounts used by ArgoCD (server and app-controller). These service accounts have been annotated with the IAM role. In the external cluster the role has been added to the aws-auth ConfigMap. The pods are already using the role that has rights in the external cluster, so there is no need to assume a role. I am using declarative secrets for adding the external clusters, and not specifying the ARN for the role (as it does not need to assume one). The clusters show up in the UI, but as failed. No messages in the log on the argo-cd-server pod. The IAM Roles for Service Accounts setup I am using seems to be successful, in the sense that the pod has env variables for AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE, and the token file is readable and has a token (I did have to set fsGroup to 999 globally).
From the argo-server pod I am able to use aws eks get-token (the command used by ArgoCD) to retrieve a token from the external cluster.
Here's a rough recipe for what I've managed to get working with IAM roles scoped to a Kubernetes ServiceAccount and a single AWS account:
- use the latest tag for the images. I couldn't get this to work with v1.6.
- you'll need to set the fsGroup for the securityContext to 999.
- set up the IAM OpenID Connect provider for the EKS cluster running Argo CD by following https://docs.aws.amazon.com/eks/latest/userguide/enable-iam-roles-for-service-accounts.html
- create one IAM policy and one role following https://docs.aws.amazon.com/eks/latest/userguide/create-service-account-iam-policy-and-role.html
  - there need to be 2 trust relationships for the one role: one for the argocd-server ServiceAccount and one for the argocd-application-controller ServiceAccount
  - the policy should allow AssumeRole and AssumeRoleWithWebIdentity for the STS service, with the resources for the policy limited to the ARN of the IAM role
- you'll need to add an annotation to the argocd-server and argocd-application-controller ServiceAccounts following https://docs.aws.amazon.com/eks/latest/userguide/specify-service-account-role.html
- for the EKS clusters that Argo CD will be deploying apps into, add a new entry to the aws-auth ConfigMap by following https://docs.aws.amazon.com/eks/latest/userguide/add-user-role.html
  - the rolearn and username values should match (though I'm not certain that this is required)
  - for groups, you could use system:masters but you might want to restrict scope further
- for the Secret for each cluster, follow https://argoproj.github.io/argo-cd/operator-manual/declarative-setup/#clusters
  - be sure to add the label to the Secret as shown in the docs
  - for the Secret data:
    - name should be the ARN of the EKS cluster
    - server should be the URL of the Kubernetes API for the EKS cluster
    - in the config block only set the following:
      - awsAuthConfig, where:
        - clusterName is the name of the EKS cluster as returned by aws eks list-clusters
        - roleARN is the ARN of the IAM role
      - tlsClientConfig, where:
        - insecure is false. This might not be required but it doesn't hurt to be explicit.
        - caData is the certificate returned by aws eks describe-cluster --query "cluster.certificateAuthority" --output text. It should already be base64 encoded.
- when creating an app with argocd app create, set --dest-server to the URL of the Kubernetes API for the cluster.
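The Secret-data rules in the recipe above can be sketched as a small builder; the function name and example values are mine, but the field layout follows the declarative-setup docs:

```python
import json

def cluster_secret(cluster_arn, server_url, eks_name, role_arn, ca_data):
    """Assemble the declarative cluster Secret described in the recipe."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {
            "name": eks_name,
            "namespace": "argocd",
            # Argo CD only picks up Secrets carrying this exact label/value.
            "labels": {"argocd.argoproj.io/secret-type": "cluster"},
        },
        "type": "Opaque",
        "stringData": {
            "name": cluster_arn,    # the ARN of the EKS cluster
            "server": server_url,   # the Kubernetes API URL
            "config": json.dumps({
                "awsAuthConfig": {"clusterName": eks_name, "roleARN": role_arn},
                "tlsClientConfig": {"insecure": False, "caData": ca_data},
            }),
        },
    }
```

Serializing the result to YAML and applying it to the argocd namespace is equivalent to the manual Secret shown in the docs.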
@helenabutowatcisco many thanks for the instructions! Worked for me even with the external cluster in another account; I just had to create an argocd-manager role in the external account, add it to aws-auth, and allow assume from the argocd IRSA role in argocd's account.
Worth noting that after creating the Secret with the cluster's info, its status becomes Unknown, but it actually works; just create an app on it and it'll change to Successful. Don't spend time debugging the connection like I did :)
Looks like IRSA roles work even with 1.6.2 images. Found out while experimenting and switching between latest and 1.6.2. I suspect that after adding the SA annotation with the role name, the pods have to be recreated to pick up the changes.
The latest image appears to have some issues with resource tracking and constantly shows apps as out of sync. One example: cert-manager deployed from a proxy helm chart with a cluster issuer created from a manifest in the templates folder (I can share details if anybody is interested).
Important note on multi-cluster management: application names have to be unique in argocd, so to deploy the same app to multiple clusters, different app names have to be used. The solution that worked for me:
- use a custom label key in argocd-cm: application.instanceLabelKey: argocd.argoproj.io/instance
- create applications with a cluster-name suffix: cert-manager-stage, cert-manager-poc
- use the helm release name to force argocd to create resources without the cluster-name suffix (I don't know yet how to do the same with kustomized manifests, but it should be possible)
- use different projects per cluster
Final configs (essentials):
argocd/kustomization.yaml:
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: argocd
bases:
- github.com/argoproj/argo-cd/manifests/cluster-install?ref=v1.6.2
patchesStrategicMerge:
- overlays/argocd-application-controller-deployment.yaml
- overlays/argocd-application-controller-sa.yaml
- overlays/argocd-cm.yaml
- overlays/argocd-server-deployment.yaml
- overlays/argocd-server-sa.yaml.yaml
overlays/argocd-application-controller-deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: argocd-application-controller
spec:
  template:
    spec:
      securityContext:
        fsGroup: 999
overlays/argocd-application-controller-sa.yaml:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: argocd-application-controller
  annotations:
    eks.amazonaws.com/role-arn: ARGOCD_ROLE
overlays/argocd-cm.yaml:
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
data:
  application.instanceLabelKey: argocd.argoproj.io/instance
overlays/argocd-server-deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: argocd-server
spec:
  template:
    spec:
      securityContext:
        fsGroup: 999
overlays/argocd-server-sa.yaml.yaml:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: argocd-server
  annotations:
    eks.amazonaws.com/role-arn: ARGOCD_ROLE
Where ARGOCD_ROLE is the IRSA IAM role for argocd's cluster, with permissions to assume the target clusters' admin IAM roles.
App config example:
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  namespace: argocd
  name: cert-manager-{{ .Values.spec.destination.clusterName }} # per-cluster App name
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  destination:
    namespace: kube-system
    server: {{ .Values.spec.destination.server }} # dest cluster
  project: {{ .Values.spec.project }} # per-cluster project
  source:
    path: applications/cert-manager
    repoURL: {{ .Values.spec.source.repoURL }}
    targetRevision: {{ .Values.spec.source.targetRevision }}
    helm:
      releaseName: cert-manager # ensure actual resources won't have cluster name suffix
I think there should be better documentation on how to properly configure awsAuthConfig
So you were able to make it work with just one role for ArgoCD SA? You didn't have to create any roles for the clusters that ArgoCD was managing?
Correct. Just one IAM role.
We are struggling to make it work; the cluster just shows up as failed in the UI, but there are no logs and no way to enable verbose logging. Not sure what is wrong. Our current setup:
- ArgoCD cluster and destination cluster are in separate AWS accounts
- OIDC assumable role created for argocd in the AWS account running argocd
- aws-auth ConfigMap edited and added the following:
- "groups":
    - "system:masters"
  "rolearn": "arn:aws:iam::<AWS ACCOUNT WITH ARGOCD>:role/argocd"
  "username": "arn:aws:iam::<AWS ACCOUNT WITH ARGOCD>:role/argocd"
- added the annotation to both SAs, argocd-application-controller and argocd-server:
eks.amazonaws.com/role-arn: arn:aws:iam::<AWS ACCOUNT WITH ARGOCD>:role/argocd
- created the cluster secret as follows:
apiVersion: v1
kind: Secret
metadata:
  name: my-cluster
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: custom-name-for-the-cluster
  server: <TARGET_CLUSTER_API_SERVER_ADDRESS>
  config: |
    {
      "insecure": false,
      "awsAuthConfig": {
        "clusterName": "<REAL_NAME_OF_THE_TARGET_CLUSTER>",
        "roleARN": "eks.amazonaws.com/role-arn: arn:aws:iam::<AWS ACCOUNT WITH ARGOCD>:role/argocd"
      },
      "tlsClientConfig": {
        "caData": "xxx"
      }
    }
Not sure what we are missing 😢
Tackling the issue with @musabmasood, I have copied this kubeconfig into the argocd-server and argocd-application-controller pods:
# /home/argocd/.kube/config
apiVersion: v1
kind: Config
preferences: {}
clusters:
- cluster:
    certificate-authority-data: <base64-encoded ca-data>
    server: https://<eks-cluster-specific-host>.eks.amazonaws.com
  name: arn:aws:eks:ap-south-1:<account-id>:cluster/<cluster-name>
users:
- name: arn:aws:eks:ap-south-1:<account-id>:cluster/<cluster-name>
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1alpha1
      command: aws
      args:
      - eks
      - get-token
      - --cluster-name
      - <cluster-name>
contexts:
- context:
    cluster: arn:aws:eks:ap-south-1:<account-id>:cluster/<cluster-name>
    user: arn:aws:eks:ap-south-1:<account-id>:cluster/<cluster-name>
  name: arn:aws:eks:ap-south-1:<account-id>:cluster/<cluster-name>
And I am able to list pods in the target cluster:
argocd@argocd-server-5cc5c44949-8rqhc:~$ kubectl cluster-info --context arn:aws:eks:ap-south-1:<account-id>:cluster/<cluster-name>
Kubernetes master is running at https://<eks-cluster-specific-host>.eks.amazonaws.com
CoreDNS is running at https://<eks-cluster-specific-host>.eks.amazonaws.com/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
Metrics-server is running at https://<eks-cluster-specific-host>.eks.amazonaws.com/api/v1/namespaces/kube-system/services/https:metrics-server:/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
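For anyone scripting this check, the exec user entry in the kubeconfig above boils down to a small structure; a sketch (the function name is mine, and it mirrors what `aws eks update-kubeconfig` writes):

```python
def eks_exec_user(cluster_name, cluster_arn):
    """Kubeconfig 'users' entry equivalent to the exec block shown above."""
    return {
        "name": cluster_arn,
        "user": {
            "exec": {
                # Delegates token generation to the aws CLI at request time.
                "apiVersion": "client.authentication.k8s.io/v1alpha1",
                "command": "aws",
                "args": ["eks", "get-token", "--cluster-name", cluster_name],
            }
        },
    }
```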
But with the cluster secret, which apparently does the same, it does not work:
https://github.com/argoproj/argo-cd/blob/3d1f37b0c53f4c75864dc7339e2831c6e6a947e0/pkg/apis/application/v1alpha1/types.go#L2182-L2186
apiVersion: v1
kind: Secret
metadata:
  name: cluster-<cluster-name>
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: <cluster-name>
  server: https://<eks-cluster-specific-host>.eks.amazonaws.com
  config: |
    {
      "awsAuthConfig": {
        "clusterName": "<cluster-name>"
      },
      "tlsClientConfig": {
        "caData": "<base64-encoded ca-data>",
        "insecure": false
      }
    }
We only see it as "Failed" when listing the cluster in the UI. We have increased the log level to debug, but it does not add any new messages. We have been able to get getting credentials: exec: exit status 255 when trying to create or sync an application, but I have not been able to identify where it comes from in the code yet.
Tested both v1.6.2 and v1.7.0-rc1. Role config in the account where we run ArgoCD:
module "iam_assumable_role" {
  source  = "terraform-aws-modules/iam/aws//modules/iam-assumable-role-with-oidc"
  version = "~> v2.18.0"

  create_role      = true
  role_name        = "k8s-argocd-admin"
  provider_url     = replace(data.aws_eks_cluster.this.identity.0.oidc.0.issuer, "https://", "")
  role_policy_arns = []

  oidc_fully_qualified_subjects = ["system:serviceaccount:argocd:argocd-server", "system:serviceaccount:argocd:argocd-application-controller"]
}

data "aws_eks_cluster" "this" {
  name = var.argocd_cluster_name
}
Target cluster nonprod-useast1-cluster2 aws-auth configmap:
- "groups":
    - "system:masters"
  "rolearn": "arn:aws:iam::369115111111:role/k8s-argocd-admin"
  "username": "argocd-admin"
ArgoCD cluster config (notice NO role specified here; it's not needed, as we allowed access to the argocd-managed cluster directly for the ArgoCD SA IRSA role):
apiVersion: v1
kind: Secret
metadata:
  name: nonprod-useast1-cluster2
  labels:
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: nonprod-useast1-cluster2
  server: https://8BB8308BEFD883211111111111111.gr7.us-east-1.eks.amazonaws.com
  config: |
    {
      "awsAuthConfig": {
        "clusterName": "nonprod-useast1-cluster2"
      },
      "tlsClientConfig": {
        "insecure": false,
        "caData": "LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUN5RENDQWJDZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJd01EVXlOekUzTkRFeU1Gb1hEVE13TURVeU5URTNOREV5TUZvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTmJZCkR4cGdGMHFPdGpWSWhUUVJ0cXdnWC8weEtwdjhRMWJkejdHM0tyU1Z0bWtjYXBIbE94Yit2ODUyRXM5T2liYWMKT0I3eEk4NFpoWnRIZlRJZlNNUG5mSDFsdFUvSXcwZzZ6T0prNGl5bHVrdWxISHJOcXdDY2hvMmRTQno5Sm9NcApuNU1mSGpmV2I5RVppeERZeW5sZXdCK1dWTGIwdkNvTzNNeEloT3RTVG00djB4ZEJMQzBzM29Dd1lmNy9kaFd6CkZsZVZFNmYvY0xkNW1aclRjdlF6TzFzYVIrcEQ4T1FCblVjSXUrT2lXSTV5c2d4SUphWEFJRkZYSU5mWWF5Y2gKZGVJVUdYazAxMjNtMWNRaGJ3eGtEYzhnZkNVMlMxenFHZGVQNElOSGhTcCthSFN4cmJIa3dRYWxO"
      }
    }
ArgoCD v1.6.2: cluster created but reporting an error:
https://8BB8308BEFD883211111111111111.gr7.us-east-1.eks.amazonaws.com nonprod-useast1-cluster2 Failed Unable to connect to cluster: the server has asked for the client to provide credentials
However, it's possible to deploy applications to it:
gb2 https://8BB8308BEFD883211111111111111.gr7.us-east-1.eks.amazonaws.com guestbook1 default Synced Healthy Auto-Prune <none> https://github.com/argoproj/argocd-example-apps.git kustomize-guestbook HEAD
ArgoCD v1.7.0-rc1: the new version has different cluster health discovery logic: it shows a cluster with "unknown" status until you deploy an app into it. Once an app is deployed, the cluster becomes "green" in the UI. The app is deployed successfully.
https://8BB8308BEFD883211111111111111.gr7.us-east-1.eks.amazonaws.com nonprod-useast1-cluster2 1.17+ Successful
So I'm pretty sure it was a UI bug, or a bug in the cluster discovery mechanism, and it's resolved in v1.7.
Confirmed. It works in 1.7 rc1. We only specify the caData and cluster name (not cluster ARN). We also added the role arn to application-controller and server SAs.
@maxbrunet try to shell into the pod and run aws eks --region us-east-1 update-kubeconfig --name cluster-2. That way I found out about the missing security context (fsGroup: 999): the aws CLI couldn't access the IRSA token.
@okdas yes, the fsGroup is needed (when running as non-root). The kubectl cluster-info command I posted runs inside the pod using the WebIdentity token, so the token was readable, and as @musabmasood said, we made it work with 1.7. Thanks
any luck in getting this to work with a ServiceAccount?
@helenabutowatcisco yes, I got it all working. I hope you did also.
Here's a rough recipe for what I've managed to get working with IAM roles scoped to a Kubernetes ServiceAccount and a single AWS account:

1. use the `latest` tag for the images. I couldn't get this to work with v1.6.
2. you'll need to set the fsGroup for the securityContext to 999.
3. set up the IAM OpenID Connect provider for the EKS cluster running Argo CD by following https://docs.aws.amazon.com/eks/latest/userguide/enable-iam-roles-for-service-accounts.html
4. create one IAM policy and one role following https://docs.aws.amazon.com/eks/latest/userguide/create-service-account-iam-policy-and-role.html
   - there need to be 2 trust relationships for the one role: one for the `argocd-server` ServiceAccount and one for the `argocd-application-controller` ServiceAccount
   - the policy should allow `AssumeRole` and `AssumeRoleWithWebIdentity` for the STS service, with the resources for the policy limited to the ARN for the IAM role
5. you'll need to add an annotation to the `argocd-server` and `argocd-application-controller` ServiceAccounts following https://docs.aws.amazon.com/eks/latest/userguide/specify-service-account-role.html
6. for the EKS clusters that Argo CD will be deploying apps into, add a new entry to the `aws-auth` ConfigMap by following https://docs.aws.amazon.com/eks/latest/userguide/add-user-role.html
   - the `rolearn` and `username` values should match (though I'm not certain that this is required)
   - for `groups`, you could use `system:masters`, but you might want to restrict scope further
7. for the Secret for each cluster, follow https://argoproj.github.io/argo-cd/operator-manual/declarative-setup/#clusters
   - be sure to add the label to the Secret as shown in the docs
   - for the Secret data:
     - `name` should be the ARN of the EKS cluster
     - `server` should be the URL of the Kubernetes API for the EKS cluster
     - in the `config` block, only set the following:
       - `awsAuthConfig`, where:
         - `clusterName` is the name of the EKS cluster as returned by `aws eks list-clusters`
         - `roleARN` is the ARN of the IAM role
       - `tlsClientConfig`, where:
         - `insecure` is `false`. This might not be required, but it doesn't hurt to be explicit.
         - `caData` is the certificate returned by `aws eks describe-cluster --query "cluster.certificateAuthority" --output text`. It should already be base64 encoded.
8. when creating an app with `argocd app create`, set `--dest-server` to the URL of the Kubernetes API for the cluster.
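Putting step 7 together, the cluster Secret might look like the following sketch. Every name and ARN here is a placeholder, and `caData` would be the base64 output of the `aws eks describe-cluster` command above:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: acme-production-cluster        # placeholder Secret name
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster   # the label from the docs
type: Opaque
stringData:
  # name: the ARN of the EKS cluster
  name: arn:aws:eks:us-west-2:<account-number>:cluster/acme-production
  # server: the Kubernetes API URL of the EKS cluster
  server: https://<eks-cluster-id>.yl4.us-west-2.eks.amazonaws.com
  config: |
    {
      "awsAuthConfig": {
        "clusterName": "acme-production",
        "roleARN": "arn:aws:iam::<account-number>:role/acme-production-deploy-role"
      },
      "tlsClientConfig": {
        "insecure": false,
        "caData": "<base64-encoded-ca-cert>"
      }
    }
```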
@helenabutowatcisco just to see if I understand: in step 6, did you add to the aws-auth ConfigMap running on account B the same role you created on account A and annotated on the Argo ServiceAccounts?
I recently implemented this, the above comments and references to the documentation helped me a lot. It still took me quite some time to get a working setup.
I wrote a tutorial describing a basic setup that works and is fairly flexible: https://www.modulo2.nl/blog/argocd-on-aws-with-multiple-clusters
Hope this helps someone.
This was extremely helpful. There needs to be clearer examples of this in the documentation.
With all the above, I'm still getting `the server has asked for the client to provide credentials`. Has anything changed since then?
In my case, the problem was that only the Argo CD server (`argocd-server` pod) had access to the IAM role. It turns out that the application controller (`argocd-application-controller` pod) also needs access to the IAM role for it to work.
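In other words, both ServiceAccounts need the IRSA annotation; a sketch with a placeholder role ARN:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: argocd-application-controller  # repeat the annotation on argocd-server
  namespace: argocd
  annotations:
    # IRSA: tells EKS which IAM role this ServiceAccount may assume
    eks.amazonaws.com/role-arn: arn:aws:iam::<account-number>:role/<argocd-deploy-role>
```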
I'm happy I came across this thread. Without this issue you'd probably burn a day (or days) figuring out how this works. The documentation is lacking detailed information here. If you have all the information in your kubeconfig and can just add an external cluster by pointing to the correct context, everything is fine; if you need to fiddle around with EKS, you are simply lost. Thanks to everyone who contributed here :)
Hi, when I created the Argo app, the following error occurred: `getting credentials: exec: executable aws failed with exit code 255`. Must Argo CD be installed in AWS EKS?
In v2.2.0, following https://argo-cd.readthedocs.io/en/stable/operator-manual/declarative-setup/#clusters works; I tested it and the setup is also very simple. The awsAuthConfig field is not required.
@woniuzhang you'll need to create a token for the connection to work. Also, ArgoCD does not need to be on the actual cluster you're deploying to, however you will need to add the cluster and the appropriate namespace to it in order to establish the correct connection.
@JasonKAls Thank you for your reply. Using https://argo-cd.readthedocs.io/en/stable/operator-manual/declarative-setup/#clusters is very simple; I tested it on v2.2.0, and I'm not sure whether lower versions also work.
Hi!
I'm using ArgoCD version 2.2.1 and this is the solution I managed to implement with Terraform on an EKS 1.21 cluster:
I am using the Terraform Kubernetes provider 2.7.x, with `_v1` added to the resource and data source names. If you use a version lower than 2.7, remove the `_v1` suffix from each resource and data source name.
Target Kubernetes Cluster
resource "kubernetes_service_account_v1" "argocd_manager" {
metadata {
name = "argocd-manager"
namespace = "kube-system"
}
}
resource "kubernetes_cluster_role_v1" "argocd_manager" {
metadata {
name = "argocd-manager-role"
}
rule {
api_groups = ["*"]
resources = ["*"]
verbs = ["*"]
}
rule {
non_resource_urls = ["*"]
verbs = ["*"]
}
}
resource "kubernetes_cluster_role_binding_v1" "argocd_manager" {
metadata {
name = "argocd-manager-role-binding"
}
role_ref {
api_group = "rbac.authorization.k8s.io"
kind = "ClusterRole"
name = kubernetes_cluster_role_v1.argocd_manager.metadata.0.name
}
subject {
kind = "ServiceAccount"
name = kubernetes_service_account_v1.argocd_manager.metadata.0.name
namespace = kubernetes_service_account_v1.argocd_manager.metadata.0.namespace
}
}
data "kubernetes_secret_v1" "argocd_manager" {
metadata {
name = kubernetes_service_account_v1.argocd_manager.default_secret_name
namespace = kubernetes_service_account_v1.argocd_manager.metadata.0.namespace
}
}
ArgoCD Kubernetes Cluster
data "aws_eks_cluster" "this" {
name = "CLUSTER_NAME"
}
resource "kubernetes_secret_v1" "cluster" {
metadata {
name = "CUSTOM_NAME"
namespace = "argocd"
labels = {
"argocd.argoproj.io/secret-type" = "cluster"
}
}
data = {
name = "CLUSTER_NAME"
server = data.aws_eks_cluster.this.endpoint # AWS EKS target cluster endpoint
config = jsonencode({
bearerToken = data.kubernetes_secret_v1.argocd_manager.data.token
tlsClientConfig = {
insecure = false
caData = data.aws_eks_cluster.this.certificate_authority[0].data # base64 AWS EKS Certificate Authority
}
})
}
type = "Opaque"
}
I merely added the admin service role (where Argo CD is running) to the cluster that Argo wants to "manage". This worked, but I need to get opinions on "best security practice" here..
- groups:
    - system:bootstrappers
    - system:nodes
  rolearn: arn:aws:iam::<account>:role/<cluster-name>-cluster-ServiceRole-B4UFCVZ99KE8
  username: system:node:{{EC2PrivateDNSName}}
I followed this tutorial: https://www.modulo2.nl/blog/argocd-on-aws-with-multiple-clusters (with my own EKS cluster),
but I have this error:
[argo-cd-argocd-server-58c46798d5-tjm7j] An error occurred (AccessDenied) when calling the AssumeRole operation: User: arn:aws:sts::123456789:assumed-role/node_group-eks-node-group-20220328141247459765646001/i-0fe964864900d3e3 is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::123456789:role/Deployer
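That AccessDenied means the Deployer role's trust policy doesn't allow the role Argo CD is running as (here, the node-group role; with IRSA, the ServiceAccount role) to assume it. A sketch of the trust policy on the Deployer role, with placeholder ARNs:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789:role/<argocd-node-group-or-irsa-role>"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
```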
Just some quick notes on setting this up while sifting through a lot of confused comments above.
- The role you create (roleA) doesn't need any permissions. It just needs the trust policy so that argocd can assume it just like any IRSA role. And yes the service accounts on both the application-controller and the server need to be able to assume the role. That's enough for argocd to generate eks credentials - just "be" roleA. It doesn't need any attached policies.
- You can then put roleA into the target cluster's aws-auth map as a role with `system:masters` in its groups.
- The roleARN=roleB attribute can be left out if you don't want argocd to assume roleB after assuming roleA. Sometimes people make a separate role in the target account that has access to the cluster, and so you have to assume that first - but it's not required. For example, in the blog post they call it the Deployer role.
- Finally argocd doesn't show any status for the cluster connection until you actually deploy an app into it which is a bit weird.
https://github.com/argoproj/argo-cd/issues/2347#issuecomment-662139989 is a great guide, and hopefully I've resolved some of the ambiguities in it in the couple of spots that left room for doubt.
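Putting those notes together, a sketch of the cluster Secret when roleA is mapped directly into the target cluster's aws-auth and no second role is involved (all values are placeholders):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: target-cluster                 # placeholder Secret name
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
stringData:
  name: target-cluster
  server: https://<eks-cluster-id>.gr7.<region>.eks.amazonaws.com
  # No roleARN: Argo CD signs the EKS token as roleA, its own IRSA identity
  config: |
    {
      "awsAuthConfig": { "clusterName": "<eks-cluster-name>" },
      "tlsClientConfig": { "insecure": false, "caData": "<base64-ca-cert>" }
    }
```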
we had the below config working great
config: '{"tlsClientConfig":{"insecure":false,"caData":"XX"},"awsAuthConfig":{"clusterName":"<name in EKS console>","roleARN":"arn:aws:iam::00000000:role/argocd-role"}}'
name: <descriptive name for this cluster>
server: https://XXXXXXXX.gr7.<region>.eks.amazonaws.com
but this PR appears to have introduced a new CLI for auth without any docs on how to update the config for EKS. Still experimenting and will post back if we can figure out the new config.