aws-efs-csi-driver

EFS CSI driver should pass the IAM flag (for IRSA) to EFS Utils for app-specific filesystem policies

Open esalberg opened this issue 4 years ago • 29 comments

Is your feature request related to a problem? / Why is this needed

We are unable to segregate EFS filesystems by policy across different EKS pods. While we can pass a specific role to the EFS CSI driver itself, it is always the same role.

We need to limit EFS filesystem access per pod.
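As a rough sketch of the goal, a file system policy could tie each pod's IAM role to a single access point. Everything named here is a hypothetical placeholder (account ID, role names, access point IDs, helper functions), and the sketch assumes the `elasticfilesystem:AccessPointArn` condition key is used to scope each statement. It only builds the policy document; it does not apply it.

```python
import json

# All ARNs below are hypothetical placeholders, not values from this issue.
FILE_SYSTEM_ARN = (
    "arn:aws:elasticfilesystem:us-east-1:111122223333:file-system/fs-EXAMPLE"
)


def access_point_statement(sid, role_arn, access_point_arn):
    """Allow one IAM role to mount and write through one access point only."""
    return {
        "Sid": sid,
        "Effect": "Allow",
        "Principal": {"AWS": role_arn},
        "Action": [
            "elasticfilesystem:ClientMount",
            "elasticfilesystem:ClientWrite",
        ],
        "Resource": FILE_SYSTEM_ARN,
        "Condition": {
            "StringEquals": {
                "elasticfilesystem:AccessPointArn": access_point_arn
            }
        },
    }


def build_file_system_policy(grants):
    """Build a policy from (sid, role_arn, access_point_arn) tuples."""
    return {
        "Version": "2012-10-17",
        "Statement": [access_point_statement(*grant) for grant in grants],
    }


policy = build_file_system_policy(
    [
        (
            "AppA",
            "arn:aws:iam::111122223333:role/app-a",
            "arn:aws:elasticfilesystem:us-east-1:111122223333:access-point/fsap-EXAMPLE-A",
        ),
        (
            "AppB",
            "arn:aws:iam::111122223333:role/app-b",
            "arn:aws:elasticfilesystem:us-east-1:111122223333:access-point/fsap-EXAMPLE-B",
        ),
    ]
)

if __name__ == "__main__":
    print(json.dumps(policy, indent=2))
```

A policy like this only helps if the mount is made with the pod's own role, which is what this request is about.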

/feature

Describe the solution you'd like in detail

EFS Utils has added support for AssumeRoleWithWebIdentity for IRSA: https://github.com/aws/efs-utils/pull/60

However, the EFS CSI driver does not appear to pass the IAM flag to EFS Utils.

"[AWS Support] was able to replicate your finding of not being able to use IRSA (IAM Roles for Service Accounts) with EFS Filesystem Policies and Access Points. My testing environment involved setting up a fresh EKS cluster, with IRSA provisioned, and an EFS instance with two access points. The filesystem policy was created so that two unique IAM roles were allowed to access one of the access points, and two pods were spun up in the cluster with a role associated to each. Without filesystem policy in place, the driver worked as expected and was able to mount the access points required. However, when the filesystem policy was attached, the efs-csi-node driver was unable to complete the volume mount.

The issue seems to stem from the EFS CSI driver not passing the proper IAM flag to the underlying efs-utils package that the driver utilizes. This has been reported on the CSI driver repository, as well as the AWS container roadmap as seen here and here. EFS Utils requires passing the flag -o iam to the mount command to properly pass IAM roles for authorization. As the EFS CSI driver is solely responsible for fulfilling Kubernetes Persistent Volume requests, it does not currently read Kubernetes Service Account information from pods to facilitate authorization.

You can see documentation for the IAM flag in the announcement blog post for EFS Access Points here. An example working mount command would be as follows:

$ sudo mount -t efs -o iam,tls,accesspoint=fsap- fs- /mnt/shared

From my testing, looking at the logs of the EFS CSI node driver, you can see the mount command being passed here, without the required IAM flag:

Mounting arguments: -t efs -o accesspoint=fsap-,tls fs-:/ /var/lib/kubelet/pods/dde3c956-2d4e-4e80-b656-dcb282e6289c/volumes/kubernetes.io~csi/efs-pv-2/mount

Output: Could not start amazon-efs-mount-watchdog, unrecognized init system “aws-efs-csi-dri”

mount.nfs4: access denied by server while mounting 127.0.0.1:/

Since the flag is not passed, the driver does not try to read any IAM role, and is therefore unable to assume the role required to mount the access point with policy attached. The relevant code that is called inside the driver is located here, where the flag is not passed: https://github.com/kubernetes-sigs/aws-efs-csi-driver/blob/b9d26737c3f8659403eaf753a61c7874ca4a8b39/pkg/driver/node.go#L114
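To make the difference concrete, here is a minimal sketch of how the argument list changes once `iam` is added. `build_mount_args` is a hypothetical helper written for this illustration, not the driver's actual code in node.go, and the IDs and target path are placeholders. It assumes, as the efs-utils documentation describes, that `iam` is only valid together with `tls`.

```python
def build_mount_args(file_system_id, target, access_point_id=None,
                     tls=True, iam=False):
    """Return the argument list for `mount`, mirroring the logged format."""
    if iam and not tls:
        # Assumption from the efs-utils docs: IAM authorization needs TLS.
        raise ValueError("the iam option requires the tls option")

    options = []
    if access_point_id:
        options.append(f"accesspoint={access_point_id}")
    if tls:
        options.append("tls")
    if iam:
        options.append("iam")

    args = ["-t", "efs"]
    if options:
        args += ["-o", ",".join(options)]
    args += [f"{file_system_id}:/", target]
    return args


if __name__ == "__main__":
    # Placeholder IDs; the real ones are elided in the log above.
    without_iam = build_mount_args("fs-EXAMPLE", "/mnt/shared",
                                   access_point_id="fsap-EXAMPLE")
    with_iam = build_mount_args("fs-EXAMPLE", "/mnt/shared",
                                access_point_id="fsap-EXAMPLE", iam=True)
    print("mount", " ".join(without_iam))
    print("mount", " ".join(with_iam))
```

The first call reproduces the shape of the logged arguments; the second is what the mount would need to look like for the filesystem policy to see an IAM identity.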

Describe alternatives you've considered

We are segregating apps into separate EKS clusters if they need EFS access.

Additional context

Container roadmap link: https://github.com/aws/containers-roadmap/issues/1003

esalberg avatar Nov 13 '20 19:11 esalberg

We have to wait for this feature https://github.com/kubernetes/enhancements/issues/2047 . Otherwise there is no way for the driver to get a pod's creds and use the pod's creds for mounting.
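For context, a rough sketch of what that feature would give a CSI driver. As documented for the Kubernetes CSIDriver object, when a driver requests service account tokens, kubelet passes the pod's tokens in the volume context under the `csi.storage.k8s.io/serviceAccount.tokens` key, as a JSON map keyed by audience. The helper name, the `sts.amazonaws.com` audience, and the sample values below are assumptions for illustration only; this is not code from the driver.

```python
import json

# Volume context key documented for the CSIDriver tokenRequests field.
TOKENS_KEY = "csi.storage.k8s.io/serviceAccount.tokens"


def pod_token_for_audience(volume_context, audience):
    """Return the pod's token for one audience, or None if it is absent."""
    raw = volume_context.get(TOKENS_KEY)
    if raw is None:
        return None
    tokens = json.loads(raw)
    entry = tokens.get(audience)
    return entry["token"] if entry else None


if __name__ == "__main__":
    # Hypothetical volume context as a driver might receive it.
    sample_context = {
        TOKENS_KEY: json.dumps(
            {
                "sts.amazonaws.com": {
                    "token": "example-token",
                    "expirationTimestamp": "2021-01-01T00:00:00Z",
                }
            }
        )
    }
    print(pod_token_for_audience(sample_context, "sts.amazonaws.com"))
```

Without that key in the volume context, the driver only has its own credentials to work with.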

wongma7 avatar Nov 13 '20 19:11 wongma7

> We have to wait for this feature kubernetes/enhancements#2047 . Otherwise there is no way for the driver to get a pod's creds and use the pod's creds for mounting.

It appears that this dependent feature entered beta two weeks ago, for possible release with v1.21. Given the status change, is there any timeline for adding the above feature to the driver?

garystafford avatar Jan 29 '21 19:01 garystafford

Are there any updates on this issue? Is there a workaround for restricting EFS filesystem access per pod?

bssantiago avatar Apr 14 '21 15:04 bssantiago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale

fejta-bot avatar Jul 13 '21 16:07 fejta-bot

Bumping this issue - are there any updates on implementing it in the roadmap?

sbreingan avatar Jul 13 '21 16:07 sbreingan

FYI - It looks like the Kubernetes upstream requirement finally is being merged for k8s 1.22. Not sure when AWS will get 1.22, but hopefully then we can get a date for the efs csi driver update?

esalberg avatar Jul 22 '21 14:07 esalberg

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Aug 21 '21 14:08 k8s-triage-robot

/remove-lifecycle rotten

esalberg avatar Aug 23 '21 14:08 esalberg

/remove-lifecycle rotten

vchepkov avatar Aug 23 '21 14:08 vchepkov

I feel like adding -o iam would be useful even without CSIServiceAccountToken.

Right now, according to CloudTrail, the file system is being mounted as ANONYMOUS_PRINCIPAL:

{
    "userIdentity": {
        "type": "AWSAccount",
        "principalId": "",
        "accountId": "ANONYMOUS_PRINCIPAL"
    },
    "eventTime": "2021-11-11T16:41:04Z",
    "eventSource": "elasticfilesystem.amazonaws.com",
    "eventName": "NewClientConnection",
    ...
}

If -o iam were included, I believe we would arrive as the role associated with the efs-csi-node-sa service account, and could write a policy preventing the EFS from being mounted except via the CSI driver.
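A small sketch of how one might flag such connections when scanning CloudTrail records. It relies only on the fields visible in the event above; `is_anonymous_mount` is a hypothetical helper, and the second sample record is invented for contrast.

```python
def is_anonymous_mount(event):
    """True if a CloudTrail record is an EFS client connection made
    without an IAM identity (the ANONYMOUS_PRINCIPAL case shown above)."""
    identity = event.get("userIdentity", {})
    return (
        event.get("eventSource") == "elasticfilesystem.amazonaws.com"
        and event.get("eventName") == "NewClientConnection"
        and identity.get("accountId") == "ANONYMOUS_PRINCIPAL"
    )


# Record with the same fields as the event quoted in this comment.
anonymous_event = {
    "userIdentity": {
        "type": "AWSAccount",
        "principalId": "",
        "accountId": "ANONYMOUS_PRINCIPAL",
    },
    "eventTime": "2021-11-11T16:41:04Z",
    "eventSource": "elasticfilesystem.amazonaws.com",
    "eventName": "NewClientConnection",
}

# Invented record for contrast: same event, but with an account ID present.
identified_event = dict(
    anonymous_event,
    userIdentity={"type": "AssumedRole", "accountId": "111122223333"},
)

if __name__ == "__main__":
    print(is_anonymous_mount(anonymous_event))
    print(is_anonymous_mount(identified_event))
```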

Would a PR to do this be accepted?

ghost avatar Nov 11 '21 17:11 ghost

To have the node daemonset use its IAM role when mounting the file system you can add iam to the mountOptions of the StorageClass or PersistentVolume:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
mountOptions:
  - tls
  - iam
parameters:
  provisioningMode: efs-ap
  fileSystemId: fs-012345678901010
  directoryPerms: "700"
  gidRangeStart: "1000"
  gidRangeEnd: "2000"
  basePath: "/dynamic_provisioning"

This then gets added to the mount command.

However, this uses the same role for every pod on the cluster. As mentioned, it would be really nice to be able to inject a different IAM role for each PV/PVC.
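For the PersistentVolume case, a statically provisioned volume could carry the same mount options. The sketch below builds such a manifest as a Python dict and prints it as JSON, which kubectl also accepts. The volume name, capacity, and IDs are hypothetical placeholders, and the `volumeHandle` format (file system ID, two colons, access point ID) is assumed from the driver's documentation.

```python
import json


def efs_persistent_volume(name, file_system_id, access_point_id,
                          capacity="5Gi"):
    """Build a PersistentVolume manifest that mounts with tls and iam."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": capacity},
            "volumeMode": "Filesystem",
            "accessModes": ["ReadWriteMany"],
            "persistentVolumeReclaimPolicy": "Retain",
            # Same options as in the StorageClass above.
            "mountOptions": ["tls", "iam"],
            "csi": {
                "driver": "efs.csi.aws.com",
                "volumeHandle": f"{file_system_id}::{access_point_id}",
            },
        },
    }


if __name__ == "__main__":
    manifest = efs_persistent_volume("efs-pv", "fs-EXAMPLE", "fsap-EXAMPLE")
    print(json.dumps(manifest, indent=2))
```

As with the StorageClass, the mount still runs under the node daemonset's role, so this does not give per-pod identities.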

sedan07 avatar Dec 23 '21 18:12 sedan07

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Mar 23 '22 19:03 k8s-triage-robot

/lifecycle rotten

k8s-triage-robot avatar Apr 22 '22 19:04 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

k8s-triage-robot avatar May 22 '22 20:05 k8s-triage-robot

@k8s-triage-robot: Closing this issue.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar May 22 '22 20:05 k8s-ci-robot

/reopen

lmouhib avatar May 22 '22 20:05 lmouhib

@lmouhib: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar May 22 '22 20:05 k8s-ci-robot

/remove-lifecycle rotten

vchepkov avatar May 23 '22 14:05 vchepkov

/reopen

vchepkov avatar May 23 '22 14:05 vchepkov

@vchepkov: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar May 23 '22 14:05 k8s-ci-robot

Could anybody from maintainers re-open this ticket, please?

vchepkov avatar May 23 '22 14:05 vchepkov

I think PR #710 should resolve these problems.

jonathanrainer avatar Jun 15 '22 05:06 jonathanrainer

We are interested in this - any updates please?

adeweetman-al avatar Sep 12 '22 13:09 adeweetman-al

This is now solved and is part of release 1.4.1

lmouhib avatar Sep 26 '22 20:09 lmouhib

I resolved the mount.nfs4: access denied by server while mounting 127.0.0.1:/ error by adding iam to mountOptions on a StorageClass.

I found https://github.com/kubernetes-sigs/aws-efs-csi-driver/pull/422, which gave me the idea to check the mount-time options, and there is no occurrence of iam in https://github.com/kubernetes-sigs/aws-efs-csi-driver/blob/1da6b2f50578ca5315294404c79470e060ddf51b/pkg/driver/node.go#L51-L200

Any idea whether this is intentional?

Before that, I needed a policy allowing any AWS principal to mount the EFS:

{
    "Sid": "CSIMountAccess",
    "Effect": "Allow",
    "Principal": {
        "AWS": "*"
    },
    "Action": [
        "elasticfilesystem:ClientWrite",
        "elasticfilesystem:ClientMount"
    ],
    "Resource": "arn:aws:elasticfilesystem:eu-north-1:XXXXXX:file-system/fs-0f0b08aba4527c97e",
    "Condition": {
        "Bool": {
            "elasticfilesystem:AccessedViaMountTarget": "true"
        }
    }
}

After adding the iam mount option, it started working properly with this policy:

{
    "Sid": "CSIMountAccess",
    "Effect": "Allow",
    "Principal": {
        "AWS": "arn:aws:iam::XXXXXX:role/infra-test-12345-efs-csi-node"
    },
    "Action": [
        "elasticfilesystem:ClientWrite",
        "elasticfilesystem:ClientMount"
    ],
    "Resource": "arn:aws:elasticfilesystem:eu-north-1:XXXXXX:file-system/fs-0f0b08aba4527c97e",
    "Condition": {
        "Bool": {
            "elasticfilesystem:AccessedViaMountTarget": "true"
        }
    }
}
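The only difference between the two statements is the Principal. A quick way to audit a policy for the wide-open form is sketched below; `allows_any_principal` is a hypothetical helper, and the sample statements are trimmed to the fields that matter here.

```python
def allows_any_principal(statement):
    """True if an Allow statement grants access to every AWS principal."""
    if statement.get("Effect") != "Allow":
        return False
    principal = statement.get("Principal")
    if principal == "*":
        return True
    if isinstance(principal, dict):
        aws = principal.get("AWS", [])
        # The AWS entry may be a single string or a list of strings.
        values = [aws] if isinstance(aws, str) else list(aws)
        return "*" in values
    return False


# Trimmed versions of the two statements shown in this comment.
open_statement = {
    "Sid": "CSIMountAccess",
    "Effect": "Allow",
    "Principal": {"AWS": "*"},
}
scoped_statement = {
    "Sid": "CSIMountAccess",
    "Effect": "Allow",
    "Principal": {
        "AWS": "arn:aws:iam::XXXXXX:role/infra-test-12345-efs-csi-node"
    },
}

if __name__ == "__main__":
    print(allows_any_principal(open_statement))
    print(allows_any_principal(scoped_statement))
```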

nazarewk avatar Jan 04 '23 10:01 nazarewk

@nazarewk Hi, is "arn:aws:iam::XXXXXX:role/infra-test-12345-efs-csi-node" the role bound to your service account or the node group role?

Bruce-Lu674 avatar Mar 23 '23 01:03 Bruce-Lu674

> @nazarewk Hi, is "arn:aws:iam::XXXXXX:role/infra-test-12345-efs-csi-node" the role bound to your service account or the node group role?

It's the service account role; it shouldn't matter, though.

nazarewk avatar Mar 23 '23 07:03 nazarewk

/kind feature

RyanStan avatar May 15 '23 14:05 RyanStan

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Jan 20 '24 10:01 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Feb 19 '24 11:02 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-triage-robot avatar Mar 20 '24 11:03 k8s-triage-robot