EKS : Dynamic provisioning doesn't auto recover PV when EFS Mount target AZ goes down

Open Tejasvihuded opened this issue 3 years ago • 14 comments

Is your feature request related to a problem?/Why is this needed
aws-efs-csi-driver dynamic provisioning does not automatically recover the k8s PV when the AZ of the EFS mount target goes down.

I am trying to mount an Amazon EFS file system cross-account from Amazon EKS, using the dynamic provisioning feature of aws-efs-csi-driver. I found a high-availability issue with EFS accessed through a Persistent Volume in EKS.

Below is my setup:

1) Created a standard EFS file system with mount targets in three zones: us-east-2a, us-east-2b and us-east-2c.
2) Created a k8s storage class using the EFS ID. No AZ value is passed under parameters in the storage class.
3) Created a PVC using the storage class; the PVC dynamically created a PV.
4) When we describe the PV, the EFS mount target AZ (IP) is picked at random; in this case the us-east-2b mount target is used for the PV.
5) Created a pod using the PVC; the pod successfully mounts EFS and can write data into EFS through the access point.
6) We are able to exec into the pod and read the contents from EFS using the pod's volume mountPath.

Below is the Issue:

  • I then deleted the EFS mount target in us-east-2b to simulate the AZ-down scenario.
  • When I describe the PV, its mounttargetip still points to the deleted mount target, i.e. us-east-2b.
  • The pod is running and I can exec into it, but I cannot cd into the pod's volume mountPath directory; the command returns nothing and hangs.

/feature

Describe the solution you'd like in detail
My expectation is "PV should auto recover and reconnect to next available zone mount target i.e us-east-2a or us-east-2c" OR "I should be able to delete existing PV without deleting previously dynamically created access point and use the existing accesspoint in dynamic provisioning"

Describe alternatives you've considered
For now I am planning to go with static provisioning, so that I can:

  • Create the EFS file system and an access point with the required path on EFS
  • Create a storage class with the EFS ID and access point ID
  • Create a PV using the storage class
  • Create a PVC using the PV and use it in a pod

With static provisioning, if the AZ of the EFS mount target used by the PV goes down, I can delete the pod, PVC and PV and recreate the PV using the same access point (and the same storage class). That way my new pod can at least access my old files in EFS under the access point path.

Additional context
Once the EFS mount target AZ goes down and comes back up, the mount target IP changes and my existing PV becomes useless because it still points to the old IP address of the mount target. Since I am using dynamic provisioning, deleting the stuck volume and recreating PVC-->PV raises another issue:

  • Whenever I delete the PVC, it deletes its respective PV and access point, and all my data under the old access point is no longer reachable.
  • Every time I create a PVC, it creates a new access point whose name uses the dynamically generated PV name suffix.
  • Because of this I cannot reuse the access point that was dynamically created before the mount target AZ went down.

I need a solution where either "PV should auto recover and reconnect to next available zone mount target" OR "I should be able to delete existing PV without deleting previously dynamically created access point and use the existing accesspoint in dynamic provisioning"

Tejasvihuded avatar Sep 29 '22 09:09 Tejasvihuded

Hi @Tejasvihuded, deleting an Access Point does not delete any data stored on the EFS file system. (There is some background on this decision here.)

Given this, you could specify basePath in the storage class parameters, so that all Access Points newly created through dynamic provisioning access data under the same path on EFS.

Would this work for your use case?

DanielRubinstein avatar Dec 08 '22 21:12 DanielRubinstein

Hi @DanielRubinstein, thanks for looking into this.
1) I agree with your first point: "Deleting an Access Point does not delete any data stored on the EFS file system".
2) However, specifying basePath in the storage class parameters does not help me. Here is why:

  • If I set basePath: "/dynamic_provisioning" in the storage class parameters, then when I create a PVC it creates the access point root directory under "/dynamic_provisioning".

  • I see an access point path like "/dynamic_provisioning/pvc-48d6b005-3baa-4efc-8664-52f8f297772f" created in EFS, so the data my pod writes goes to this path.

  • When I delete the PVC and create a new one, the access point path changes to "/dynamic_provisioning/pvc-2cac4519-5d46-4d2f-b60f-db0a9be12f5f". The path is built from the basePath I pass in the storage class plus the PV name.

  • Since the two access point paths differ, my pod using the second access point cannot read or write the first access point's root directory. Because the PVC keeps changing, the access point path keeps changing too. Is there a way to point to the same folder on EFS for reads and writes even though the access point root path is different for every PVC?

Tejasvihuded avatar Dec 13 '22 08:12 Tejasvihuded

Hello, same issue on my side, because accessPointsOptions.DirectoryPath concatenates basePath + volName.

As shown in Controller.go, lines 236 to 241:

	rootDirName := volName
	rootDir := basePath + "/" + rootDirName
	[...]
	accessPointsOptions.DirectoryPath = rootDir

I ended up creating my own AccessPoint to avoid having the PV name in the file path. I am not sure why volName was added to the path, but I think it should be removed to let users manage EFS directories properly.
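To make the effect of that concatenation concrete, here is a minimal Go sketch. It is not the driver's code: accessPointRootDir is a hypothetical helper that only reproduces the basePath + "/" + volName step quoted above, fed with the two PV names reported earlier in this thread.

```go
package main

import "fmt"

// accessPointRootDir is a hypothetical helper that mirrors the quoted
// concatenation: StorageClass basePath, a slash, then the volume (PV) name.
func accessPointRootDir(basePath, volName string) string {
	return basePath + "/" + volName
}

func main() {
	basePath := "/dynamic_provisioning"

	// PV names taken from the two PVCs described earlier in the thread.
	first := accessPointRootDir(basePath, "pvc-48d6b005-3baa-4efc-8664-52f8f297772f")
	second := accessPointRootDir(basePath, "pvc-2cac4519-5d46-4d2f-b60f-db0a9be12f5f")

	fmt.Println(first)
	fmt.Println(second)
	// Each new PVC gets a new PV name, so the root directory changes too.
	fmt.Println("same directory:", first == second)
}
```

Because the PV name is generated per PVC, recreating the PVC always lands in a different directory, whatever basePath is set to.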

Then you can follow the AccessPoints example to create the PV & PVC.

  csi:
    driver: efs.csi.aws.com
    volumeHandle: [FileSystemId]::[AccessPointId]

Also, do not create an empty StorageClass with the efs.csi.aws.com provisioner as written in the example. Instead, ensure storageClassName is not set (storageClassName: ""), or you will get errors on the EFS controller side (StorageClass of type efs.csi.aws.com cannot be created with no parameters).
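For reference, the volumeHandle shown above packs its fields into one colon-separated string. A small Go sketch of splitting it follows. It assumes the empty middle field is an optional sub-path (an assumption here, not something confirmed in this thread); parseVolumeHandle and the IDs are made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// volumeHandle holds the fields of a handle such as
// "[FileSystemId]::[AccessPointId]". SubPath is assumed to be the
// optional middle field.
type volumeHandle struct {
	FileSystemID  string
	SubPath       string
	AccessPointID string
}

// parseVolumeHandle is a hypothetical helper, not part of the driver.
func parseVolumeHandle(h string) (volumeHandle, error) {
	parts := strings.Split(h, ":")
	if len(parts) > 3 || parts[0] == "" {
		return volumeHandle{}, fmt.Errorf("invalid volume handle %q", h)
	}
	vh := volumeHandle{FileSystemID: parts[0]}
	if len(parts) >= 2 {
		vh.SubPath = parts[1]
	}
	if len(parts) == 3 {
		vh.AccessPointID = parts[2]
	}
	return vh, nil
}

func main() {
	// Placeholder IDs, not real resources.
	vh, err := parseVolumeHandle("fs-0123456789abcdef0::fsap-0123456789abcdef0")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("file system=%s access point=%s\n", vh.FileSystemID, vh.AccessPointID)
}
```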

guilr avatar Dec 18 '22 10:12 guilr

Thanks for providing info/details on the issue!

DNS lookup is done at mount time, not periodically, so clients stay mounted to the IP address they resolved when the volume was mounted. When the mount target (MT) is removed, DNS lookup will fail because the MT is deleted.

In the driver, with the current design, we need the volumeName input to create the client token, a string of up to 64 ASCII characters that Amazon EFS uses to ensure idempotent creation when creating Access Points (https://github.com/kubernetes-sigs/aws-efs-csi-driver/blob/564867770adeabf9a2fbf890af131e56256212b0/pkg/cloud/cloud.go#L160).

As a temporary workaround, we could use the approach @guilr provided here: https://github.com/kubernetes-sigs/aws-efs-csi-driver/issues/775#issuecomment-1356767951

We will capture this scenario in the Dynamic Provisioning section of the GitHub docs.

@Tejasvihuded To mock an AZ going down, could we instead edit the security group to block all traffic? That way we would not need to manually delete any MT and impact current usage.

Ashley-wenyizha avatar Dec 29 '22 21:12 Ashley-wenyizha

I am having the same issue in a different scenario: multi-region DR. I am trying to set up a process to recover from a disaster in one region by spinning up another EKS cluster in another region. The EFS file system is replicated to the secondary region. When I spin up the pods in the secondary cluster I find myself in the same situation: the new PVCs have different IDs from the original ones, so the pods cannot access the previous data. I can confirm that fully static provisioning, like the approach suggested by @guilr, works, but unfortunately in my case a fully static approach is a no-go.

Is it possible to enhance the driver so that the name of the folder created on EFS can be customized? Even choosing the PVC name instead of the UID could be sufficient in this kind of scenario.

Thanks

vale21 avatar Jan 20 '23 14:01 vale21

Hi @vale21, in the above example "48d6b005-3baa-4efc-8664-52f8f297772f" is the volume name input from Kubernetes. This volume name is unique and is used to make sure the CreateVolume request is idempotent. Changing the target name therefore does not solve your problem, because the decision to create a new access point or map to the existing one depends on the volume name, which is used as the client token for the EFS CreateAccessPoint API. Please see the references below. Your use case looks more suitable for static provisioning.
https://github.com/container-storage-interface/spec/blob/master/spec.md#createvolume
https://docs.aws.amazon.com/efs/latest/ug/API_CreateAccessPoint.html#efs-CreateAccessPoint-request-ClientToken
https://www.weka.io/blog/cloud-storage/kubernetes-storage-provisioning/#:~:text=The%20main%20difference%20relies%20on,you%20go%20for%20dynamic%20provisioning.
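The role of the client token can be sketched with a small in-memory model in Go. This is a toy stand-in, not the AWS SDK or the driver: fakeEFS and createAccessPoint are hypothetical, and the real API's response to a repeated token may differ (for example, it may return an error). The sketch only models that a repeated token does not produce a second access point, while a new token does.

```go
package main

import "fmt"

// accessPoint is a simplified record of a created access point.
type accessPoint struct {
	ID      string
	RootDir string
}

// fakeEFS is an in-memory stand-in that remembers access points by client token.
type fakeEFS struct {
	byToken map[string]accessPoint
	next    int
}

func newFakeEFS() *fakeEFS {
	return &fakeEFS{byToken: map[string]accessPoint{}}
}

// createAccessPoint returns the existing access point for a known token and
// creates a new one only when the token has not been seen before.
func (f *fakeEFS) createAccessPoint(clientToken, rootDir string) accessPoint {
	if ap, ok := f.byToken[clientToken]; ok {
		return ap
	}
	f.next++
	ap := accessPoint{ID: fmt.Sprintf("fsap-%04d", f.next), RootDir: rootDir}
	f.byToken[clientToken] = ap
	return ap
}

func main() {
	efs := newFakeEFS()
	volA := "pvc-48d6b005-3baa-4efc-8664-52f8f297772f"
	volB := "pvc-2cac4519-5d46-4d2f-b60f-db0a9be12f5f"

	first := efs.createAccessPoint(volA, "/dynamic_provisioning/"+volA)
	retry := efs.createAccessPoint(volA, "/dynamic_provisioning/"+volA)
	other := efs.createAccessPoint(volB, "/dynamic_provisioning/"+volB)

	fmt.Println("retry reuses access point:", first.ID == retry.ID)
	fmt.Println("new volume gets new access point:", first.ID != other.ID)
}
```

Under this model, renaming only the directory would not change which access point a request maps to, since the lookup key is the volume name.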

tijumat avatar Jan 22 '23 18:01 tijumat

Thanks @tijumat, I get your point. I am "stretching" the concept of dynamic provisioning, because I am trying to dynamically provision a volume but having it point to some statically pre-existing data.

However, saving the volume content in a folder named after the PVC UID leads to problems in all those situations where for some reason you lose the PVC object:

  • AZ down
  • Region down
  • Accidental deletion
  • Terraform doing apply/destroy
  • ...

In all those cases you find yourself in a dodgy situation where your EFS File system still contains your data, which makes it theoretically available and recoverable, but practically it is very difficult to reach because it is bound to a PVC that does not exist any longer.

The workaround of statically provisioning a volume with an access point is good in some cases, but does not scale. In a cluster with tens or hundreds of volumes, you will have to manually and statically create tens or hundreds of access points to recover data on EFS, with the additional difficulty of the mapping: you have to know which folder belongs to which new PV, and the IDs of the old PVCs are gone.

vale21 avatar Jan 24 '23 08:01 vale21

Thanks @vale21 for adding more examples for dynamic provisioning.

@tijumat I agree with https://github.com/kubernetes-sigs/aws-efs-csi-driver/issues/775#issuecomment-1398476818

In dynamic provisioning, it would be helpful if there were a way to pass the name of the folder created on EFS. This could be controlled by a flag in the storage class that decides whether to use the PVC name or not. For now, let us assume the driver uses the PVC name when creating the folder on EFS:

1) I create a PVC using the storage class. With the PVC name as the folder name, the access point is created with the path "/basepath/pvcname", where basepath is taken from the storage class.
2) I use this PVC in my pod for some time, and then the EFS mount target used by the PV goes down.
3) The only way forward for me is to delete the PVC, which in turn deletes the PV. Since this is dynamic provisioning, the access point is deleted, but the path and data are still available on EFS.
4) I recreate the PVC using the same storage class, and a new access point is created with the same old path, i.e. "/basepath/pvcname".
5) I use the new PVC in my pod, and my pod is able to read and write the same files on EFS.
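The proposal in the steps above can be sketched in Go as follows. Everything here is hypothetical: rootDirFor and the usePVCName flag only illustrate the requested behaviour and do not correspond to an existing driver parameter discussed in this thread.

```go
package main

import "fmt"

// rootDirFor picks the access point root directory. With usePVCName set
// (a hypothetical storage class flag), the stable PVC name is used instead
// of the generated PV name.
func rootDirFor(basePath, pvName, pvcName string, usePVCName bool) string {
	dir := pvName
	if usePVCName {
		dir = pvcName
	}
	return basePath + "/" + dir
}

func main() {
	basePath := "/basepath"
	pvcName := "pvcname"

	// The PVC is deleted and recreated: same PVC name, new generated PV name.
	before := "pvc-48d6b005-3baa-4efc-8664-52f8f297772f"
	after := "pvc-2cac4519-5d46-4d2f-b60f-db0a9be12f5f"

	fmt.Println("current behaviour keeps path:",
		rootDirFor(basePath, before, pvcName, false) == rootDirFor(basePath, after, pvcName, false))
	fmt.Println("proposed behaviour keeps path:",
		rootDirFor(basePath, before, pvcName, true) == rootDirFor(basePath, after, pvcName, true))
}
```

One trade-off of this design is that two PVCs with the same name under the same basePath would share a directory, so the flag would presumably need to be opt-in.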

@tijumat let me know your thoughts.

Tejasvihuded avatar Jan 27 '23 12:01 Tejasvihuded

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Apr 27 '23 12:04 k8s-triage-robot

/remove-lifecycle stale

vale21 avatar Apr 27 '23 13:04 vale21

/kind feature

RyanStan avatar May 15 '23 14:05 RyanStan

Hi, is there any update on whether this feature will be available, i.e. using the PVC name or a user-defined folder name as the access point path value? If so, is there a timeline for when it will be available?

Tejasvihuded avatar Jun 06 '23 06:06 Tejasvihuded

/lifecycle stale

k8s-triage-robot avatar Jan 21 '24 23:01 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Feb 20 '24 23:02 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-triage-robot avatar Mar 22 '24 00:03 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Mar 22 '24 00:03 k8s-ci-robot