aws-efs-csi-driver
Dynamic provisioning seems to ignore subPathPattern
/kind bug
/kind bug
I've implemented the dynamic access point provisioning approach as outlined here: https://github.com/kubernetes-sigs/aws-efs-csi-driver/blob/master/examples/kubernetes/dynamic_provisioning/README.md. However, I can't get the subPathPattern parameter to have any effect. The access points are created successfully when I create a new PVC, but they are created at a path in EFS that is just the PVC's volume name.
My storage class is configured as follows:
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: efs
parameters:
  directoryPerms: "700"
  ensureUniqueDirectory: "False"
  fileSystemId: <efs_filesystem_id>
  provisioningMode: efs-ap
  subPathPattern: ${.PVC.namespace}/${.PVC.name}
provisioner: efs.csi.aws.com
reclaimPolicy: Delete
volumeBindingMode: Immediate
```
A sample PVC I created is:
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app
  namespace: baseline
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
  storageClassName: efs
```
With this I would expect the path in EFS to be /baseline/app but instead it is /pvc-51402212-7b56-4841-bab5-9b34bdb31b4f (the dynamically provisioned volume name).
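For reference, here is a minimal sketch of how I expect the subPathPattern template to expand for the PVC above (the helper below is purely illustrative, not the driver's actual code; the function name and substitution logic are my own):

```python
import re

def resolve_sub_path_pattern(pattern: str, pvc_namespace: str, pvc_name: str) -> str:
    """Illustrative expansion of the storage class subPathPattern template.

    Supports the ${.PVC.namespace} and ${.PVC.name} tokens used above.
    This is a sketch of the behavior I expect, not the CSI driver's
    implementation.
    """
    replacements = {
        ".PVC.namespace": pvc_namespace,
        ".PVC.name": pvc_name,
    }
    # Replace each ${...} token with its value; unknown tokens are left as-is.
    return re.sub(
        r"\$\{([^}]+)\}",
        lambda m: replacements.get(m.group(1), m.group(0)),
        pattern,
    )

# With the sample PVC above, I expect the EFS directory to be:
print("/" + resolve_sub_path_pattern("${.PVC.namespace}/${.PVC.name}", "baseline", "app"))
```

That is, the access point root should land at /baseline/app rather than at the generated volume name.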
Hi @laconictae, are you using v1.7.0? This feature was only released in v1.7.0.
Hello @mskanth972, and thanks for the response! As it happens, I was on v1.6.0 and upgraded to v1.7.0 late yesterday afternoon, which (after a lot of trial and error with ensureUniqueDirectory and reuseAccessPoint) did get me where I wanted with the path in EFS. However, I now have a new problem, and I'm wondering whether there is a workaround, or whether I'm just not really understanding the paradigm here...
I have paths in EFS like the following:
```
/namespace_a/application_a
/namespace_a/application_b
/namespace_b/application_a
```
The issue I'm now running into: the access points get created successfully, each with a POSIX user/group ID from the pool specified by the gidRangeStart and gidRangeEnd parameters on my storage class. All well and good. However, the volume mounted into my pods has completely wrong ownership. For instance, Application A in Namespace A might get a dynamically provisioned access point with POSIX user 6999999. My pod for Application A in Namespace B clearly references a different dynamically provisioned access point, with POSIX user 6999998, yet inside that pod the volume shows up as owned by 6999999.
I was completely confused by this, and I'm not entirely sure it isn't a bug in EFS itself. I manually mounted the access points on a node to explore the permissions, and although I have three separate access points with different POSIX users (6999998, 6999999, 7000000), all three mounted as owned by user 6999999. This leaves my pods without permission on their mounted volume, which is obviously not ideal.
Interestingly, at one point, with the provisioner crashing due to some idiocy on my part, I had a queue of PVCs waiting to be created, all of which were provisioned simultaneously once I corrected the provisioner configuration. That created three access points in EFS, all with the same POSIX user, despite the three separate paths. In that particular case all the pods were happy and had access to their mounted volumes. However, I can't control which POSIX user is used for the access points; I can only provide a range.
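To make my expectation concrete: I assumed gidRangeStart/gidRangeEnd define a pool from which each new access point receives a distinct POSIX UID/GID. A minimal sketch of that assumed behavior (this allocator is my own illustration, not the driver's code):

```python
class GidAllocator:
    """Illustrative GID pool: each new access point should receive a
    distinct POSIX UID/GID from the configured range. A sketch of the
    behavior I expected, not the CSI driver's implementation."""

    def __init__(self, gid_range_start: int, gid_range_end: int):
        self.available = list(range(gid_range_start, gid_range_end + 1))
        self.in_use: dict[str, int] = {}

    def allocate(self, access_point_path: str) -> int:
        if not self.available:
            raise RuntimeError("GID range exhausted")
        gid = self.available.pop()  # hand out from the top of the range
        self.in_use[access_point_path] = gid
        return gid

# Three PVCs provisioned at distinct paths should get three distinct GIDs
# from the range configured on my storage class:
pool = GidAllocator(6999001, 7000000)
gids = [pool.allocate(p) for p in ("/namespace_a/application_a",
                                   "/namespace_a/application_b",
                                   "/namespace_b/application_a")]
```

Under that model, the three-access-points-with-one-GID result I saw should not be possible.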
I'm still digging at this point and trying to understand how this all works. I'll admit this area is really not my forte, so I'm bumbling around a lot, but I'm starting to wonder whether what I'm trying to do is an anti-pattern and I ought to just admit defeat and use the static access point approach instead.
Hi @laconictae I am not sure I understand the issue you are experiencing. Is the issue that you would like to mount a volume with a particular UID but you aren't getting the UID you expect?
Could you please provide the storage class you are using as well as a list of steps to recreate the issue?
Hello @seanzatzdev-amazon - I don't really mind which UID is used, but I was seeing pods mount EFS volumes that, on exec-ing into the pod, turned out to be owned by a different POSIX user than the one specified on the access point, so I got permission-denied errors just trying to ls the mounted volume inside the pod.
I feel with the various things I was trying with dynamic provisioning, I might have put my EFS filesystem in a bad state. I created a new EFS filesystem and re-provisioned access points there, that has been working (seemingly) fine so far. I do see some weirdness when multiple access points get provisioned simultaneously, they end up with the same POSIX user permissions - but that doesn't break anything for me.
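My working theory for the simultaneous-provisioning weirdness is a read-then-allocate race: if each request picks its GID from a snapshot of the GIDs already in use, concurrent requests can read the same snapshot and hand out the same GID. This mechanism is an assumption on my part, not the driver's actual code; a minimal sketch:

```python
def next_gid(used: list[int], range_start: int) -> int:
    """Pick the next free GID given the GIDs already in use (a sketch of
    an assumed, unsynchronized allocation strategy)."""
    return max(used, default=range_start - 1) + 1

used = [6999998]
# Two requests provisioned "simultaneously" each read the same snapshot
# of in-use GIDs before either records its allocation:
a = next_gid(used, 6999001)
b = next_gid(used, 6999001)
print(a == b)  # both got the same GID - matching what I observed
```

If something like this is happening, serializing allocation (or recording each allocation before computing the next) would restore distinct GIDs.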
What I have set up now for the storage class:
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: efs-dynamic
parameters:
  directoryPerms: "750"
  ensureUniqueDirectory: "False"
  fileSystemId: <file_system_id>
  gidRangeEnd: "7000000"
  gidRangeStart: "6999001"
  provisioningMode: efs-ap
  reuseAccessPoint: "False"
  subPathPattern: ${.PVC.namespace}/${.PVC.name}
provisioner: efs.csi.aws.com
reclaimPolicy: Delete
volumeBindingMode: Immediate
```
This gives me a predictable path within EFS. If I do not disable the unique-directory setting, a GUID for the PVC is appended to the path, and I seemingly have no way to get at that directory again if I re-deploy the application (the access point is removed, then a new access point is created with a new GUID). I also tried setting reuseAccessPoint to true, but found that the same access point was used for two different PVCs in different namespaces, with the provisioner just updating the access point based on whichever was deployed most recently.
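To illustrate the trade-off I'm describing, the effect of ensureUniqueDirectory on the directory path can be sketched like this (behavior inferred from what I observed; the exact separator and suffix format are assumptions, and this is not the driver's code):

```python
def access_point_dir(base_path: str, volume_name: str, ensure_unique: bool) -> str:
    """Sketch of the directory chosen for a new access point.

    With ensureUniqueDirectory enabled, the provisioned volume's name is
    appended, so a re-deployed PVC lands in a fresh directory; with it
    disabled, the path is stable across re-deploys but can collide with
    a directory left behind by a previous deployment.
    """
    return f"{base_path}-{volume_name}" if ensure_unique else base_path

# Stable path, reusable across re-deploys (my current setup):
print(access_point_dir("/baseline/app", "pvc-51402212", False))
# Unique path, unreachable after the access point is deleted and recreated:
print(access_point_dir("/baseline/app", "pvc-51402212", True))
```

The stable path is what I want, but it is exactly what exposes me to the stale-ownership concern below.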
So this seems to work, but what I'm worried about is getting into a mess again where, due to removing and re-deploying applications, the provisioner creates an access point with a previously used path but a different POSIX user than was originally used. I'm worried that will result in permission errors again. Maybe my storage class reclaimPolicy should be Retain?
Hi @laconictae Just to clarify, are you reporting an issue in the CSI Driver code? We use this GitHub Issues page to track things like community-reported bugs, and thus it may not be the best place to get support on this problem.
ensureUniqueDirectory is a storage class feature meant to prevent conflicting access point directory paths, so you may want to consider re-enabling this feature.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Mark this issue as fresh with `/remove-lifecycle stale`
- Close this issue with `/close`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Mark this issue as fresh with `/remove-lifecycle rotten`
- Close this issue with `/close`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Reopen this issue with `/reopen`
- Mark this issue as fresh with `/remove-lifecycle rotten`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.