aws-efs-csi-driver
aws-efs-csi-driver copied to clipboard
Allow more control of the name of the directory created by the Access Point under Dynamic Provisioning
This is a continuation of #568, as that PR was opened incorrectly and messing around with it was much more trouble than just opening a new PR
Is this a bug fix or adding new feature? This adds a new feature, mostly in response to discussions on Issue https://github.com/kubernetes-sigs/aws-efs-csi-driver/issues/543
What is this PR about? / Why do we need it?
At the moment if you use dynamic provisioning the name of the folder created on the EFS is the name of the generated PV however in some use cases this isn't desirable and leads to a lot of confusion. Under this scheme the name of the directory can be controlled by users using a new parameter subPathPattern. This allows users a higher level of control over the directories that are created as they can specify any combination of fixed strings and a set of predefined variables based on what the CSI Spec supports. To stop this causing problems with security, there is additionally a new featureFlag added called ensureUniqueDirectories
which appends a UUID to the created structure so clashes will be avoided. This is true by default but allows users to explicitly opt into an unsafe mode if they wish to provide the guarantee that there will be not be clashes themselves rather than having the driver enforce it for them.
What testing is done? I have added to the unit tests for the module and I believe this is sufficient because all that is being changed is the directory being added to the AccessPointOpts struct. There might want to be some E2E testing about what happens if that directory already exists but I'm happy to take pointers on that as this is my first PR to the repo.
fixes #543
Hi @jonathanrainer. Thanks for your PR.
I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test
on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.
Once the patch is verified, the new status will be reflected by the ok-to-test
label.
I understand the commands that are listed here.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Right, sorry about the PR wrangling, but now we're back I can continue the conversation from #568.
@iAnomaly, @wolffberg - This is where the PR continues. @wolffberg I've addressed your points about validating the created EFS path by explicitly validating it before we send it off the API and have added tests to that effect.
@wongma7 - I had a long think about your comments re: security and think I've come up with a compromise. My general view is always that things should be secure by default but should allow you to turn off those controls in the knowledge that you're explicitly opting in to having to manage those guarantees yourself. So to that end I've introduced another parameter ensureUniqueDirectories
which adds a UUID to the end of any directory path specified to ensure that directories are unique. This is on by default so I imagine most users won't notice. You can turn this off to get the behaviour originally specified but throughout the README's and examples it makes it clear that then you are opting in to the responsibility to manage the guarantees yourself. This way it's not a surprise for users and it would require you to make a bad change in two places for something bad to happen.
I hope this is satisfactory and if not I'm more than happy to have a discussion about it. Further I realise some of the Go may not be 100% correct as I'm fairly new to the language and some of the most common idioms still elude me so feel free to comment away but hopefully this is a much better proposition than that proposed previously.
This is awesome @jonathanrainer, really appreciate all your time on this and all the cases you covered with tests. I went looking for the omitted basePath
, empty subPathPattern
case (which achieves the AWS EFS Console default behavior of just /
as the root path for the Access Point) and sure enough you've got that in there.
Thanks again, this will be an awesome flexible feature once merged.
/assign wongma7
/ok-to-test
I will not have time to review for a while but in principle yes I want this merged, thanks for patience
I've run into a use case where these changes would be highly desirable. Thanks team!
Just to prove this actually works, I spun this up on an EKS cluster with the following StorageClass
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
creationTimestamp: "2022-06-12T15:46:35Z"
labels:
provisioning-type: dynamic
name: efs-dynamic
resourceVersion: "16864"
uid: 983866ac-c731-45ac-a22d-1d1edaf7dbda
parameters:
basePath: ""
directoryPerms: "777"
ensureUniqueDirectory: "false"
fileSystemId: fs-05427a7b4af6c1fcc
gidRangeEnd: "2000"
gidRangeStart: "1000"
provisioningMode: efs-ap
subPathPattern: /dynamic/${.PVC.name}/foo
provisioner: efs.csi.aws.com
reclaimPolicy: Delete
volumeBindingMode: Immediate
Which then generated the following on disk:
So my Unit Test predictions were good and it also showed the value of all the error handling as I got the StorageClass wrong a few times and every time my error handling directed me to the issue quickly.
Hey @jonathanrainer
Here at YData we are testing your solution, and fix a big issue for our problem, so thank you and hope this PR get merged soon in order to use the official version.
Our storage class looks like this:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: sc-new
provisioner: efs.csi.aws.com
parameters:
provisioningMode: efs-ap
fileSystemId: fs-xxxxx
directoryPerms: "0770"
gid: 1000
uid: 1000
subPathPattern: "/efs"
ensureUniqueDirectory: "false"
reclaimPolicy: Retain
volumeBindingMode: Immediate
About the issue: https://github.com/kubernetes-sigs/aws-efs-csi-driver/issues/715
We are testing, and we notice that new access points are created each time. For example, here where we have multiple PVC's and each PVC using the same directory.
About this issue: https://github.com/kubernetes-sigs/aws-efs-csi-driver/issues/516
We found out, and maybe is worthy of put in the docs, is that with SubPath we can't write to the root path:
SubPaths in the pods will not work with:
subPathPattern: "/"
ensureUniqueDirectory: "false"
But after setting to subPathPattern: "/efs"
or any other path everything works great.
Hi @andre-lx, thanks for the very kind words,
With regards your first point, I'd never really considered that access points could be re-used, but you make a very fair point. I'll need to be a bit careful about that intersects with other features, for example if you use the GID range features then you would need a new access point each time because each one would be for a different GID but I don't think it's beyond the realms of possibility to check that is the case, it's also probably a case where adding an E2E test would be a good idea so thanks for spotting that.
In regards the second point that seems like a bug/gap in my unit testing. I'm going to have a good look at that and hopefully they'll be a fix fairly soon. I'm quite busy this week but hopefully will get a chance to look over this at the weekend and will post the results of my findings when I have them.
Thanks so much for taking an interest, and I look forward to getting to the bottom of these issues.
Hi @andre-lx,
I've looked into both the issues you've raised and I'll respond to them in turn:
- With the issue around setting
subPathPattern: "/"
ensureUniqueDirectory: "false"
I've run some tests and it appears the code is operating exactly as I'd expect, i.e. the request to AWS requests an accessPoint that's root directory is "/". What error are you seeing when this happens? And further what use case does this service? Setting the subPath pattern like that would simply allow you to gain access to the root of the directory, which I guess is useful, but if that's what you want then why not just create a PV/PVC statically and mount that PVC into multiple pods? It seems simpler and like it would accomplish what you want without needing this feature. This might also be helped by the work done in #732.
- With the issue around re-use of access points, I've had a look into this and I'm not sure it's a good idea, for one thing it introduces the need for concurrency controls and entails quite a lot of book-keeping on the drivers side to make sure the set of access points is correct with respect to the cluster. You have to guarantee that you can lock access to the AWS API so that two concurrent requests from a PVC end up sharing one and also you'd have to ensure that when one PVC got deleted the access point didn't disappear from the other PVCs that were using it.
However I think that the recent PR I've raised in #732 might help you out with the problem you're having, and that one combined with this one will give a similar kind of functionality without depleting the access points available. If this is something that would still be useful then it's probably something a future PR could tackle but I think it's scope creep for this current PR.
Hi @jonathanrainer, since you plan to work and finalize #732, do you plan to finalize this PR as well? That would be a very nice evolution
@ChamMach Yer that's the plan, I want to get both merged but obviously need reviews and stuff
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: iAnomaly, jonathanrainer
Once this PR has been reviewed and has the lgtm label, please ask for approval from wongma7 by writing /assign @wongma7
in a comment. For more information see:The Kubernetes Code Review Process.
The full list of commands accepted by this bot can be found here.
Approvers can indicate their approval by writing /approve
in a comment
Approvers can cancel approval by writing /approve cancel
in a comment
Hey! This is highly desirable by my side... Any time frame on it going merged? Or anyway of using this feature as a tester @jonathanrainer ? @andre-lx maybe you can describe what you did to test the solution? (i.e. build Dockerfile etc.) That would really help me and my case 😄
EDIT: Managed to get it to work (sorry for my noobishness). My use case was that a app from vendor required a storageClass but the problem was that every iteration of the app or version bump a new namespace was created, therefore creating a new EFS AP without the needed files on the previous AP path. This ensured that there was no data lost across every iteration or namespace.
I'm still in the process of finishing the implementation but I'm fairly confident that it should work. Thank you for this! 🥂
Hello, any news on merging this? This will save a lot of trouble. Currently sharing an EFS access point for shared services let's say in a cluster, leads us to create a PV and a PVC in each of the 30 namespaces (one per microservice) that reside in a cluster.
If this could get merged, we can use a shared services storage class with dynamic provisioning instead and a fixed subpath so we could create all those dynamically using only the PVC's and reducing the clutter.
Any news on this one? This will also save us loads of work
@Ashley-wenyizha Sorry I've been so long to sort this out, am really eager to get this merged soon if we can and happy to do whatever's necessary to make that happen :)
Exciting times! So looking forward to this!
Hello, any news on it being merged? This piece would make our life so much easier!
Are there any updates? I'm waiting to be merged.
Hi @Ashley-wenyizha, I know you said https://github.com/kubernetes-sigs/aws-efs-csi-driver/pull/732#issuecomment-1536595665 that this issue would be progressing as well, has there been any further progress? More than happy to help in any way I can :)
Thought it would be useful just to point out this PR resolves a Disaster Recovery issue that can arise when using dynamic provisioning.
Consider - a cluster running disparate workloads, with dynamic PVC's defined in mutiple repo's. Now imagaine the cluster completely dies and requires rebuilding all the deployments from manifests. The present dynamic provisioning model would be very awkward to reconnect the EFS state directories back up to the correct deployment. You'd need to modify all the PVC's (in all their different repos) and essentially convert them to static provisioning to define the path where the files still exist. In which case you may as well just start with static provisioning model....
Thanks to the people who have put considerable effort into this PR - hope it gets merged soon.
Hi Jonathan @jonathanrainer,
Thanks for the patience and apologize about the delay, this PR lgtm, could you give it a rebase and squash into 1-3 commits? We can merge this afterwards and you can continue to revise PR 732
thanks!
bump!
Hi Jonathan @jonathanrainer,
this PR lgtm, could you give it a pull + rebase and squash into 1-3 commits? We can merge this afterwards and you can continue to revise PR 732
thanks!
@Ashley-wenyizha Hi, sorry for the delay, real life stuff got a bit on top of me! But have done what you wanted now and this is ready to merge as and when you'd like :)
@Ashley-wenyizha FYI I had to make a small change to the sanity tests because the EFS API Request won't accept paths more than 100 characters but the sanity attempts to use a volume name of 128 characters. In this case I think it's fine because this is the mechanism by which you'd get around exactly this problem but if you think there's a better solution then let me know
/lgtm /approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: Ashley-wenyizha, iAnomaly, jonathanrainer
The full list of commands accepted by this bot can be found here.
The pull request process is described here
- ~~OWNERS~~ [Ashley-wenyizha]
Approvers can indicate their approval by writing /approve
in a comment
Approvers can cancel approval by writing /approve cancel
in a comment
This is fantastic. Any idea what version we can expect this feature to be available in?
Thanks for all the work that went into this.
@jonathanrainer It looks like the new properties (subPathPattern
, ensureUniqueDirectory
) were released with helm chart 2.4.9, but not in the driver appVersion 1.6.0 that this helm chart uses.
When will these properties be supported in a released appVersion?
Also, note the typo in the helm docs: s/ensureUniqueDirectories/ensureUniqueDirectory/