
Multi-user support in NFS shares

Open viktoriaas opened this issue 4 years ago • 6 comments

Hi, another question 😅 Have you thought about how to prevent users from mounting other users' PVCs in NFS?

We have one export share. When a user creates a Dataset, he/she needs to specify the path. Let's say the path is /nfs/export and the option createDirPVC: "true". In this case, the user gets his/her own share at /nfs/export/myshare. However, nothing stops a user from mounting the whole export simply by specifying the path as /nfs/export and setting createDirPVC: "false".
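For illustration, the two cases look roughly like this. The field names follow my description above, the server address and Dataset names are placeholders, and the exact spec keys may differ in your DLF version:

```yaml
# Sketch of the two cases described above; field names follow the thread's
# wording (path, createDirPVC), server and names are placeholders.
apiVersion: com.ie.ibm.hpsys/v1alpha1
kind: Dataset
metadata:
  name: myshare
spec:
  local:
    type: "NFS"
    server: "nfs.example.com"
    path: "/nfs/export"
    createDirPVC: "true"     # user gets /nfs/export/myshare and only that is mounted
---
apiVersion: com.ie.ibm.hpsys/v1alpha1
kind: Dataset
metadata:
  name: grab-everything
spec:
  local:
    type: "NFS"
    server: "nfs.example.com"
    path: "/nfs/export"
    createDirPVC: "false"    # nothing prevents mounting the whole export
```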

I think this is a major issue in multi-tenancy environments and therefore makes the framework unusable because of the security problem. Maybe if the Helm chart was up and ready for use, the path could be configurable somewhere in the values and the user wouldn't actually have to specify whether he/she wants to create a directory; the default behaviour would be to create a directory named path + Dataset name and mount only the resulting path in the PVC.

viktoriaas avatar Jan 21 '21 16:01 viktoriaas

That's an interesting issue. There are certainly a few ways we can approach it. I was thinking the following: you could prevent users from working with Datasets altogether. It's straightforward to modify the RBAC, but let me know if you have problems with that. They would still be allowed to use the PVCs created from the Datasets.
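As a rough sketch of that idea (the namespace and group names here are just placeholders, not anything DLF ships): a Role that grants users access to PVCs but simply omits the Dataset API group, so only admins can create or modify Datasets.

```yaml
# Minimal sketch: users bound to this Role can read PVCs but have no rule
# for the "com.ie.ibm.hpsys" API group, so they cannot touch Datasets.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pvc-user
  namespace: team-a            # placeholder namespace
rules:
- apiGroups: [""]
  resources: ["persistentvolumeclaims"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pvc-user-binding
  namespace: team-a
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: pvc-user
subjects:
- kind: Group
  name: team-a-users           # placeholder group of non-admin users
  apiGroup: rbac.authorization.k8s.io
```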

YiannisGkoufas avatar Jan 25 '21 10:01 YiannisGkoufas

That could be an option; however, it gets too static then. We aim for dynamic provisioning because, with the number of PVCs created on demand, it wouldn't be sustainable to work like this. We have some workflows which create PVCs on the fly, and right now we are using a deprecated chart because with DLF we would have to add Dataset creation everywhere (unrealistic), but we really want to use this framework.

The best option would be to provide dynamic provisioning. The next best would be to allow users to deploy Datasets (so it is at least a bit flexible and they don't have to ask for every piece of storage) but keep the export path hidden from users. Then the directory would always be created under that path according to the Dataset name, and a PVC with the same name provided.

I think the option of mounting an already existing directory would cease to exist, as otherwise the problem of mounting someone else's directory would remain. But I don't think it's a big loss: if you want certain files prepared in advance, use an initContainer to copy/create them in the Deployment or Pod.
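For example, a minimal sketch of that (the PVC name and the files created are placeholders):

```yaml
# Illustrative only: pre-populating a Dataset-backed PVC from an
# initContainer instead of mounting a pre-existing directory.
apiVersion: v1
kind: Pod
metadata:
  name: seeded-workload
spec:
  initContainers:
  - name: seed-data
    image: busybox
    # create (or copy in) whatever files the main container expects
    command: ["sh", "-c", "mkdir -p /data/inputs && echo 'prepared' > /data/inputs/README"]
    volumeMounts:
    - name: workdir
      mountPath: /data
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "ls -R /data && sleep 3600"]
    volumeMounts:
    - name: workdir
      mountPath: /data
  volumes:
  - name: workdir
    persistentVolumeClaim:
      claimName: myshare   # the PVC created from the Dataset
```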

How do you feel about that? Have you thought about dynamic provisioning, or are there other problems standing in the way?

viktoriaas avatar Jan 25 '21 10:01 viktoriaas

Hi @viktoriaas, thanks a lot for the details on your use case. It definitely makes sense; I am just trying to model it in a way that keeps the conventions valid for DLF. Would something like this work:

  • We introduce a new CRD, DatasetBase, which will have the same specs as Dataset plus a few more, for instance "allowOverride"
  • We add the functionality so that the Dataset can inherit the specs from a DatasetBase
  • The admin creates a DatasetBase with any root they want
  • The users are able to create Datasets extending the DatasetBase the admins specify. That way, if the admin has set "allowOverride": false, the users won't be able to mount any path they want on NFS (see the sketch after this list).

Bear in mind that this requires a bit of work from our side.
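Purely as a sketch of the proposal, not something that exists yet; every field name here is invented for illustration:

```yaml
# Hypothetical only — DatasetBase does not exist in DLF at the time of
# writing; field names are made up to illustrate the idea above.
apiVersion: com.ie.ibm.hpsys/v1alpha1
kind: DatasetBase                    # admin-owned, hides the NFS root
metadata:
  name: team-nfs-root
spec:
  local:
    type: "NFS"
    server: "nfs.example.com"
    path: "/nfs/export"
  allowOverride: false               # users may not replace the path
---
apiVersion: com.ie.ibm.hpsys/v1alpha1
kind: Dataset
metadata:
  name: myshare
spec:
  base: team-nfs-root                # inherits server/path from the DatasetBase
```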

We really appreciate the fact that you have embraced the framework so much and are coming up with all these ideas and contributions!

YiannisGkoufas avatar Jan 25 '21 17:01 YiannisGkoufas

@YiannisGkoufas The steps you described sound great. Of course I don't expect you to have them done in a second 😄 We will wait, and until then use our current solution. Let me know when you need something or would like to test!


Just a side note: have you thought about supporting dynamic provisioning, or is this the final solution?

viktoriaas avatar Jan 26 '21 10:01 viktoriaas

The current functionality is already considered dynamic provisioning, in the sense that we create persistent volume claims on the fly. Now, if you prefer to work with StorageClasses instead of DatasetBase/Dataset, you can use the CSI NFS provisioner we bundle with DLF directly as well: https://github.com/kubernetes-csi/csi-driver-nfs/blob/master/deploy/example/README.md
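For reference, the example StorageClass from that repo looks roughly like the following; the server and share parameters would of course need to point at your own export:

```yaml
# Roughly the example storageclass-nfs.yaml from csi-driver-nfs;
# replace server/share with your own NFS endpoint and export path.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-nfs
provisioner: nfs.csi.k8s.io
parameters:
  server: nfs-server.default.svc.cluster.local   # your NFS server
  share: /                                       # your export path
reclaimPolicy: Retain
volumeBindingMode: Immediate
mountOptions:
  - hard
  - nfsvers=4.1
```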

YiannisGkoufas avatar Jan 27 '21 10:01 YiannisGkoufas

I assume that if I create an NFS StorageClass (let's call it csi-nfs), then it is enough to specify the storageClass in the PVC and volume, and everything else will be created.
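In other words, something along these lines should be all that's needed (a sketch; the claim name and size simply mirror what I try further below):

```yaml
# What I assume should be enough: a PVC that only references the
# StorageClass, with the PV provisioned automatically by the NFS CSI driver.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: galaxy-galaxy-pvc
  namespace: galaxy-ns
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  storageClassName: csi-nfs
```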

Now, I have problems understanding all the steps that have to be done. I tried to deploy csi-driver-nfs according to the link you've provided (sorry, I hadn't seen it before because there is no direct link from the main dlf repo :) ) but wasn't successful at all. In the end, even Dataset creation stopped working. However, I would like to know the steps I have to take to enable deployment only by defining a StorageClass in the PVC.

  1. The first step is to set up an NFS server on the Kubernetes cluster. However, what if I have an existing server and share path? It is described there that a new service will be created

which exposes the NFS server endpoint nfs-server.default.svc.cluster.local and the share path /.

But I have a different IP and share path. Anyway, I used the provided command to deploy the nfs-server.

  2. Then I should install the nfs-csi driver via the provided link; that's okay, installation was successful.
kube-system     csi-nfs-controller-7fb595656-mbm6p               3/3     Running        0          35m
kube-system     csi-nfs-controller-7fb595656-pnxdt               3/3     Running        0          35m
  3. Then I should deploy https://raw.githubusercontent.com/kubernetes-csi/csi-driver-nfs/master/deploy/example/storageclass-nfs.yaml to have a StorageClass, but this file again features that odd path and export. Here I've changed the server and share to our server IP and share path.
csi-nfs (default)   nfs.csi.k8s.io                         Retain          Immediate           false                  3h10m
  4. I create the PVC and... nothing happens. It stays in Pending forever.
galaxy-ns     galaxy-galaxy-pvc               Pending                                                                        csi-nfs        5m8s

logs from nfs-controller:

E0127 15:58:50.450860       1 utils.go:89] GRPC error: rpc error: code = Internal desc = failed to mount nfs server: rpc error: code = Internal desc = mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t nfs -o hard,nfsvers=4.1 147.251.6.50:/gpfs/vol1/nfs/wes /tmp/pvc-6cced430-5acf-4644-91a7-b279559e3386
Output: mount.nfs: Operation not permitted
I0127 15:58:52.410865       1 utils.go:84] GRPC call: /csi.v1.Controller/CreateVolume
I0127 15:58:52.410979       1 utils.go:85] GRPC request: {"capacity_range":{"required_bytes":10737418240},"name":"pvc-ee3cfb7f-14ec-4a81-9689-a5583842ad9a","parameters":{"server":"147.251.6.50","share":"/gpfs/vol1/nfs/wes/"},"volume_capabilities":[{"AccessType":{"Mount":{"mount_flags": ["hard","nfsvers=4.1"]}},"access_mode":{"mode":5}}]}
I0127 15:58:52.418528       1 controllerserver.go:249] internally mounting 147.251.6.50:/gpfs/vol1/nfs/wes at /tmp/pvc-ee3cfb7f-14ec-4a81-9689-a5583842ad9a
I0127 15:58:52.418640       1 nodeserver.go:77] NodePublishVolume: volumeID(147.251.6.50/gpfs/vol1/nfs/wes/pvc-ee3cfb7f-14ec-4a81-9689-a5583842ad9a) source(147.251.6.50:/gpfs/vol1/nfs/wes) targetPath(/tmp/pvc-ee3cfb7f-14ec-4a81-9689-a5583842ad9a) mountflags([hard nfsvers=4.1])
I0127 15:58:52.418728       1 mount_linux.go:146] Mounting cmd (mount) with arguments (-t nfs -o hard,nfsvers=4.1 147.251.6.50:/gpfs/vol1/nfs/wes /tmp/pvc-ee3cfb7f-14ec-4a81-9689-a5583842ad9a)
E0127 15:58:52.645433       1 mount_linux.go:150] Mount failed: exit status 32

So I lowered nfsvers to 3. The logs then show:

I0127 16:06:46.754177       1 mount_linux.go:146] Mounting cmd (mount) with arguments (-t nfs -o hard,nfsvers=3 147.251.6.50:/gpfs/vol1/nfs/wes /tmp/pvc-887aec5b-8e43-4665-967a-e4f8d0248a1c)
E0127 16:06:49.454905       1 mount_linux.go:150] Mount failed: exit status 255
Mounting command: mount
Mounting arguments: -t nfs -o hard,nfsvers=3 147.251.6.50:/gpfs/vol1/nfs/wes /tmp/pvc-887aec5b-8e43-4665-967a-e4f8d0248a1c
Output: 
E0127 16:06:49.455350       1 utils.go:89] GRPC error: rpc error: code = Internal desc = failed to mount nfs server: rpc error: code = Internal desc = mount failed: exit status 255
Mounting command: mount
Mounting arguments: -t nfs -o hard,nfsvers=3 147.251.6.50:/gpfs/vol1/nfs/wes /tmp/pvc-887aec5b-8e43-4665-967a-e4f8d0248a1c
Output: 

How can I fix this? This would be very nice to have.

EDIT: I'm keeping all of the above in case someone comes across the same issue, but we found out that the nfs-controller is failing on memory. dmesg output:

[ 4288.088143] Memory cgroup out of memory: Killed process 153388 (mount.nfs) total-vm:156416kB, anon-rss:70416kB, file-rss:4200kB, shmem-rss:0kB, UID:0

In this file we have increased the limits on this line and this one 10 times (just added a 0). Everything works as expected. I think this solution is even better than Dataset + DatasetBase because no one can mount anything else. Do you still want to work on the Dataset approach? I think this is perfect.
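For anyone following along, the change has this shape; the container's original values in the manifest may differ from what is shown here, the point is simply a ~10x larger memory limit so that mount.nfs is not OOM-killed inside the controller pod's cgroup:

```yaml
# Shape of the fix only, not the literal diff: raise the memory limit on the
# csi-nfs-controller containers (we just appended a 0 to the existing value).
resources:
  limits:
    memory: 3000Mi   # e.g. was 300Mi before the change
  requests:
    cpu: 10m
    memory: 20Mi
```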

viktoriaas avatar Jan 27 '21 17:01 viktoriaas

Closing this issue as it is not in our present planning.

srikumar003 avatar Jan 27 '23 10:01 srikumar003