GetCapacityRequest: do VolumeCapabilities affect the result?
Kubernetes 1.19 will be the first release that actually does something with the CSI GetCapacity call: https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/1472-storage-capacity-tracking
Because the Kubernetes scheduler cannot communicate directly with the CSI driver, we are caching capacity information. Doing that for all combinations of the GetCapacity parameters is not practicable, so we made some simplifying assumptions. One of them is that the capacity does not depend on the VolumeCapabilities in the GetCapacityRequest. Is that reasonable, or are there CSI drivers that really take the volume capabilities into account when calculating the response?
If there is no use-case for this parameter, should it perhaps be officially deprecated?
@pohly yes, we do use that in some cases. For example, some storage drivers do not support certain filesystems, and we want to report a capacity of 0 from the driver if the VolumeCapabilities specify such a filesystem.
https://github.com/mesosphere/csilvm/blob/master/pkg/csilvm/server.go#L907-L919
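For illustration, a minimal driver-side sketch of that pattern, assuming the standard CSI Go bindings (the supportedFilesystems set and freeBytes helper are hypothetical stand-ins, not taken from the linked code):

```go
package driver

import (
	"context"

	"github.com/container-storage-interface/spec/lib/go/csi"
)

// supportedFilesystems is a hypothetical allowlist maintained by the driver.
var supportedFilesystems = map[string]bool{"ext4": true, "xfs": true}

type server struct{}

// freeBytes is a hypothetical stand-in for querying the backing storage.
func (s *server) freeBytes() int64 { return 0 }

// GetCapacity reports zero capacity when the requested filesystem is one the
// driver cannot provision, and the actual free space otherwise.
func (s *server) GetCapacity(ctx context.Context, req *csi.GetCapacityRequest) (*csi.GetCapacityResponse, error) {
	for _, vc := range req.GetVolumeCapabilities() {
		mnt := vc.GetMount()
		if mnt != nil && mnt.GetFsType() != "" && !supportedFilesystems[mnt.GetFsType()] {
			// No volume with this filesystem can ever be provisioned, so the
			// usable capacity for these capabilities is zero.
			return &csi.GetCapacityResponse{AvailableCapacity: 0}, nil
		}
	}
	return &csi.GetCapacityResponse{AvailableCapacity: s.freeBytes()}, nil
}
```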
The reason for having VolumeCapabilities in GetCapacity is that GetCapacity and CreateVolume are correlated. As you can see in the comments of GetCapacityRequest.volume_capabilities:
// If specified, the Plugin SHALL report the capacity of the storage
// that can be used to provision volumes that satisfy ALL of the
// specified `volume_capabilities`. These are the same
// `volume_capabilities` the CO will use in `CreateVolumeRequest`.
Some CO might decide to get the capacity for a certain storage "profile" (a combination of volume capabilities and parameters), and only call CreateVolume if there's enough capacity to provision the new volume with those capabilities and parameters.
If we were to remove VolumeCapabilities from GetCapacity, we probably should remove it from CreateVolume too (thus I don't think that's the right move).
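To make that workflow concrete, here is a minimal CO-side sketch; the helper name and error handling are mine, and it only assumes the standard csi.ControllerClient API:

```go
package co

import (
	"context"
	"fmt"

	"github.com/container-storage-interface/spec/lib/go/csi"
)

// provisionIfFits is a hypothetical CO-side helper: it queries the capacity
// for one storage profile and only provisions if the new volume would fit.
func provisionIfFits(ctx context.Context, ctrl csi.ControllerClient, name string,
	sizeBytes int64, caps []*csi.VolumeCapability, params map[string]string) (*csi.Volume, error) {

	// Ask for the capacity usable with exactly the capabilities and
	// parameters that the later CreateVolume call will use.
	capResp, err := ctrl.GetCapacity(ctx, &csi.GetCapacityRequest{
		VolumeCapabilities: caps,
		Parameters:         params,
	})
	if err != nil {
		return nil, err
	}
	if capResp.GetAvailableCapacity() < sizeBytes {
		return nil, fmt.Errorf("insufficient capacity: need %d, have %d",
			sizeBytes, capResp.GetAvailableCapacity())
	}

	createResp, err := ctrl.CreateVolume(ctx, &csi.CreateVolumeRequest{
		Name:               name,
		CapacityRange:      &csi.CapacityRange{RequiredBytes: sizeBytes},
		VolumeCapabilities: caps,
		Parameters:         params,
	})
	if err != nil {
		return nil, err
	}
	return createResp.GetVolume(), nil
}
```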
I understand that the two are related in theory. Which parameters matter in practice is what I am trying to find out.
What about VolumeCapability.access_mode? That is also a required field within each VolumeCapability in GetCapacityRequest.
Regarding block vs. filesystem and what the filesystem type is: I agree, this may be useful. The problem is just that in Kubernetes, the choice between block and filesystem is made in the volume claim. At that point, the usefulness of GetCapacity is pretty limited for Kubernetes because CreateVolume might as well just be called directly. I can imagine a scenario where some other CO sends multiple GetCapacity requests to different controllers before picking one for the actual CreateVolume call, but that's not how Kubernetes works. What Kubernetes needs is an estimate that ignores such details.
Speaking of COs, which CO currently uses GetCapacity, and how? This is also relevant in the context of the other issue #432 that I filed about unclear GetCapacity semantic.
@pohly
Speaking of COs, which CO currently uses GetCapacity, and how? This is also relevant in the context of the other issue #432 that I filed about unclear GetCapacity semantic.
Mesos currently uses GetCapacity to determine the capacity for each user-specified storage profile (a combination of parameters and volume capabilities), and the scheduler uses that information to make scheduling decisions (e.g., do not schedule a task that requires a 500G lvm raid1 disk on a node that only has 300G of raid1 capacity).
We also had the "enumeration" problem you mentioned above at that time. We solved it by only checking the capacities that users care about. Translated into Kubernetes concepts, that is similar to calculating the capacity for each "parameters" + "volume_capability" combination observed in PVCs in the system.
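A minimal sketch of that per-profile enumeration, assuming a hypothetical profile type built from the observed combinations (all names are illustrative, not from Mesos or Kubernetes):

```go
package co

import (
	"context"

	"github.com/container-storage-interface/spec/lib/go/csi"
)

// profile is a hypothetical key for one "parameters" + "volume_capability"
// combination observed in the system (e.g. collected from PVCs).
type profile struct {
	params map[string]string
	caps   []*csi.VolumeCapability
}

// capacityPerProfile issues one GetCapacity call per observed profile, so a
// scheduler can compare each profile's capacity against pending requests.
func capacityPerProfile(ctx context.Context, ctrl csi.ControllerClient,
	profiles []profile) ([]int64, error) {

	capacities := make([]int64, len(profiles))
	for i, p := range profiles {
		resp, err := ctrl.GetCapacity(ctx, &csi.GetCapacityRequest{
			VolumeCapabilities: p.caps,
			Parameters:         p.params,
		})
		if err != nil {
			return nil, err
		}
		capacities[i] = resp.GetAvailableCapacity()
	}
	return capacities, nil
}
```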
Regarding some of the issues you mentioned in #432, we kind of work around them by calling GetCapacity again after a volume is created to get the updated capacity from the driver (obviously, this has race conditions).
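In code, that refresh pattern might look like the sketch below; the cache and its key are hypothetical, and the race is that another CreateVolume or DeleteVolume can land between the create and the re-query:

```go
package co

import (
	"context"

	"github.com/container-storage-interface/spec/lib/go/csi"
)

// refreshCapacity re-queries the driver after a successful CreateVolume so a
// cached number reflects the allocation. This is inherently racy: other
// volumes may be created or deleted between the two calls.
func refreshCapacity(ctx context.Context, ctrl csi.ControllerClient,
	caps []*csi.VolumeCapability, params map[string]string,
	cache map[string]int64, key string) error {

	resp, err := ctrl.GetCapacity(ctx, &csi.GetCapacityRequest{
		VolumeCapabilities: caps,
		Parameters:         params,
	})
	if err != nil {
		return err
	}
	cache[key] = resp.GetAvailableCapacity()
	return nil
}
```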
What about VolumeCapability.access_mode?
I don't have a concrete example for VolumeCapability.access_mode, only hypothetical cases where a driver might report different sizes for SINGLE_NODE_WRITER and MULTI_NODE_MULTI_WRITER volumes that it supports.
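Purely to illustrate that hypothetical, a driver could branch on the access mode like this (the pool split and every name here are invented, not from any real driver):

```go
package driver

import (
	"github.com/container-storage-interface/spec/lib/go/csi"
)

// capacityFor is purely illustrative: sharedPoolBytes is the space usable for
// MULTI_NODE_MULTI_WRITER volumes, totalFreeBytes for everything else.
func capacityFor(req *csi.GetCapacityRequest, sharedPoolBytes, totalFreeBytes int64) int64 {
	for _, vc := range req.GetVolumeCapabilities() {
		am := vc.GetAccessMode()
		if am != nil && am.GetMode() == csi.VolumeCapability_AccessMode_MULTI_NODE_MULTI_WRITER {
			// Hypothetical: only a dedicated shared pool can back volumes
			// that are written from multiple nodes at once.
			return sharedPoolBytes
		}
	}
	return totalFreeBytes
}
```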