spec icon indicating copy to clipboard operation
spec copied to clipboard

NodeGetVolumeStats and NodeExpandVolume should tell SP if the path is staging or publish

Open bswartz opened this issue 5 years ago • 5 comments

Because the usage of the staging path and the publish path tends to be rather different for SPs that implement NodeStage/Unstage, it can be difficult to implement NodeGetVolumeStats and NodeExpandVolume properly, due to the fact that the specification says the CO can choose either path to pass in.

For SPs that support stage, the SP is forced to evaluate 3 possibilities each time NodeGetVolumeStats or NodeExpandVolume is called:

  1. The path might be a valid staging path
  2. The path might be a valid publish path
  3. The path might be invalid Reliably making this determination is tough given that the contents of the directory are unpredictable.

There are good reasons to allow the CO to make the decision to pass either the staging or publish path, but it would help a lot if the CO told the SP which path it was passing in. Ideally, the CO would be allowed to pass both paths, but only 1 would be required. Alternatively, the CO could pass a boolean indicating which path. For backwards compatibility reasons though, it's probably best to pass in an enum where the default value (0) means unspecified, 1 means staging and 2 means publish. Then SPs could simply fail on the 0 case and implement only the other 2 cases, which would simplify error checking on the SP side significantly.

bswartz avatar May 01 '19 17:05 bswartz

Why does SP needs to evaluate the path at all? Can you explain the reasoning little bit?

Or to rephrase - can SP work with given path in these API calls and provided path is not invalid - return correct response without determining if given path was staging or publish path? We have implemented this successfully in EBS driver, so I am trying to understand why evaluating the path for staging or publish stage is necessary.

gnufied avatar May 02 '19 00:05 gnufied

In my prototyping CSI plugin, I support both NFS and iSCSI volumes in the same plugin, where the protocol is just one of the options controlled by parameters to CreateVolume. The resize workflow is a bit different for NFS and iSCSI volumes, and it's also different for filesystem and raw block volumes.

At the point of NodeExpandVolume, I need to know the volume protocol and volume access type to know whether I should be resizing any filesystems, rescanning any SCSI buses, or poking any loopback devices.

With these requirements in mind, I've been careful to store all the necessary data in the staging and publish paths so I can do the correct thing in every case. However, due to the CSI spec's expectations about staging paths and publish paths, I've had to structure the data differently. Because I don't know which path I'm getting, I need to execute a bunch of checks on the supplied path to narrow down the possibilities for what it might be until I either determine which case I'm dealing with, or I rule out all the possible cases and realize I've been handed an invalid value.

The hardest case is a filesystem publish path, because that's expected to be the literal mount point, so when I look at it's contents, it contains actual user data which can confuse my checks if the user data looks like the files my CSI plugin would have created in the other 3 cases (a malicious attacker could do this intentionally, so I guard against it). To deal with this I have to play the same tricks kubelet plays and implement an isLikelyNotMountPoint() method to decide if it's safe to even look inside the directory.

If the CO told me what type of path it was giving me, I would remove about 1/3-1/2 of the checks I need to do, and cut down on error cases significantly. Of course I would still have to figure out blocks vs filesystem and iSCSI vs NFS, but there are reliable ways of doing that if I know I have a staging path, and if I have a publish path it's more work but it still a lot easier than what I do today.

bswartz avatar May 02 '19 14:05 bswartz

I forgot to mention that NodeGetVolumeStats() has similar issues with iSCSI vs NFS and blocks vs filesystem, where the SP needs to handle each case differently.

Doing something as simple as making sure the path matches the volume ID (to guard against the case where the CO passes in a valid path but it's to the wrong volume) is tricky when the publish and staging paths have completely different structures.

bswartz avatar May 02 '19 14:05 bswartz

Thanks for clarifying this @bswartz . I have been thinking some more about it and for NodeGetVolumeStats call at least - CO must keep track of the information that path being given to the workload is a staging path or publish path, because that is the path which is being passed by CO (at least in case of Kubernetes) to NodeGetVolumeStats call. In case of k8s at least we do have that information.

It may depend on how a CO uses NodeGetVolumeStats of course, but NodeGetVolumeStats in particular is interesting because it is not necessarily associated with volume lifecycle RPC calls and can be called asyncrhonously by prometheus/statsd etc.

gnufied avatar May 10 '19 20:05 gnufied

While implementing NodeExpand it was found that it is useful to know that the VolumePath in the NodeExpandVolumeRequest is actually the staging_target_path. In our implementation, the staging path is needed to recover information needed to scan the specific LUN for a target after resizing the filesystem. It would be good if the CSI spec stated that the staging_target_path is passed as the VolumePath in the NodeExpandVolumeRequest for plugins that support the STAGE_UNSTAGE_VOLUME capability.

gnarl avatar Aug 22 '19 20:08 gnarl