gcp-compute-persistent-disk-csi-driver

Sync Mount/Flush Block Device on NodeUnstage

Open pwschuurman opened this issue 10 months ago • 2 comments

The NodeUnpublishVolume -> NodeUnstageVolume -> ControllerUnpublishVolume flow does not explicitly perform a filesystem sync on Linux. The driver calls umount on the staged block device path. If the block device is not mounted anywhere else in the OS, the unmount implicitly syncs the filesystem, and files are correctly flushed to the disk.

If the filesystem is still mounted elsewhere in the system (for example, via a private mount in an alternate mount namespace), removing the driver's mount may not cause the block device to be flushed and unmounted. As a result, ControllerUnpublishVolume may proceed with removing the block device from the VM while block writes are still pending at the kernel layer (cached at the filesystem).

This is already done explicitly on Windows (https://github.com/kubernetes-csi/csi-proxy/blob/ba9a0a80d071f1ad4650d84abaea043b08e6496e/pkg/os/volume/api.go#L141), but not on Linux.

Two options:

  1. Explicitly call sync on the underlying mount prior to unmounting in NodeUnstage
  2. Flush the block device with blockdev --flushbufs during NodeUnstage

pwschuurman, Mar 26 '24 18:03