[BUG] NodeUnpublishVolume fail
What is your environment (Kubernetes version, Fluid version, etc.)? Kubernetes v1.22.5, Fluid master
Describe the bug
csi-nodeplugin-fluid-xxx
I tested FUSE recovery, and some corrupted mount points were generated.
When I delete the application pod, I find that deletion takes a long time.
Some unexpected logs were found:
nodeserver.go fails to clean up corrupted mount points when NodeUnpublishVolume is invoked.
For a corrupted mount point, the local variable 'notMount' is true, which breaks out of the unmount loop, and the subsequent CleanupMountPoint then fails. Maybe we should set 'notMount' to false in this case so that all corrupted mount points would be unmounted.
```go
func (ns *nodeServer) NodeUnpublishVolume(ctx context.Context, req *csi.NodeUnpublishVolumeRequest) (*csi.NodeUnpublishVolumeResponse, error) {
	.....
	mounter := mount.New("")
	for {
		notMount, err := mounter.IsLikelyNotMountPoint(targetPath)
		.....
		if err != nil {
			if !mount.IsCorruptedMnt(err) {
				// stat targetPath with unexpected error
				glog.Errorf("NodeUnpublishVolume: stat targetPath %s with error: %v", targetPath, err)
				return nil, status.Errorf(codes.Internal, "NodeUnpublishVolume: stat targetPath %s: %v", targetPath, err)
			} else {
				// targetPath is corrupted
				glog.V(3).Infof("NodeUnpublishVolume: detected corrupted mountpoint on path %s with error %v", targetPath, err)
			}
		}
		if notMount {
			glog.V(3).Infof("NodeUnpublishVolume: umount %s success", targetPath)
			break
		}
		...
		err = mounter.Unmount(targetPath)
		...
	}
	...
	err = mount.CleanupMountPoint(targetPath, mounter, false)
	...
}
```
What you expect to happen: All corrupted mount points should be unpublished successfully.
How to reproduce it
Additional Information
Hi @maimuderizi, which Fluid version are you using in your cluster? From the log screenshot, I guess it is a Fluid v0.9.x version? There are some bugs in Fluid v0.9.x affecting the FUSE Recovery feature, and they were fixed in Fluid v1.0.0.