resource-agents
RA Filesystem does not umount OCFS2 when a VirtualDomain has access
Hi, I'm struggling with the Filesystem RA. I have run into the same problem several times: it couldn't umount an OCFS2 partition, so it failed, and that node got fenced. I had a look into it and I think I found the reason. The script uses "fuser -m /mnt/ocfs2" to find the processes that have access to that mountpoint. I currently have a VirtualDomain running whose raw file resides there.
```
ha-idg-2:/mnt/share # lsof|grep /mnt/ocfs2
qemu-syst 1127 qemu 13ur REG 254,15 171798691840 583946 /mnt/ocfs2/idcc_devel.raw
qemu-syst 1127 qemu 14ur REG 254,15 171798691840 583946 /mnt/ocfs2/idcc_devel.raw
```
But fuser does not show this process:
```
ha-idg-2:/mnt/share # fuser -m /mnt/ocfs2/
ha-idg-2:/mnt/share #
```
So the script does not get a PID it could kill, umount is not possible, the RA fails and the node gets fenced.
Is my understanding correct? Is that a bug or am I missing something?
System is SLES 12 SP4:
ha-idg-2:/mnt/share # rpm -q resource-agents
resource-agents-4.3.018.a7fb5035-3.25.1.x86_64
Bernd
Sorry for the bad formatting. I will learn. Bernd
No worries. You can also edit it and use the Preview tab to see if it's looking as expected.
In this case you probably want ``` on the line before and the line after your command/output block.
That's strange. Sounds like fuser somehow doesn't detect the processes as using files on that specific mount.
You could try setting force_unmount=safe to see if that solves the issue (it finds the processes via /proc instead of using the fuser command).
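To illustrate what "finding the processes via /proc" means, here is a minimal shell sketch. It is not the RA's actual code; the function name `pids_using_dir` is made up for this example, and it only looks at open file descriptors, cwd, root and exe links (not memory mappings).

```shell
#!/bin/sh
# Hypothetical helper, not part of resource-agents:
# print the PIDs of processes that hold an open file, cwd, root or exe
# at or below the given directory, by reading the symlinks in /proc.
pids_using_dir() {
    dir=${1%/}    # drop one trailing slash so "/mnt/ocfs2/" also works
    for link in /proc/[0-9]*/fd/* /proc/[0-9]*/cwd /proc/[0-9]*/root /proc/[0-9]*/exe; do
        # Unreadable or vanished entries make readlink fail; skip them.
        target=$(readlink "$link" 2>/dev/null) || continue
        case "$target" in
            "$dir"|"$dir"/*)
                pid=${link#/proc/}     # "1127/fd/13"
                echo "${pid%%/*}"      # "1127"
                ;;
        esac
    done | sort -un
}
```

Calling `pids_using_dir /mnt/ocfs2` as root should list a process such as the qemu one from the lsof output above, since its open file descriptors point at files under that directory.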
I know for a fact that fuser will not report an NFS share that is defined on an existing mount. The device will be in use and cannot be unmounted, but you will not see it in fuser.
Hi oalbright, that solved the problem. Thank you!
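For anyone landing here with the same problem: assuming the cluster is managed with crmsh (as on SLES), the parameter can be set on an existing Filesystem resource roughly like this. The resource name `fs_ocfs2` is a placeholder, not the name used in this cluster.

```
# fs_ocfs2 is a hypothetical resource name; replace it with your Filesystem primitive
crm resource param fs_ocfs2 set force_unmount safe
# check the value afterwards
crm resource param fs_ocfs2 show force_unmount
```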
Bernd
Great. Glad to hear it.