longhorn-manager icon indicating copy to clipboard operation
longhorn-manager copied to clipboard

fix(csi): add node info to failed mount logging.

Open james-munson opened this issue 1 year ago • 2 comments
trafficstars

Which issue(s) this PR fixes:

Issue # https://github.com/longhorn/longhorn/issues/7931

What this PR does / why we need it:

Log with kernel release and OS info when an NFS mount fails, to aid in troubleshooting.

Special notes for your reviewer:

Additional documentation or context

james-munson avatar Mar 09 '24 00:03 james-munson

Right, the first draft of this PR used sys.GetKernelRelease() from go-common-libs for the kernel release. But to get the OS distro was a little trickier. ns.GetOSDistro() does an nsenter using the path /host/proc/ns/1/mnt which exists in a "normal" pod, but not in the longhorn-csi-plugin, so the call fails. There the right path would be /rootfs/proc/ns/1/mnt. So we can't use the ns pkg. But sys.GetOSDistro(osReleaseContent string) isn't really directly usable. It is just a utiilty method for ns.GetOsDistro() which feeds it the contents of the /etc/os-release file to parse.

I figured I would just use the result from uname -v which would be something like

 #74-Ubuntu SMP Wed Feb 22 14:14:39 UTC 2023.

and so I borrowed the parsing code from sys.GetKernelRelease and did the equivalent of uname -rv

If I read the contents of the local /etc/os-release without doing the namespace stuff, and use that to call sys.GetOSDistro(), I'll get the info for the pod itself, which is

longhorn-csi-plugin-bchgv:/ # cat /etc/os-release 
NAME="SLES"
VERSION="15-SP5"
VERSION_ID="15.5"
PRETTY_NAME="SUSE Linux Enterprise Server 15 SP5"
ID="sles"
ID_LIKE="suse"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:suse:sles:15:sp5"
DOCUMENTATION_URL="https://documentation.suse.com/"

which is clearly wrong, so I'll need to do an equivalent nsenter but to the /rootfs/proc path to get this for the host:

root@jbm-u22-pool2-ca5a5aa4-25fzl:~# cat /etc/os-release 
PRETTY_NAME="Ubuntu 22.04.2 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.2 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy

BTW, GetOSDistro returns the value of the ID= tag, but for purposes of this logging, I would think that PRETTY_NAME would be more informative.

james-munson avatar Mar 11 '24 20:03 james-munson

Revised to use the go-common-libs/sys utilities. I had to make a local copy of RunFunc to use the CSI plugin's different location for /proc and some slightly modified flavors of the common-libs/ns utilities.

james-munson avatar Mar 15 '24 22:03 james-munson

One other option: Mount the host's / on /host for the CSI deployment(s). The path /rootfs is set in this code: https://github.com/longhorn/longhorn-manager/blame/103bcaaae5e080dc93af66bf0f8355c31d2f1b0e/csi/deployment.go#L417 and I'm not sure whether the comment is asserting that this specific path is required. It would be handy if it could follow the same pattern as other host root mounts.

It appears that it is, though, looking at, say, https://github.com/kubernetes/utils/blob/4693a0247e57ebeacb9f261f07a03c9ecd569f44/nsenter/nsenter.go#L37 I'm not quite sure of the scope of that requirement.

james-munson avatar Mar 28 '24 20:03 james-munson

One other option: Mount the host's / on /host for the CSI deployment(s). The path /rootfs is set in this code: 103bcaa/csi/deployment.go#L417 (blame) and I'm not sure whether the comment is asserting that this specific path is required. It would be handy if it could follow the same pattern as other host root mounts.

I think we can change to /host directly, since /rootfs was introduced in the older RWX implementation? You can run RWX related test cases to see if there will be any regression.

innobead avatar Mar 29 '24 15:03 innobead

That's right. Manual and regression testing are happy with the change to mount host's / on /host. I have pushed up the change.

james-munson avatar Mar 29 '24 17:03 james-munson

@mergify backport v1.6.x v1.5.x

innobead avatar Mar 29 '24 21:03 innobead

backport v1.6.x v1.5.x

✅ Backports have been created

mergify[bot] avatar Mar 29 '24 21:03 mergify[bot]