scylla-operator

Make sure an xfs mount that fails because it needs repair is manifested on NodeConfig

Open tnozicka opened this issue 1 year ago • 4 comments

What should the feature do?

When a NodeConfig uses a mount whose systemd unit fails, it needs to set a degraded condition.
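A minimal sketch of that mapping, for illustration only. The `UnitStatus` and `Condition` types, the `MountControllerDegraded` condition type and the reason strings are hypothetical names invented here, not the operator's actual API; only the `failed` state and the `exit-code` result come from the systemd output in this issue.

```go
package main

import (
	"fmt"
	"strings"
)

// UnitStatus is a hypothetical, simplified view of a systemd mount unit.
type UnitStatus struct {
	Name        string // e.g. `mnt-persistent\x2dvolumes.mount`
	ActiveState string // systemd ActiveState, e.g. "active" or "failed"
	Result      string // systemd Result, e.g. "success" or "exit-code"
}

// Condition follows the usual Kubernetes condition shape.
// The field set is trimmed down for this sketch.
type Condition struct {
	Type    string
	Status  string
	Reason  string
	Message string
}

// mountDegradedCondition reports Status "True" when at least one
// managed mount unit is in the "failed" state.
// The condition type and reasons are assumptions made for this example.
func mountDegradedCondition(units []UnitStatus) Condition {
	var failed []string
	for _, u := range units {
		if u.ActiveState == "failed" {
			failed = append(failed, fmt.Sprintf("%s (result: %s)", u.Name, u.Result))
		}
	}
	if len(failed) == 0 {
		return Condition{Type: "MountControllerDegraded", Status: "False", Reason: "AsExpected"}
	}
	return Condition{
		Type:    "MountControllerDegraded",
		Status:  "True",
		Reason:  "MountUnitFailed",
		Message: "failed mount units: " + strings.Join(failed, ", "),
	}
}

func main() {
	units := []UnitStatus{
		{Name: `mnt-persistent\x2dvolumes.mount`, ActiveState: "failed", Result: "exit-code"},
	}
	fmt.Printf("%+v\n", mountDegradedCondition(units))
}
```

How the unit state is read (D-Bus, `systemctl show`, or something else) is deliberately left out of the sketch.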

What is the use case behind this feature?

xfs is notorious for breaking on unclean restarts and often needs manual action (running xfs_repair) to recover. We should manifest that in the API so it's observable, along with all the other mount failures.
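One possible way to make the "needs repair" case stand out from other mount failures is to look for the kernel's "Structure needs cleaning" error text in the mount output. The sketch below assumes the operator has access to that output as a string; the `mountFailureReason` helper and both reason names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// mountFailureReason is a hypothetical helper that turns a mount error
// line into a short reason string suitable for a status condition.
// It only recognizes the "Structure needs cleaning" message seen in this
// issue; every other failure falls back to a generic reason.
func mountFailureReason(logLine string) string {
	if strings.Contains(strings.ToLower(logLine), "structure needs cleaning") {
		return "FilesystemNeedsRepair"
	}
	return "MountFailed"
}

func main() {
	line := "mount: /mnt/persistent-volumes: mount(2) system call failed: Structure needs cleaning."
	fmt.Println(mountFailureReason(line))
}
```

String matching is fragile, so this is best read as a hint for the condition message rather than a reliable diagnosis.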

● mnt-persistent\x2dvolumes.mount                                                                                                                        loaded failed failed    Managed mount by Scylla Operator
root@ubuntu-2204:~# systemctl status mnt-persistent\x2dvolumes.mount
Unit mnt-persistentx2dvolumes.mount could not be found.
root@ubuntu-2204:~# journal^C
root@ubuntu-2204:~# systemctl status 'mnt-persistent\x2dvolumes.mount'
× mnt-persistent\x2dvolumes.mount - Managed mount by Scylla Operator
     Loaded: loaded (/etc/systemd/system/mnt-persistent\x2dvolumes.mount; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Fri 2023-11-10 18:50:29 UTC; 29s ago
      Where: /mnt/persistent-volumes
       What: /dev/md127
        CPU: 5ms

Nov 10 18:50:29 ubuntu-2204 systemd[1]: Mounting Managed mount by Scylla Operator...
Nov 10 18:50:29 ubuntu-2204 systemd[1]: mnt-persistent\x2dvolumes.mount: Mount process exited, code=exited, status=32/n/a
Nov 10 18:50:30 ubuntu-2204 mount[3469]: mount: /mnt/persistent-volumes: mount(2) system call failed: Structure needs cleaning.
Nov 10 18:50:29 ubuntu-2204 systemd[1]: mnt-persistent\x2dvolumes.mount: Failed with result 'exit-code'.
Nov 10 18:50:29 ubuntu-2204 systemd[1]: Failed to mount Managed mount by Scylla Operator.
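The unit name in the output above is the mount path `/mnt/persistent-volumes` with systemd's path escaping applied, which is why the `-` shows up as `\x2d`. The following is a simplified sketch of that escaping, similar in spirit to `systemd-escape --path`; the `escapePath` function is written for this example and does not normalize repeated slashes or `..` segments the way systemd does.

```go
package main

import (
	"fmt"
	"strings"
)

// isPlain reports whether a byte is kept as-is in a unit name:
// ASCII letters, digits, ':', '_' and '.'.
func isPlain(c byte) bool {
	return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
		(c >= '0' && c <= '9') || c == ':' || c == '_' || c == '.'
}

// escapePath is a simplified take on systemd path escaping:
// leading and trailing slashes are dropped, '/' becomes '-', and any
// other byte outside the plain set is written as \xNN.
// A leading '.' is escaped as well. The root path maps to "-".
func escapePath(path string) string {
	trimmed := strings.Trim(path, "/")
	if trimmed == "" {
		return "-"
	}
	var b strings.Builder
	for i := 0; i < len(trimmed); i++ {
		c := trimmed[i]
		switch {
		case c == '/':
			b.WriteByte('-')
		case c == '.' && i == 0:
			fmt.Fprintf(&b, `\x%02x`, c)
		case isPlain(c):
			b.WriteByte(c)
		default:
			fmt.Fprintf(&b, `\x%02x`, c)
		}
	}
	return b.String()
}

func main() {
	fmt.Println(escapePath("/mnt/persistent-volumes") + ".mount")
}
```

Because the escaped name contains a backslash, it has to be quoted on the shell, as the transcript shows: the unquoted `systemctl status` call loses the backslash and reports the unit as not found, while the single-quoted one works.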

Anything else we need to know?

No response

tnozicka, Nov 10 '23 18:11