io_uring errors when starting glusterd
Description of problem:
We have a three-server glusterfs setup. When starting glusterd, the service frequently fails to start with the following error logged:
C [gf-io-uring.c:612:gf_io_uring_cq_process_some] (-->/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x7ff76) [0x7f194fc22f76] -->/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x8bf15) [0x7f194fc2ef15] -->/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x8bdd5) [0x7f194fc2edd5] ) 0-: Assertion failed:
The service will typically start and run after several attempts. It will then run stably for about two weeks before crashing.
All three servers are identical, down to the BIOS versions.
The exact command to reproduce the issue:
$ sudo systemctl start glusterd
The full output of the command that failed:
Job for glusterd.service failed because the control process exited with error code.
See "systemctl status glusterd.service" and "journalctl -xeu glusterd.service" for details.
On running journalctl -xeu glusterd.service this is the output:
Jul 31 14:53:18 srv-003 glusterd[1582227]: [2023-07-31 14:53:18.894347 +0000] C [gf-io-uring.c:612:gf_io_uring_cq_process_some] (-->/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x7ff76) [0x7f194fc22f76] -->/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x8bf15) [0x7f194fc2ef15] -->/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x8bdd5) [0x7f194fc2edd5] ) 0-: Assertion failed:
Jul 31 14:53:18 srv-003 glusterd[1582227]: pending frames:
Jul 31 14:53:18 srv-003 glusterd[1582227]: patchset: git://git.gluster.org/glusterfs.git
Jul 31 14:53:18 srv-003 glusterd[1582227]: signal received: 6
Jul 31 14:53:18 srv-003 glusterd[1582227]: time of crash:
Jul 31 14:53:18 srv-003 glusterd[1582227]: 2023-07-31 14:53:18 +0000
Jul 31 14:53:18 srv-003 glusterd[1582227]: configuration details:
Jul 31 14:53:18 srv-003 glusterd[1582227]: argp 1
Jul 31 14:53:18 srv-003 glusterd[1582227]: backtrace 1
Jul 31 14:53:18 srv-003 glusterd[1582227]: dlfcn 1
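When the unit fails like this, running the daemon in the foreground with debug logging can capture the lines leading up to the assertion. A minimal sketch, assuming the packaged glusterd still accepts the standard --debug flag (which runs it in the foreground and logs to the console at DEBUG level):
# stop the failed unit first so its ports and pid file are released
$ sudo systemctl stop glusterd
# run glusterd in the foreground with debug logging to the console
$ sudo glusterd --debug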
Expected results: No output and glusterd running
Mandatory info:
- The output of the gluster volume info command:
Volume Name: vol03
Type: Distributed-Disperse
Volume ID: 49f0d0cd-3335-4e08-ae1e-fb56d2a7d685
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: srv-001:/srv/glusterfs/vol03/brick0
Brick2: srv-002:/srv/glusterfs/vol03/brick0
Brick3: srv-003:/srv/glusterfs/vol03/brick0
Options Reconfigured:
performance.cache-size: 1GB
storage.linux-io_uring: off
server.event-threads: 4
client.event-threads: 4
performance.write-behind: off
performance.parallel-readdir: on
performance.readdir-ahead: on
performance.nl-cache-timeout: 600
performance.nl-cache: on
network.inode-lru-limit: 200000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
performance.cache-samba-metadata: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
storage.fips-mode-rchecksum: on
transport.address-family: inet
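For reference, volume options like the ones above are set and inspected with the standard gluster CLI; a minimal example against this volume. Note that storage.linux-io_uring is already off here, while the assertion above comes from the gf-io-uring code in libglusterfs itself, which this volume-level option does not appear to control:
# set / verify a single volume option
$ sudo gluster volume set vol03 storage.linux-io_uring off
$ sudo gluster volume get vol03 storage.linux-io_uring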
- The output of the gluster volume status command:
** This is after the glusterd service has successfully started and is running!
Gluster process                              TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick srv-001:/srv/glusterfs/vol03/brick0    54477     0          Y       5564
Brick srv-002:/srv/glusterfs/vol03/brick0    58095     0          Y       4288
Brick srv-003:/srv/glusterfs/vol03/brick0    50589     0          Y       5319
Self-heal Daemon on localhost                N/A       N/A        Y       1582991
Self-heal Daemon on srv-002                  N/A       N/A        Y       4323
Self-heal Daemon on srv-001                  N/A       N/A        Y       7260
Task Status of Volume vol03
------------------------------------------------------------------------------
There are no active volume tasks
- The output of the gluster volume heal command:
Brick srv-001:/srv/glusterfs/vol03/brick0
Status: Connected
Number of entries: 0
Brick srv-002:/srv/glusterfs/vol03/brick0
Status: Connected
Number of entries: 0
Brick srv-003:/srv/glusterfs/vol03/brick0
Status: Connected
Number of entries: 0
- Provide logs present on the following locations of client and server nodes: /var/log/glusterfs/
- Is there any crash? Provide the backtrace and coredump
Not sure how to do this; happy to provide it if someone can point me in the right direction for what is needed.
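A sketch of one way to collect that, assuming systemd-coredump is handling core dumps (on Ubuntu it can be installed with apt, which replaces apport's core handler):
# install the coredump handler if it is not already present
$ sudo apt install systemd-coredump
# after the next crash, list captured glusterd cores and open the most
# recent one in gdb to extract a backtrace
$ sudo coredumpctl list glusterd
$ sudo coredumpctl debug glusterd
(gdb) thread apply all bt full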
Additional info:
Each server has mostly identical hardware composed of the following:
CPU: AMD Ryzen 7 5700G
RAM: two servers have 16 GB and one has 32 GB (this is the only variance)
Storage:
- 2x NVMe drives per server
- 2 TB Samsung 970 EVO Plus
The entire storage stack:
- EFI partition table per drive
- primary drive is boot drive
- primary drive has 1.8T LVM partition (after system boot portions)
- second drive has matching 1.8T LVM partition
- 1x volume group contains these partitions
- A 1.15TiB logical volume in LVM RAID 1 across the two drives hosts the gluster brick on each server
- LV is encrypted using cryptsetup with LUKS
- encrypted LV is then mounted using /dev/mapper
- the encrypted partition is then formatted with an XFS file system
- this is then hosted using glusterd
This is a complex setup driven by a client's security policies, though the RAID setup can be removed.
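For illustration only, the stack above could be assembled roughly like this; the vg0 / lv_gluster / gluster_brick names, the size and the mount point are placeholders, not the real configuration:
# mirrored LV across the two NVMe PVs in the volume group
$ sudo lvcreate --type raid1 -m 1 -L 1.15T -n lv_gluster vg0
# LUKS-encrypt the LV and open it under /dev/mapper
$ sudo cryptsetup luksFormat /dev/vg0/lv_gluster
$ sudo cryptsetup open /dev/vg0/lv_gluster gluster_brick
# format the mapped device with XFS and mount it where the brick lives
$ sudo mkfs.xfs /dev/mapper/gluster_brick
$ sudo mount /dev/mapper/gluster_brick /srv/glusterfs/vol03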
- The operating system / glusterfs version:
# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 23.04
Release: 23.04
Codename: lunar
# glusterfs --version
glusterfs 11.0
Repository revision: git://git.gluster.org/glusterfs.git
Copyright (c) 2006-2016 Red Hat, Inc. <https://www.gluster.org/>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
It is licensed to you under your choice of the GNU Lesser
General Public License, version 3 or any later version (LGPLv3
or later), or the GNU General Public License, version 2 (GPLv2),
in all cases as published by the Free Software Foundation.
Last week we were running glusterfs 10.4 with exactly the same issues. We upgraded to 11.0 this weekend to see if that would provide a fix; there has been no change in behavior.
I have a hunch the issue may be related to the LVM2 configuration. I am currently putting together a plan to take each server offline and remove the LVM2 part of the configuration to see if that mitigates the crashes.
The glusterd failure appears to be random and not related to load; it has happened both under load and at no load.
FWIW, I can reproduce this on Ubuntu 22.04.4 LTS using ZFS as the filesystem. Interestingly, I see it as soon as glusterd starts, before any volume has been created. While my output is similar, it has a bit more information.
May 13 15:01:21 hio-4 systemd[1]: Starting GlusterFS, a clustered file-system server...
░░ Subject: A start job for unit glusterd.service has begun execution
░░ Defined-By: systemd
░░ Support: http://www.ubuntu.com/support
░░
░░ A start job for unit glusterd.service has begun execution.
░░
░░ The job identifier is 187.
May 13 15:01:21 hio-4 glusterd[1705]: [2024-05-13 15:01:21.846786 +0000] C [gf-io-uring.c:612:gf_io_uring_cq_process_some] (-->/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x7f776) [0x7f849155f776] -->/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x8ba75) [0x7f849156ba75] -->/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x8b935) [0x7f849156b935] ) 0-: Assertion failed:
May 13 15:01:21 hio-4 glusterd[1705]: pending frames:
May 13 15:01:21 hio-4 glusterd[1705]: patchset: git://git.gluster.org/glusterfs.git
May 13 15:01:21 hio-4 glusterd[1705]: signal received: 6
May 13 15:01:21 hio-4 glusterd[1705]: time of crash:
May 13 15:01:21 hio-4 glusterd[1705]: 2024-05-13 15:01:21 +0000
May 13 15:01:21 hio-4 glusterd[1705]: configuration details:
May 13 15:01:21 hio-4 glusterd[1705]: argp 1
May 13 15:01:21 hio-4 glusterd[1705]: backtrace 1
May 13 15:01:21 hio-4 glusterd[1705]: dlfcn 1
May 13 15:01:21 hio-4 glusterd[1705]: libpthread 1
May 13 15:01:21 hio-4 glusterd[1705]: llistxattr 1
May 13 15:01:21 hio-4 glusterd[1705]: setfsid 1
May 13 15:01:21 hio-4 glusterd[1705]: epoll.h 1
May 13 15:01:21 hio-4 glusterd[1705]: xattr.h 1
May 13 15:01:21 hio-4 glusterd[1705]: st_atim.tv_nsec 1
May 13 15:01:21 hio-4 glusterd[1705]: package-string: glusterfs 11.0
May 13 15:01:21 hio-4 glusterd[1705]: ---------
May 13 15:01:22 hio-4 systemd[1]: glusterd.service: Control process exited, code=exited, status=1/FAILURE
░░ Subject: Unit process exited
░░ Defined-By: systemd
░░ Support: http://www.ubuntu.com/support
░░
░░ An ExecStart= process belonging to unit glusterd.service has exited.
░░
░░ The process' exit code is 'exited' and its exit status is 1.
May 13 15:01:22 hio-4 systemd[1]: glusterd.service: Failed with result 'exit-code'.
░░ Subject: Unit failed
░░ Defined-By: systemd
░░ Support: http://www.ubuntu.com/support
░░
░░ The unit glusterd.service has entered the 'failed' state with result 'exit-code'.
May 13 15:01:22 hio-4 systemd[1]: Failed to start GlusterFS, a clustered file-system server.
We upgraded to Ubuntu 23.10 and it appears to have resolved this issue; we have not had any io_uring errors in quite a few months. We will be upgrading to the new 24.04 in short order, but the gluster-11 PPA has a broken package dependency on 24.04, which requires compiling from source.
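For anyone else who ends up building 11.x from source on 24.04, the upstream autotools flow is roughly the following (the tag name is illustrative, and the usual build dependencies such as the userspace-rcu and libuuid development headers need to be installed first):
$ git clone https://github.com/gluster/glusterfs.git
$ cd glusterfs && git checkout v11.1
# generate configure, then build and install
$ ./autogen.sh && ./configure
$ make -j"$(nproc)" && sudo make install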
I'm facing the same issue.
└❯ cat -p /etc/os-release
NAME="Fedora Linux"
VERSION="40 (Server Edition)"
ID=fedora
VERSION_ID=40
VERSION_CODENAME=""
PLATFORM_ID="platform:f40"
PRETTY_NAME="Fedora Linux 40 (Server Edition)"
ANSI_COLOR="0;38;2;60;110;180"
LOGO=fedora-logo-icon
CPE_NAME="cpe:/o:fedoraproject:fedora:40"
HOME_URL="https://fedoraproject.org/"
DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora/f40/system-administrators-guide/"
SUPPORT_URL="https://ask.fedoraproject.org/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=40
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=40
SUPPORT_END=2025-05-13
VARIANT="Server Edition"
VARIANT_ID=server
└❯ hostnamectl
Static hostname: xxx
Icon name: computer-desktop
Chassis: desktop 🖥
Machine ID: a9da64f824644675b21dbccbb74de75a
Boot ID: afb3c10113ce4b88aa9e4420e833f0d6
Operating System: Fedora Linux 40 (Server Edition)
CPE OS Name: cpe:/o:fedoraproject:fedora:40
OS Support End: Tue 2025-05-13
OS Support Remaining: 6month 3w
Kernel: Linux 6.10.12-200.fc40.x86_64
Architecture: x86-64
Hardware Vendor: Dell Inc.
Hardware Model: Precision 3460
Firmware Version: 2.7.0
Firmware Date: Tue 2023-07-11
Firmware Age: 1y 3month 1w 4d
└❯ sudo ausearch -m avc -ts recent
----
time->Sun Oct 20 16:07:01 2024
type=AVC msg=audit(1729433221.887:246904): avc: denied { map } for pid=1824851 comm="glusterfsd" path="anon_inode:[io_uring]" dev="anon_inodefs" ino=5668461 scontext=system_u:system_r:glusterd_t:s0 tcontext=system_u:object_r:io_uring_t:s0 tclass=anon_inode permissive=0
----
time->Sun Oct 20 16:07:02 2024
type=AVC msg=audit(1729433222.001:246905): avc: denied { map } for pid=1824873 comm="glusterfs" path="anon_inode:[io_uring]" dev="anon_inodefs" ino=5668533 scontext=system_u:system_r:glusterd_t:s0 tcontext=system_u:object_r:io_uring_t:s0 tclass=anon_inode permissive=0
----
time->Sun Oct 20 16:07:03 2024
type=AVC msg=audit(1729433223.018:246910): avc: denied { map } for pid=1824886 comm="glusterfs" path="anon_inode:[io_uring]" dev="anon_inodefs" ino=5668584 scontext=system_u:system_r:glusterd_t:s0 tcontext=system_u:object_r:io_uring_t:s0 tclass=anon_inode permissive=0
----
time->Sun Oct 20 16:07:04 2024
type=AVC msg=audit(1729433224.005:246912): avc: denied { map } for pid=1824907 comm="glusterfs" path="anon_inode:[io_uring]" dev="anon_inodefs" ino=5670311 scontext=system_u:system_r:glusterd_t:s0 tcontext=system_u:object_r:io_uring_t:s0 tclass=anon_inode permissive=0
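Those denials show SELinux blocking the gluster processes from mapping the io_uring anon inode, which would explain io_uring setup failing. One way to confirm this and, if acceptable under local policy, work around it is to test in permissive mode and then build a local policy module from the recorded denials (the module name below is arbitrary):
# temporarily switch to permissive mode to confirm SELinux is the cause
$ sudo setenforce 0
# if that helps, generate and load a local policy module from the denials
$ sudo ausearch -m avc -ts recent --raw | audit2allow -M glusterd-iouring
$ sudo semodule -i glusterd-iouring.pp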
P.S. In my case, the volume gets created and everything looks healthy, but replication does not work.
└❯ sudo gluster volume info xxx
Volume Name: xxx
Type: Replicate
Volume ID: 788dbd9e-f2db-426b-b374-f976d05e0377
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: xxx1:/data/brick1/actually-zemfira
Brick2: xxx2:/data/brick1/actually-zemfira
Options Reconfigured:
features.scrub: Active
features.bitrot: on
cluster.granular-entry-heal: on
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off