
Segfault when accessing files via libgfapi on pure distributed volumes


Description of problem: When a gluster volume of type DISTRIBUTED is created and a non-preallocated qcow2 image is then created on it (via a fuse mount), trying to start a VM with this vdisk attached via libgfapi, or running qemu-img info gluster:///...qcow2, causes a segmentation fault.

The exact command to reproduce the issue:

qemu-img info gluster:///
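
For completeness, a fuller reproduction sequence might look like the following. Host names, brick paths, and the image name are placeholders; the image size is one of the known-bad values from the follow-up comment below:

# Hypothetical hosts/paths; a plain distribute volume, as in 'gluster volume info' below
gluster volume create d4 host1:/bricks/b1 host2:/bricks/b2 host3:/bricks/b3 host4:/bricks/b4
gluster volume start d4
mount -t glusterfs host1:/d4 /mnt/d4

# Non-preallocated qcow2 created over the FUSE mount; the virtual size is NOT a multiple of 4096
qemu-img create -f qcow2 /mnt/d4/test.qcow2 1073743872

# This then segfaults:
qemu-img info gluster://host1/d4/test.qcow2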

The full output of the command that failed: Segmentation fault. GDB output:

(gdb) where

#0  0x00007ffff6ac4cb8 in FRAME_DESTROY (frame=0x55555586e119) at ./glusterfs/stack.h:161
#1  STACK_DESTROY (stack=0x55555586b0a8) at ./glusterfs/stack.h:194
#2  syncop_seek (subvol=subvol@entry=0x7fffe4024cf0, fd=fd@entry=0x55555584eb88, offset=offset@entry=196616,
    what=what@entry=GF_SEEK_DATA, xdata_in=xdata_in@entry=0x0, off=off@entry=0x7fffffffe3c8) at syncop.c:3090
#3  0x00007ffff6b94cf6 in glfs_seek (whence=3, offset=196616, glfd=0x55555586d8e0) at glfs-fops.c:936
#4  pub_glfs_lseek (glfd=0x55555586d8e0, offset=196616, whence=3) at glfs-fops.c:988
#5  0x00007ffff7fc5e19 in ?? () from /usr/lib/x86_64-linux-gnu/qemu/block-gluster.so
#6  0x00005555555cd96a in bdrv_open_driver (bs=bs@entry=0x555555795510, drv=drv@entry=0x7ffff7fca9e0,
    node_name=<optimized out>, options=options@entry=0x5555557a1c20, open_flags=20480,
    errp=errp@entry=0x7fffffffe5c0) at ../../block.c:1552
#7  0x00005555555d0e8f in bdrv_open_common (errp=0x7fffffffe5c0, options=0x5555557a1c20, file=0x0, bs=0x555555795510)
    at ../../block.c:1827
#8  bdrv_open_inherit (filename=<optimized out>,
    filename@entry=0x7fffffffeca9 "gluster:///d4/d4/5be9fb50-0438-4e1e-be35-cfc3d82945ca.qcow2",
    reference=<optimized out>, options=0x5555557a1c20, flags=<optimized out>, flags@entry=0,
    parent=parent@entry=0x55555578e030, child_class=child_class@entry=0x55555574c940 <child_of_bds>, child_role=19,
    errp=0x7fffffffe720) at ../../block.c:3747
#9  0x00005555555d1cdd in bdrv_open_child_bs (
    filename=filename@entry=0x7fffffffeca9 "gluster:///d4/d4/5be9fb50-0438-4e1e-be35-cfc3d82945ca.qcow2",
    options=options@entry=0x5555557933a0, bdref_key=bdref_key@entry=0x5555556fce6b "file",
    parent=parent@entry=0x55555578e030, child_class=child_class@entry=0x55555574c940 <child_of_bds>,
    child_role=child_role@entry=19, allow_none=true, errp=0x7fffffffe720) at ../../block.c:3387
#10 0x00005555555d14aa in bdrv_open_inherit (
    filename=filename@entry=0x7fffffffeca9 "gluster:///d4/d4/5be9fb50-0438-4e1e-be35-cfc3d82945ca.qcow2",
    reference=reference@entry=0x0, options=0x5555557933a0, options@entry=0x55555578ccc0, flags=<optimized out>,
    flags@entry=4096, parent=parent@entry=0x0, child_class=child_class@entry=0x0, child_role=0, errp=0x7fffffffe810)
    at ../../block.c:3694
#11 0x00005555555d1fb3 in bdrv_open (
    filename=filename@entry=0x7fffffffeca9 "gluster:///d4/d4/5be9fb50-0438-4e1e-be35-cfc3d82945ca.qcow2",
    reference=reference@entry=0x0, options=options@entry=0x55555578ccc0, flags=flags@entry=4096,
    errp=errp@entry=0x7fffffffe810) at ../../block.c:3840
#12 0x00005555555ead3b in blk_new_open (
    filename=0x7fffffffeca9 "gluster:///d4/d4/5be9fb50-0438-4e1e-be35-cfc3d82945ca.qcow2", reference=0x0,
    options=0x55555578ccc0, flags=4096, errp=0x7fffffffe810) at ../../block/block-backend.c:435
#13 0x00005555555b564b in img_open_file (
    filename=0x7fffffffeca9 "gluster:///d4/d4/5be9fb50-0438-4e1e-be35-cfc3d82945ca.qcow2", options=0x55555578ccc0,
    fmt=<optimized out>, flags=4096, writethrough=<optimized out>, force_share=<optimized out>,
    quiet=<optimized out>) at ../../qemu-img.c:398
#14 0x00005555555b7442 in img_check (argc=2, argv=<optimized out>) at ../../qemu-img.c:808
#15 0x00005555555b3331 in main (argc=2, argv=<optimized out>) at ../../qemu-img.c:5426
./glusterfs/stack.h
   149                 (frm)->root->gid = g;                        \
   150                 (frm)->root->ngrps = 0;                      \
   151             }                                                \
   152         } while (0);
   153
   154     struct xlator_fops;
   155
   156     static inline void
   157     FRAME_DESTROY(call_frame_t *frame)
   158     {
   159         void *local = NULL;
   160
  >161         if (frame->root->ctx->measure_latency)
   162             gf_frame_latency_update(frame);
   163
   164         list_del_init(&frame->frames);
   165         if (frame->local) {
   166             local = frame->local;
   167             frame->local = NULL;
(gdb) p *frame
$4 = {root = 0x55555586b0, parent = 0xa000000000000000, frames = {next = 0x38000055555586b7, prev = 0x55555586b7a0},
  local = 0x5000000000000000, this = 0x555555825d, ret = 0x0, ref_count = 0, lock = {spinlock = 0, mutex = {
      __data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __elision = 0,
        __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}}, cookie = 0x0,
  complete = false, op = GF_FOP_NULL, begin = {tv_sec = 0, tv_nsec = 0}, end = {tv_sec = 0, tv_nsec = 0},
  wind_from = 0x0, wind_to = 0x0, unwind_from = 0x0, unwind_to = 0x0}
(gdb) p frame->root
$5 = (call_stack_t *) 0x55555586b0
(gdb) p *frame->root
Cannot access memory at address 0x55555586b0
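
For reference, whence=3 in frames #3/#4 is SEEK_DATA (frame #2's what=GF_SEEK_DATA confirms it), so the crash happens while qemu's gluster driver probes data/hole allocation. A standalone libgfapi program should presumably hit the same path without qemu; a minimal sketch (host and file names are assumptions; only the volume name d4 is from this setup):

/* Sketch of a standalone libgfapi reproducer (build: gcc repro.c -lgfapi).
 * Host and file names below are hypothetical placeholders. */
#define _GNU_SOURCE            /* for SEEK_DATA */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <glusterfs/api/glfs.h>

int main(void)
{
    glfs_t *fs = glfs_new("d4");
    glfs_set_volfile_server(fs, "tcp", "host1", 24007);
    if (glfs_init(fs) < 0) {
        perror("glfs_init");
        return 1;
    }

    glfs_fd_t *fd = glfs_open(fs, "/test.qcow2", O_RDONLY);
    if (!fd) {
        perror("glfs_open");
        glfs_fini(fs);
        return 1;
    }

    /* Same call as frames #3/#4: seek to the first data byte at/after an
     * offset taken from the backtrace (196616, not 4096-aligned). */
    off_t off = glfs_lseek(fd, 196616, SEEK_DATA);
    printf("SEEK_DATA from 196616 -> %lld\n", (long long)off);

    glfs_close(fd);
    glfs_fini(fs);
    return 0;
}

If this also segfaults, it would confirm the bug is in glusterfs/libgfapi itself rather than in qemu's block-gluster driver.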

Expected results: normal qemu-img output, no segfault

Mandatory info:

- The output of the gluster volume info command:

gluster volume info
 
Volume Name: d4
Type: Distribute
Volume ID: c81bc14a-41f5-430b-82c1-cfca6d5bf557
Status: Started
Snapshot Count: 0
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: 2341b40e-2648-4d21-9d25-05e91a63ae5b:/storages/zfs/z122
Brick2: e9e08a03-235f-4adb-aa05-cd31084dac7c:/storages/zfs/z123
Brick3: 46e1a360-697e-4b85-b3d3-d46fffbe5b11:/storages/zfs/z120
Brick4: f97103bd-8dff-4cca-835f-33856bf30588:/storages/zfs/z121
Options Reconfigured:
storage.owner-uid: 931
storage.owner-gid: 931
server.keepalive-count: 5
server.keepalive-interval: 2
server.keepalive-time: 10
server.tcp-user-timeout: 20
network.ping-timeout: 20
cluster.choose-local: off
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
cluster.server-quorum-type: none
cluster.quorum-type: none
cluster.eager-lock: enable
performance.strict-o-direct: on
cluster.lookup-optimize: off
performance.client-io-threads: on
server.event-threads: 4
client.event-threads: 4
user.cifs: off
features.shard: off
network.remote-dio: disable
performance.low-prio-threads: 32
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on

- The output of the gluster volume status command:

gluster volume status
Status of volume: d4
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 2341b40e-2648-4d21-9d25-05e91a63ae5b:
/storages/zfs/z122                          49153     0          Y       128541
Brick e9e08a03-235f-4adb-aa05-cd31084dac7c:
/storages/zfs/z123                          49153     0          Y       127890
Brick 46e1a360-697e-4b85-b3d3-d46fffbe5b11:
/storages/zfs/z120                          49153     0          Y       134036
Brick f97103bd-8dff-4cca-835f-33856bf30588:
/storages/zfs/z121                          49153     0          Y       128475
 
Task Status of Volume d4
------------------------------------------------------------------------------
There are no active volume tasks

- The output of the gluster volume heal command:

No redundancy (pure distributed volume), so there is no heal output.

- Provide logs present on following locations of client and server nodes:

I could not see anything suspicious there, but I can provide any logs on request.

/var/log/glusterfs/

- Is there any crash? Provide the backtrace and coredump: see above.

Additional info:

  1. I patched QEMU 6.1 so that it accesses gluster volumes with byte granularity; that fixed this issue for me. I have not yet tested the original (unpatched) QEMU again, but I don't want to use it anyway, since it causes a lot of I/O errors.
  2. It seems that replicated, distributed-replicated, and dispersed volumes are not affected.

- The operating system / glusterfs version: Debian 11, Gluster 9.2-1 and 9.4-2~bpo11

olegkrutov · Apr 12 '22 15:04

I've just noticed that the problem occurs when the qcow2 virtual size is not a multiple of 4096 bytes. So even creating a file with a virtual size of, say, 1073743872 (0x40000800) bytes will cause the segfault, while a file with a size of 1073745920 (0x40001000) bytes is created and works normally.
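
Consistent with this, the seek offset in the backtrace, 196616 (0x30008), is not 4096-aligned either (196616 = 48 * 4096 + 8). A quick way to check both cases (paths and host are placeholders; the sizes are the ones above):

qemu-img create -f qcow2 /mnt/d4/bad.qcow2  1073743872   # 0x40000800, not 4096-aligned
qemu-img create -f qcow2 /mnt/d4/good.qcow2 1073745920   # 0x40001000, 4096-aligned
qemu-img info gluster://host1/d4/bad.qcow2    # segfaults
qemu-img info gluster://host1/d4/good.qcow2   # works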

olegkrutov · Apr 12 '22 22:04

May I provide some additional info to help solve this problem? I've built qemu (for Debian bullseye), or I can provide the qemu patch; it is quite trivial.

olegkrutov · Apr 17 '22 18:04

This problem also arises with qemu 7.0 from the Debian backports repo. Can anybody please help?

olegkrutov · Jun 30 '22 17:06

Thank you for your contributions. Noticed that this issue has not had any activity in the last ~6 months! We are marking this issue as stale because it has not had recent activity. It will be closed in 2 weeks if no one responds with a comment here.

stale[bot] · Mar 19 '23 22:03