...client-0: remote operation failed. [{errno=6}, {error=No such device or address}]
Hi guys.
I have a three-peer replica volume and a very weird "misbehaviour": on the fuse-mounted volume - the peers are their own clients, and the mounts are virtually identical on all three peers - even something as simple as cp fails. When I copy a qcow2 file off the mount-point to somewhere else, cp gets stuck forever and Gluster logs:
[2025-06-26 07:05:31.590334 +0000] W [MSGID: 114031] [client-rpc-fops_v2.c:1881:client4_0_seek_cbk] 0-VMsy-client-0: remote operation failed. [{errno=6}, {error=No such device or address}]
[2025-06-26 07:03:55.439601 +0000] I [MSGID: 101219] [common-utils.c:3088:gf_set_volfile_server_common] 0-gluster: duplicate entry for volfile-server [{errno=17}, {error=File exists}]
The message "W [MSGID: 114031] [client-rpc-fops_v2.c:1881:client4_0_seek_cbk] 0-VMsy-client-0: remote operation failed. [{errno=6}, {error=No such device or address}]" repeated 61495 times between [2025-06-26 07:05:31.590334 +0000] and [2025-06-26 07:05:55.377423 +0000]
[2025-06-26 07:05:55.377806 +0000] W [MSGID: 114031] [client-rpc-fops_v2.c:1881:client4_0_seek_cbk] 0-VMsy-client-0: remote operation failed. [{errno=6}, {error=No such device or address}]
The message "W [MSGID: 114031] [client-rpc-fops_v2.c:1881:client4_0_seek_cbk] 0-VMsy-client-0: remote operation failed. [{errno=6}, {error=No such device or address}]" repeated 310947 times between [2025-06-26 07:05:55.377806 +0000] and [2025-06-26 07:07:55.377755 +0000]
[2025-06-26 07:07:55.378147 +0000] W [MSGID: 114031] [client-rpc-fops_v2.c:1881:client4_0_seek_cbk] 0-VMsy-client-0: remote operation failed. [{errno=6}, {error=No such device or address}]
The message "W [MSGID: 114031] [client-rpc-fops_v2.c:1881:client4_0_seek_cbk] 0-VMsy-client-0: remote operation failed. [{errno=6}, {error=No such device or address}]" repeated 309854 times between [2025-06-26 07:07:55.378147 +0000] and [2025-06-26 07:09:55.377740 +0000]
[2025-06-26 07:09:55.378151 +0000] W [MSGID: 114031] [client-rpc-fops_v2.c:1881:client4_0_seek_cbk] 0-VMsy-client-0: remote operation failed. [{errno=6}, {error=No such device or address}]
The message "W [MSGID: 114031] [client-rpc-fops_v2.c:1881:client4_0_seek_cbk] 0-VMsy-client-0: remote operation failed. [{errno=6}, {error=No such device or address}]" repeated 311947 times between [2025-06-26 07:09:55.378151 +0000] and [2025-06-26 07:11:55.377963 +0000]
[2025-06-26 07:11:55.378332 +0000] W [MSGID: 114031] [client-rpc-fops_v2.c:1881:client4_0_seek_cbk] 0-VMsy-client-0: remote operation failed. [{errno=6}, {error=No such device or address}]
The message "W [MSGID: 114031] [client-rpc-fops_v2.c:1881:client4_0_seek_cbk] 0-VMsy-client-0: remote operation failed. [{errno=6}, {error=No such device or address}]" repeated 312811 times between [2025-06-26 07:11:55.378332 +0000] and [2025-06-26 07:13:55.377891 +0000]
...
The part which makes it weirder - besides the failed copy and the errors - is that one peer-client (10.1.1.101) succeeds where the other two fail: that one does cp just fine and its log is free of the above errors.
-> $ gluster volume info VMsy
Volume Name: VMsy
Type: Replicate
Volume ID: b843d9ea-b500-4b4c-9f0a-f2bae507d491
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: 10.1.1.100:/devs/00.GLUSTERs/VMsy
Brick2: 10.1.1.101:/devs/00.GLUSTERs/VMsy
Brick3: 10.1.1.99:/devs/00.GLUSTERs/VMsy-arbiter (arbiter)
Options Reconfigured:
cluster.self-heal-daemon: enable
cluster.entry-self-heal: on
cluster.data-self-heal: on
performance.client-io-threads: off
transport.address-family: inet
storage.fips-mode-rchecksum: on
cluster.granular-entry-heal: on
cluster.shd-max-threads: 3
features.cache-invalidation-timeout: 900
performance.cache-invalidation: on
performance.nl-cache: on
performance.nl-cache-timeout: 600
performance.parallel-readdir: on
performance.readdir-ahead: on
performance.stat-prefetch: on
storage.owner-uid: 107
storage.owner-gid: 107
I suspect it has something to do with sparse files - with large, "regular" (non-sparse) files, all three peer-clients copy off and onto the mount-point just fine. The peers' OS boxes are virtually identical, and yet that one "works" while the other two fail.
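If it is sparseness-related, it should be reproducible without a qcow2 at all: errno 6 in those log entries is ENXIO, which is what lseek() returns for a SEEK_DATA/SEEK_HOLE request it cannot satisfy, and hole-probing like that is what recent cp versions do with sparse sources. A minimal check to run on a failing peer and on 10.1.1.101 for comparison might look like this (paths are just examples):
-> $ truncate -s 1G /00-VMsy/sparse-test.img                     # a file that is one big hole
-> $ cp /00-VMsy/sparse-test.img /tmp/sparse-cp.img              # should hang and flood the log on a failing peer, if the sparse theory holds
-> $ dd if=/00-VMsy/sparse-test.img of=/tmp/sparse-dd.img bs=1M  # plain sequential read, no hole probing - a useful baseline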
The mount-points are set up with systemd mount units:
-> $ cat /etc/systemd/system/00\\x2dVMsy.mount | grep -v ^#
[Unit]
Description=00-VMsy
After=network.target
After=glusterd.service
Requires=network-online.target
[Mount]
TimeoutSec=60
What=10.1.1.100,10.1.1.101,10.1.1.99:/VMsy
Where=/00-VMsy
Type=glusterfs
Options=volume-name=VMsy,kernel-writeback-cache=1,acl,log-file=/var/log/glusterfs/mount.00-VMsy.log
[Install]
WantedBy=multi-user.target
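For comparison, a one-off manual mount on a failing peer, stripped down to the basics and using the documented backup-volfile-servers option instead of the comma-separated server list in What= (which may be what produces the "duplicate entry for volfile-server" notice above), might help rule the mount options in or out; the mount-point, log path and file name below are just examples:
-> $ mkdir -p /mnt/VMsy-test
-> $ mount -t glusterfs -o acl,backup-volfile-servers=10.1.1.101:10.1.1.99,log-file=/var/log/glusterfs/mount.VMsy-test.log 10.1.1.100:/VMsy /mnt/VMsy-test
-> $ cp /mnt/VMsy-test/some-image.qcow2 /tmp/                    # some-image.qcow2 is hypothetical - any sparse file on the volume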
If this is not a bug in Gluster, I'd very much appreciate your thoughts on what is wrong here and how to make the two "failing" peer-clients work like the box which works. Many thanks, L.
@lejeczek Which version of glusterfs is this? The log says the seek operation has issues.
The latest version available on CentOS 9 Stream - glusterfs 11.1. What I also noticed is that du on the mount-point of each peer-client showed different results, with or without --apparent-size. There are also differences in du/df on the bricks' underlying filesystems - and I don't mean arbiter vs. full bricks, but the actual full bricks: again virtually identical hardware-wise and fs/fstab-wise, yet they show differently used filesystem space.
-> $ du --apparent-size -xh --max-depth=1 /devs/00.GLUSTERs/VMsy | sort --human-numeric-sort
4.0K /devs/00.GLUSTERs/VMsy/X-NVRAMs
126G /devs/00.GLUSTERs/VMsy/.glusterfs-anonymous-inode-b843d9ea-b500-4b4c-9f0a-f2bae507d491
147G /devs/00.GLUSTERs/VMsy/.glusterfs
655G /devs/00.GLUSTERs/VMsy
vs. the same du on the other full-brick peer:
-> $ du --apparent-size -xh --max-depth=1 /devs/00.GLUSTERs/VMsy | sort --human-numeric-sort
4.0K /devs/00.GLUSTERs/VMsy/.glusterfs-anonymous-inode-b843d9ea-b500-4b4c-9f0a-f2bae507d491
4.0K /devs/00.GLUSTERs/VMsy/X-NVRAMs
495G /devs/00.GLUSTERs/VMsy/.glusterfs
924G /devs/00.GLUSTERs/VMsy
I did mv the qcow2 files off the mount-point - on the full-brick peers - and back onto it, and that "fixed" du of those files on the mount-points, but the underlying filesystems still show differences, as above.
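A direct way to see whether the remaining gap is just holes being allocated differently on each brick is to compare apparent size against allocated blocks for the same file on both full bricks, e.g. (the file name is just an example):
-> $ stat -c 'apparent=%s bytes  allocated=%b blocks of %B bytes  %n' /devs/00.GLUSTERs/VMsy/some-vm.qcow2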