
Transport endpoint disconnection errors during borg backup on GlusterFS Snapshot following upgrade to 11.1

Open ironhalik opened this issue 1 year ago • 9 comments

Description of problem: Scenario: a ~3 TB GlusterFS volume with ~10 million files, hosted on three nodes (SSDs, gigabit Ethernet, 4 cores, 16 GB RAM, thinly provisioned LVM). I take a snapshot of the volume:

$ gluster snapshot create shapi shapi no-timestamp
$ gluster snapshot activate shapi

mount the snapshot on one of the nodes:

$ mount -t glusterfs "localhost:/snaps/shapi/shapi" "/mnt/snapshot/shapi"

and finally run a borg backup on the snapshot:

$ borg create --info --stats --compression lz4 --upload-ratelimit 10000 "::${datetime}" /mnt/snapshot/
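
For reference, the whole sequence can be run as one script. This is only a minimal sketch using the names and paths from the commands above; the cleanup step at the end, the BORG_REPO assumption, and the use of --mode=script to skip the confirmation prompt are mine, not part of the original report:

#!/usr/bin/env bash
set -euo pipefail

# Archive name matches the "240611-1232" style seen in the output below
datetime="$(date +%y%m%d-%H%M)"

# 1. Create and activate a named snapshot of the volume
gluster snapshot create shapi shapi no-timestamp
gluster snapshot activate shapi

# 2. Mount the activated snapshot through the local glusterd
mkdir -p /mnt/snapshot/shapi
mount -t glusterfs "localhost:/snaps/shapi/shapi" "/mnt/snapshot/shapi"

# 3. Back up the snapshot (the target repository is taken from BORG_REPO)
borg create --info --stats --compression lz4 --upload-ratelimit 10000 \
    "::${datetime}" /mnt/snapshot/

# 4. Cleanup (assumed): unmount and deactivate the snapshot again
umount /mnt/snapshot/shapi
gluster --mode=script snapshot deactivate shapi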

The full output of the command that failed:

2024-06-11 12:32:17 - Sending backup via borg
Creating archive at "ssh://<host>/data/backups/cfn-storage01::240611-1232"
/mnt/snapshot/shapi/media/preflight_thumb/2022/10/16/b37e1018-64b1-4011-be31-9c162c519244/234434-202943-9-54-59-thumb.jpg: stat: [Errno 107] Transport endpoint is not connected: '234434-202943-9-54-59-thumb.jpg'
/mnt/snapshot/shapi/media/preflight_thumb/2022/10/16/cef5bbc0-915e-4fd6-8f5e-550ae7a3be35/352919-202940-9-45-57-thumb.jpg: stat: [Errno 107] Transport endpoint is not connected: '352919-202940-9-45-57-thumb.jpg'
/mnt/snapshot/shapi/media/preflight_thumb/2022/10/16/a97d603c-273e-4e5e-8e93-56e922c07f5f/998265-202936-9-46-55-thumb.jpg: stat: [Errno 107] Transport endpoint is not connected: '998265-202936-9-46-55-thumb.jpg'
[...]
/mnt/snapshot/shapi/media/media_firm: stat: [Errno 107] Transport endpoint is not connected: 'media'
/mnt/snapshot/shapi/media/tmp_addresses_upload: stat: [Errno 107] Transport endpoint is not connected: 'tmp_addresses_upload'
------------------------------------------------------------------------------
Repository: ssh://<host>/data/backups/cfn-storage01
Archive name: 240611-1232
Archive fingerprint: a6af08a14635c68591cdb1d8356f738a103875ba229c618e7a65c2f209d6de9b
Time (start): Tue, 2024-06-11 12:32:20
Time (end):   Tue, 2024-06-11 12:44:39
Duration: 12 minutes 19.69 seconds
Number of files: 34748
Utilization of max. archive size: 0%
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:                6.00 GB              2.61 GB              1.80 MB
All archives:               55.13 TB             49.53 TB              2.71 TB

                       Unique chunks         Total chunks
Chunk index:                 5775603            121522768
------------------------------------------------------------------------------

During creation of the backup (an intense read workload), I see the following in the logs:

# mnt-snapshot-shapi.log
[...]
[2024-06-11 10:40:25.241179 +0000] W [MSGID: 114031] [client-rpc-fops_v2.c:2561:client4_0_lookup_cbk] 0-shapi-client-3: remote operation failed. [{path=/media/preflight_thumb/2022/10/14/964b4d7c-a207-42cb-8fcf-40cf645c0c66/366668-201310-13-33-28-thumb.jpg}, {gfid=bae03e89-0a85-4af1-987e-77d64153ba42}, {errno=2}, {error=No such file or directory}]
[2024-06-11 10:40:25.247712 +0000] W [MSGID: 114061] [client-common.c:530:client_pre_flush_v2] 0-shapi-client-3: remote_fd is -1. EBADFD [{gfid=bae03e89-0a85-4af1-987e-77d64153ba42}, {errno=77}, {error=File descriptor in bad state}]
[2024-06-11 10:40:28.847614 +0000] I [MSGID: 114018] [client.c:2242:client_rpc_notify] 0-shapi-client-1: disconnected from client, process will keep trying to connect glusterd until brick's port is available [{conn-name=shapi-client-1}]
[2024-06-11 10:40:28.847984 +0000] E [rpc-clnt.c:313:saved_frames_unwind] (--> /lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1db)[0x773bf525741b] (--> /lib/x86_64-linux-gnu/libgfrpc.so.0(+0x751f)[0x773bf51f551f] (--> /lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x11d)[0x773bf51fc8bd] (--> /lib/x86_64-linux-gnu/libgfrpc.so.0(+0xf6c0)[0x773bf51fd6c0] (--> /lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_transport_notify+0x27)[0x773bf51f8767] ))))) 0-shapi-client-1: forced unwinding frame type(GlusterFS 4.x v1) op(OPENDIR(20)) called at 2024-06-11 10:40:27 +0000 (xid=0x320b8)
[2024-06-11 10:40:28.848004 +0000] E [MSGID: 114031] [client-rpc-fops_v2.c:2470:client4_0_opendir_cbk] 0-shapi-client-1: remote operation failed. [{path=/media/preflight_thumb/2022/10/14/c5c3e629-e040-4b86-9a38-f4d1f92fffc9}, {gfid=88550deb-8eb1-4681-b09c-afce7a74e406}, {errno=107}, {error=Transport endpoint is not connected}]
[2024-06-11 10:40:31.852180 +0000] E [MSGID: 114058] [client-handshake.c:946:client_query_portmap_cbk] 0-shapi-client-1: failed to get the port number for remote subvolume. Please run gluster volume status on server to see if brick process is running []
[2024-06-11 10:40:31.852220 +0000] I [MSGID: 114018] [client.c:2242:client_rpc_notify] 0-shapi-client-1: disconnected from client, process will keep trying to connect glusterd until brick's port is available [{conn-name=shapi-client-1}]
[2024-06-11 10:40:33.790369 +0000] W [MSGID: 108027] [afr-common.c:2889:afr_attempt_readsubvol_set] 0-69d767ccba4f4f9bb1dccc3282f8942b-replicate-0: no read subvols for /media/preflight_thumb/2022/10/16/b37e1018-64b1-4011-be31-9c162c519244/234434-202943-9-54-59-thumb.jpg
[2024-06-11 10:40:33.790392 +0000] W [fuse-bridge.c:1052:fuse_entry_cbk] 0-glusterfs-fuse: 417019: LOOKUP() /media/preflight_thumb/2022/10/16/b37e1018-64b1-4011-be31-9c162c519244/234434-202943-9-54-59-thumb.jpg => -1 (Transport endpoint is not connected)
[2024-06-11 10:40:33.824006 +0000] W [MSGID: 108027] [afr-common.c:2889:afr_attempt_readsubvol_set] 0-69d767ccba4f4f9bb1dccc3282f8942b-replicate-0: no read subvols for /media/preflight_thumb/2022/10/16/cef5bbc0-915e-4fd6-8f5e-550ae7a3be35/352919-202940-9-45-57-thumb.jpg
[2024-06-11 10:40:33.824022 +0000] W [fuse-bridge.c:1052:fuse_entry_cbk] 0-glusterfs-fuse: 417089: LOOKUP() /media/preflight_thumb/2022/10/16/cef5bbc0-915e-4fd6-8f5e-550ae7a3be35/352919-202940-9-45-57-thumb.jpg => -1 (Transport endpoint is not connected)
[...]
# bricks/run-gluster-snaps-<id>-brick3-gluster-internal-shapi.log
[...]
[2024-06-11 10:32:15.040076 +0000] I [MSGID: 115029] [server-handshake.c:645:server_setvolume] 0-69d767ccba4f4f9bb1dccc3282f8942b-server: accepted client from CTX_ID:d1788551-5718-4510-86c1-b23fcd70ba58-GRAPH_ID:0-PID:538971-HOST:cfn-storage01-PC_NAME:shapi-client-3-RECON_NO:-0 (version: 11.1) with subvol /run/gluster/snaps/69d767ccba4f4f9bb1dccc3282f8942b/brick3/gluster-internal/shapi
[2024-06-11 10:32:15.055272 +0000] I [rpcsvc.c:849:rpcsvc_handle_rpc_call] 0-rpc-service: spawned a request handler thread for queue 0
[2024-06-11 10:32:41.129354 +0000] I [rpcsvc.c:849:rpcsvc_handle_rpc_call] 0-rpc-service: spawned a request handler thread for queue 1

[backup happening]

[2024-06-11 10:45:22.850187 +0000] W [socket.c:751:__socket_rwv] 0-tcp.69d767ccba4f4f9bb1dccc3282f8942b-server: readv on 127.0.0.1:49138 failed (No data available)
[2024-06-11 10:45:22.850406 +0000] I [MSGID: 115036] [server.c:494:server_rpc_notify] 0-69d767ccba4f4f9bb1dccc3282f8942b-server: disconnecting connection [{client-uid=CTX_ID:d1788551-5718-4510-86c1-b23fcd70ba58-GRAPH_ID:0-PID:538971-HOST:cfn-storage01-PC_NAME:shapi-client-3-RECON_NO:-0}]
[2024-06-11 10:45:22.850660 +0000] I [MSGID: 101054] [client_t.c:374:gf_client_unref] 0-69d767ccba4f4f9bb1dccc3282f8942b-server: Shutting down connection CTX_ID:d1788551-5718-4510-86c1-b23fcd70ba58-GRAPH_ID:0-PID:538971-HOST:cfn-storage01-PC_NAME:shapi-client-3-RECON_NO:-0
[2024-06-11 10:45:24.041267 +0000] W [glusterfsd.c:1427:cleanup_and_exit] (-->/lib/x86_64-linux-gnu/libc.so.6(+0x9ca94) [0x7b28bfc9ca94] -->/usr/sbin/glusterfsd(+0x13805) [0x596b8e039805] -->/usr/sbin/glusterfsd(cleanup_and_exit+0x67) [0x596b8e031987] ) 0-: received signum (15), shutting down
# glusterd.log
[2024-06-11 12:51:05.954732 +0000] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2024-06-11 12:51:06.277648 +0000] I [MSGID: 106496] [glusterd-handshake.c:922:__server_getspec] 0-management: Received mount request for volume /snaps/shapi/shapi
[2024-06-11 12:51:06.303140 +0000] I [MSGID: 106499] [glusterd-handler.c:4535:__glusterd_handle_status_volume] 0-management: Received status volume req for volume shapi
[2024-06-11 12:51:07.822320 +0000] I [glusterd-utils.c:6446:glusterd_brick_start] 0-management: starting a fresh brick process for brick /run/gluster/snaps/8ef9e88c97a7482d8c2417ac578cfa89/brick3/gluster-internal/shops
[2024-06-11 12:51:07.368341 +0000] I [MSGID: 106499] [glusterd-handler.c:4535:__glusterd_handle_status_volume] 0-management: Received status volume req for volume shapi
[2024-06-11 12:51:07.829305 +0000] I [MSGID: 106496] [glusterd-handshake.c:922:__server_getspec] 0-management: Received mount request for volume /snaps/shops/8ef9e88c97a7482d8c2417ac578cfa89/8ef9e88c97a7482d8c2417ac578cfa89.cfn-storage01.run-gluster-snaps-8ef9e88c97a7482d8c2417ac578cfa89-brick3-gluster-internal-shops
[2024-06-11 12:51:07.829335 +0000] W [MSGID: 106048] [glusterd-handshake.c:114:get_snap_volname_and_volinfo] 0-management: failed to get the volinfo for the volume 8ef9e88c97a7482d8c2417ac578cfa89
[2024-06-11 12:51:07.864328 +0000] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2024-06-11 12:51:08.706969 +0000] I [MSGID: 106499] [glusterd-handler.c:4535:__glusterd_handle_status_volume] 0-management: Received status volume req for volume shapi
The message "I [MSGID: 106499] [glusterd-handler.c:4535:__glusterd_handle_status_volume] 0-management: Received status volume req for volume shapi" repeated 88 times between [2024-06-11 12:51:08.706969 +0000] and [2024-06-11 12:52:42.568572 +0000]
[2024-06-11 12:52:43.631935 +0000] I [MSGID: 106499] [glusterd-handler.c:4535:__glusterd_handle_status_volume] 0-management: Received status volume req for volume shapi
The message "I [MSGID: 106499] [glusterd-handler.c:4535:__glusterd_handle_status_volume] 0-management: Received status volume req for volume shapi" repeated 112 times between [2024-06-11 12:52:43.631935 +0000] and [2024-06-11 12:54:43.055195 +0000]
[2024-06-11 12:54:44.118804 +0000] I [MSGID: 106499] [glusterd-handler.c:4535:__glusterd_handle_status_volume] 0-management: Received status volume req for volume shapi
The message "I [MSGID: 106499] [glusterd-handler.c:4535:__glusterd_handle_status_volume] 0-management: Received status volume req for volume shapi" repeated 111 times between [2024-06-11 12:54:44.118804 +0000] and [2024-06-11 12:56:42.588066 +0000]
[2024-06-11 12:56:43.658896 +0000] I [MSGID: 106499] [glusterd-handler.c:4535:__glusterd_handle_status_volume] 0-management: Received status volume req for volume shapi
The message "I [MSGID: 106499] [glusterd-handler.c:4535:__glusterd_handle_status_volume] 0-management: Received status volume req for volume shapi" repeated 112 times between [2024-06-11 12:56:43.658896 +0000] and [2024-06-11 12:58:43.172517 +0000]
[2024-06-11 12:58:44.236678 +0000] I [MSGID: 106499] [glusterd-handler.c:4535:__glusterd_handle_status_volume] 0-management: Received status volume req for volume shapi
The message "I [MSGID: 106499] [glusterd-handler.c:4535:__glusterd_handle_status_volume] 0-management: Received status volume req for volume shapi" repeated 111 times between [2024-06-11 12:58:44.236678 +0000] and [2024-06-11 13:00:42.821282 +0000]
[2024-06-11 13:00:43.884134 +0000] I [MSGID: 106499] [glusterd-handler.c:4535:__glusterd_handle_status_volume] 0-management: Received status volume req for volume shapi

The issues start after ~10 minutes of read operations. Interestingly, if I modify the volume with:

gluster volume set shapi group metadata-cache
gluster volume set shapi performance.readdir-ahead on
gluster volume set shapi performance.parallel-readdir on
gluster volume set shapi client.event-threads 3
gluster volume set shapi server.event-threads 3
gluster volume set shapi cluster.lookup-optimize on
gluster volume set shapi performance.cache-size 512MB

it still fails, but only after about an hour, so the volume seems somewhat more resilient with these options.
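
If it helps with reproducing, the same option changes can be applied and double-checked in one loop. A minimal sketch, assuming gluster volume get is available (as it is in 11.x) to confirm each value:

vol=shapi
gluster volume set "$vol" group metadata-cache
for opt in performance.readdir-ahead=on performance.parallel-readdir=on \
           client.event-threads=3 server.event-threads=3 \
           cluster.lookup-optimize=on performance.cache-size=512MB; do
    # Apply the option, then read it back to confirm it took effect
    gluster volume set "$vol" "${opt%%=*}" "${opt#*=}"
    gluster volume get "$vol" "${opt%%=*}"
done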

Expected results: all read operations succeed and the backup completes.

Mandatory info:

- The output of the gluster volume info command:

root@cfn-storage01:~# gluster volume info shapi

Volume Name: shapi
Type: Replicate
Volume ID: 5e4983aa-13a6-4d29-b148-eb8d2c995255
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: cfn-storage02:/data/gluster-internal/shapi
Brick2: cfn-storage03:/data/gluster-internal/shapi
Brick3: cfn-storage01:/data/gluster-internal/shapi
Options Reconfigured:
transport.address-family: inet
cluster.shd-wait-qlength: 2048
cluster.shd-max-threads: 4
features.barrier: disable
performance.client-io-threads: off
auth.allow: 10.*
storage.fips-mode-rchecksum: on

- The output of the gluster volume status command:

root@cfn-storage01:~# gluster volume status shapi
Status of volume: shapi
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick cfn-storage02:/data/gluster-internal/
shapi                                       59557     0          Y       158993
Brick cfn-storage03:/data/gluster-internal/
shapi                                       59794     0          Y       147742
Brick cfn-storage01:/data/gluster-internal/
shapi                                       60209     0          Y       496521
Self-heal Daemon on localhost               N/A       N/A        Y       12455
Self-heal Daemon on cfn-storage03           N/A       N/A        Y       11762
Self-heal Daemon on cfn-storage02           N/A       N/A        Y       6343

Task Status of Volume shapi
------------------------------------------------------------------------------
There are no active volume tasks

- The output of the gluster volume heal command: Note: there is currently a heal operation in progress, but the issue was happening before it as well.
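
For reference, the heal state can be checked with commands along these lines (a sketch only; the output is omitted above because a heal was in progress at the time):

gluster volume heal shapi info summary
gluster volume heal shapi info split-brain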

- Provide logs present on the following locations of client and server nodes: /var/log/glusterfs/

- Is there any crash? Provide the backtrace and coredump.

Additional info:

- The operating system / glusterfs version: Ubuntu 24.04, Gluster 11.1. Recently upgraded from 10.5, which worked fine.

ironhalik avatar Jun 11 '24 13:06 ironhalik

I've seen this "File descriptor in bad state" issue pop up several times as well; it causes files not to heal. There's another bug report somewhere around here mentioning it. A temporary workaround is to restart the volume (gluster volume stop <volname> && gluster volume start <volname>).
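
Spelled out, the workaround looks roughly like this (a sketch only; volname is a placeholder for the affected volume, and --mode=script just suppresses the interactive confirmation prompts):

# Full stop/start of the affected volume; clients may need to remount afterwards
gluster --mode=script volume stop volname
gluster --mode=script volume start volname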

babipanghang avatar Jul 01 '24 15:07 babipanghang

Unfortunately, restarting the volume doesn't help in any way.

ironhalik avatar Jul 01 '24 15:07 ironhalik

We have the same issue (it also pops up when using Borg, as well as during some other operations), but for us it's glusterfs-server 10.5 (10.5-ubuntu1~jammy1), running on Ubuntu 22.04.

We updated to 10.5-ubuntu1~jammy1 on 2023-11-28 and since then we have been having these issues; no issues before.

GlusterFS is mounted with an entry similar to this in /etc/fstab (options are identical, just masking the FQDN and mount names):

storage10.example.com:/gluster-data /mnt/gluster-data glusterfs defaults,_netdev 0 0

Basically the same situation as the OP, though for us it's 6 nodes and 3 arbiters (Number of Bricks: 3 x (2 + 1) = 9). No heals in progress, and under regular usage it performs normally.

Red Hat also seems to have a similar issue open, though it appears to have been open for a while: https://access.redhat.com/solutions/5089741

I don't know whether it is related to this issue, though, because I cannot see the details without an account.

Restarting the volumes doesn't fix it.

svenBrumble avatar Aug 07 '24 14:08 svenBrumble

@svenBrumble - upgrade from which version?

mykaul avatar Aug 07 '24 14:08 mykaul

@mykaul Hard to figure out since it doesn't seem to have been documented. Can't find any reference to the previous version in the logs (probably because it's more than 6 months ago). But since automatic updates are enabled on the servers, it has to be the one before 10.5. Our package source is

http://ppa.launchpad.net/gluster/glusterfs-10/ubuntu jammy main

Edit: My guess is that it was 10.4-ubuntu1~jammy1.

svenBrumble avatar Aug 07 '24 14:08 svenBrumble

There are very few changes between 10.4 and 10.5 - see https://github.com/gluster/glusterfs/compare/v10.4...v10.5. My suspicion is that it's been there before, and you are hitting it just because you've restarted Gluster or updated the OS or something.

mykaul avatar Aug 07 '24 15:08 mykaul

That might well be. Sorry I cannot provide more information. I'll try to get more details from our other clusters. We have three of them: two with 9 bricks each in the 3 x (2 + 1) config and one with 3 bricks in a 2 + 1 config. All are affected.

svenBrumble avatar Aug 07 '24 15:08 svenBrumble

[2024-06-11 10:45:22.850187 +0000] W [socket.c:751:__socket_rwv] 0-tcp.69d767ccba4f4f9bb1dccc3282f8942b-server: readv on 127.0.0.1:49138 failed (No data available)

It's a pity the numeric errno is not printed, only its strerror() - I can't immediately map this description ('No data available') back to an errno. My guess is that at https://github.com/gluster/glusterfs/blob/ea377b99fdccc96c4d278f5892455425a70e70c2/rpc/rpc-transport/socket/src/socket.c#L727 we set errno to ENODATA, and that trickles down to this code. Perhaps we should ignore this - a read of 0 may be legitimate in some cases?

mykaul avatar Aug 07 '24 15:08 mykaul

Having the same issue on Ubuntu 24.04 with Gluster 11.1 and a "Number of Bricks: 6 x (2 + 1) = 18" configuration, running borg.

Previously I was using the latest Gluster 9.x with no issues for at least two years.

Interestingly, so far only two nodes have been failing: sometimes just a single one, sometimes both at the same time, which causes trouble.

On the failing node, I can stop glusterd and kill all glusterfsd processes. ps ax | grep glusterfsd will then still show the problematic glusterfsd, because it refuses to die. You need to kill -s SIGKILL the PID of this process, then gracefully shut down the self-heal daemon (if applicable), then restart glusterd, and you can resume operations. (If only one node is affected, everything stays online.)
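
For reference, a minimal sketch of that procedure, assuming a systemd-based install and that the self-heal daemon's command line contains "glustershd" (both assumptions on my part):

# Recovery on the affected node only; the remaining nodes keep serving
systemctl stop glusterd               # stop the management daemon first
pkill glusterfsd || true              # politely ask all brick processes to exit
sleep 10

# Any brick process still alive at this point is the stuck one; force-kill it
for pid in $(pgrep glusterfsd); do
    kill -s SIGKILL "$pid"
done

# Gracefully shut down the self-heal daemon, if one is running
pkill -TERM -f glustershd || true

systemctl start glusterd              # bring the node back and resume operations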

The two failing nodes still run a 5.15-branch kernel (scheduled for upgrade); the other node has already been upgraded to the 6.8 branch and has not shown this problem so far.

Will update this issue when the kernel upgrade has happened on the other machines.

apunkt avatar Oct 15 '24 10:10 apunkt