Frequent disconnects in GlusterFS logs
Description of problem: In /var/log/glusterfs/bricks/file.log, entries appear every 1-2 minutes showing clients connecting to and then immediately disconnecting from the GlusterFS nodes. What could be causing this?
The exact command to reproduce the issue:
The full output of the command that failed:
Expected results:
[2024-05-27 11:07:48.998898 +0000] I [addr.c:54:compare_addr_and_update] 0-/data/glusterfs/glustervolume: allowed = "*", received addr = "10.14.81.72"
[2024-05-27 11:07:48.998946 +0000] I [login.c:110:gf_auth] 0-auth/login: allowed user names: 8cf1ac60-2527-466b-aefa-b92c1a0280a3
[2024-05-27 11:07:48.998960 +0000] I [MSGID: 115029] [server-handshake.c:647:server_setvolume] 0-glustervolume01-server: accepted client from CTX_ID:c3f7cef0-b31c-4112-b7ff-51780a7bd0d8-GRAPH_ID:0-PID:19372-HOST:servclstfs02-PC_NAME:glustervolume01-client-0-RECON_NO:-0 (version: 11.1) with subvol /data/glusterfs/glustervolume
[2024-05-27 11:07:49.014105 +0000] W [socket.c:754:__socket_rwv] 0-tcp.glustervolume01-server: readv on 10.14.81.72:49142 failed (No data available)
[2024-05-27 11:07:49.014159 +0000] I [MSGID: 115036] [server.c:495:server_rpc_notify] 0-glustervolume01-server: disconnecting connection [{client-uid=CTX_ID:c3f7cef0-b31c-4112-b7ff-51780a7bd0d8-GRAPH_ID:0-PID:19372-HOST:servclstfs02-PC_NAME:glustervolume01-client-0-RECON_NO:-0}]
[2024-05-27 11:07:49.014343 +0000] I [MSGID: 101054] [client_t.c:375:gf_client_unref] 0-glustervolume01-server: Shutting down connection CTX_ID:c3f7cef0-b31c-4112-b7ff-51780a7bd0d8-GRAPH_ID:0-PID:19372-HOST:servclstfs02-PC_NAME:glustervolume01-client-0-RECON_NO:-0
[2024-05-27 11:08:41.865887 +0000] I [addr.c:54:compare_addr_and_update] 0-/data/glusterfs/glustervolume: allowed = "*", received addr = "10.14.81.73"
[2024-05-27 11:08:41.865944 +0000] I [login.c:110:gf_auth] 0-auth/login: allowed user names: 8cf1ac60-2527-466b-aefa-b92c1a0280a3
[2024-05-27 11:08:41.865952 +0000] I [MSGID: 115029] [server-handshake.c:647:server_setvolume] 0-glustervolume01-server: accepted client from CTX_ID:7ee9e7cf-d958-4cc3-bdfa-78d0522a4a0c-GRAPH_ID:0-PID:14920-HOST:servclstarb01-PC_NAME:glustervolume01-client-0-RECON_NO:-0 (version: 11.1) with subvol /data/glusterfs/glustervolume
[2024-05-27 11:08:41.879735 +0000] W [socket.c:754:__socket_rwv] 0-tcp.glustervolume01-server: readv on 10.14.81.73:49144 failed (No data available)
[2024-05-27 11:08:41.879800 +0000] I [MSGID: 115036] [server.c:495:server_rpc_notify] 0-glustervolume01-server: disconnecting connection [{client-uid=CTX_ID:7ee9e7cf-d958-4cc3-bdfa-78d0522a4a0c-GRAPH_ID:0-PID:14920-HOST:servclstarb01-PC_NAME:glustervolume01-client-0-RECON_NO:-0}]
[2024-05-27 11:08:41.879954 +0000] I [MSGID: 101054] [client_t.c:375:gf_client_unref] 0-glustervolume01-server: Shutting down connection CTX_ID:7ee9e7cf-d958-4cc3-bdfa-78d0522a4a0c-GRAPH_ID:0-PID:14920-HOST:servclstarb01-PC_NAME:glustervolume01-client-0-RECON_NO:-0
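For reference, a rough way to quantify this pattern from the brick log (a sketch only; "file.log" above looks like a placeholder, so substitute the actual brick log filename):

# Count disconnect events per minute (should show the ~1-2 minute cycle)
grep 'disconnecting connection' /var/log/glusterfs/bricks/file.log | cut -c2-17 | sort | uniq -c

# Show which peers keep reconnecting (HOST: field from the handshake lines; assumes hostnames without hyphens)
grep 'accepted client from' /var/log/glusterfs/bricks/file.log | grep -o 'HOST:[^-]*' | sort | uniq -c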
Mandatory info:
- The output of the gluster volume info command:
Volume Name: glustervolume01
Type: Replicate
Volume ID: 3f7d3e1e-4403-4448-b081-40acfb52f05a
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: servclstfs01:/data/glusterfs/glustervolume
Brick2: servclstfs02:/data/glusterfs/glustervolume
Brick3: servclstarb01:/data/glusterfs/glustervolume (arbiter)
Options Reconfigured:
server.event-threads: 4
client.event-threads: 4
performance.readdir-ahead: off
performance.nl-cache-timeout: 600
performance.nl-cache: on
network.inode-lru-limit: 200000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
performance.cache-samba-metadata: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
storage.batch-fsync-delay-usec: 0
performance.parallel-readdir: off
features.inode-quota: off
features.quota: off
network.ping-timeout: 8
transport.address-family: inet
performance.client-io-threads: off
performance.io-thread-count: 8
- The output of the gluster volume status command:
Gluster process                                      TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick servclstfs01:/data/glusterfs/glustervolume     55623     0          Y       5080
Brick servclstfs02:/data/glusterfs/glustervolume     53658     0          Y       5555
Brick servclstarb01:/data/glusterfs/glustervolume    52826     0          Y       5060
Self-heal Daemon on localhost                        N/A       N/A        Y       5099
Self-heal Daemon on servclstarb01                    N/A       N/A        Y       5089
Self-heal Daemon on servclstfs02                     N/A       N/A        Y       5584
Task Status of Volume glustervolume01
------------------------------------------------------------------------------
There are no active volume tasks
- The output of the gluster volume heal command:
Brick servclstfs01:/data/glusterfs/glustervolume
Status: Connected
Number of entries: 0
Brick servclstfs02:/data/glusterfs/glustervolume
Status: Connected
Number of entries: 0
Brick servclstarb01:/data/glusterfs/glustervolume
Status: Connected
Number of entries: 0
- Provide logs present on following locations of client and server nodes - /var/log/glusterfs/
[2024-05-26 23:01:03.774591 +0000] I [MSGID: 106496] [glusterd-handshake.c:923:__server_getspec] 0-management: Received mount request for volume glustervolume01
[2024-05-26 23:01:15.058975 +0000] I [MSGID: 106499] [glusterd-handler.c:4536:__glusterd_handle_status_volume] 0-management: Received status volume req for volume glustervolume01
- Is there any crash? Provide the backtrace and coredump
Additional info:
- The operating system / glusterfs version: Astra Linux 1.7.5, kernel 6.1.50
Note: Please hide any confidential data which you don't want to share in public like IP address, file name, hostname or any other configuration
Was quota running on those nodes?
Yes, it was initially enabled for testing, but was later turned off.
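For completeness, the current quota state can be re-checked with the standard gluster CLI (volume name as in the output above); both options should report off, matching the reconfigured options listed earlier:

gluster volume get glustervolume01 features.quota
gluster volume get glustervolume01 features.inode-quota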