
Frequent disconnects in GlusterFS logs

Captd65 opened this issue 1 year ago · 2 comments

Description of problem: In /var/log/glusterfs/bricks/file.log, entries appear every 1-2 minutes showing connections between the GlusterFS nodes being accepted and then immediately dropped. What could be causing this?

The exact command to reproduce the issue:

The full output of the command that failed:

Expected results:

[2024-05-27 11:07:48.998898 +0000] I [addr.c:54:compare_addr_and_update] 0-/data/glusterfs/glustervolume: allowed = "*", received addr = "10.14.81.72"
[2024-05-27 11:07:48.998946 +0000] I [login.c:110:gf_auth] 0-auth/login: allowed user names: 8cf1ac60-2527-466b-aefa-b92c1a0280a3
[2024-05-27 11:07:48.998960 +0000] I [MSGID: 115029] [server-handshake.c:647:server_setvolume] 0-glustervolume01-server: accepted client from CTX_ID:c3f7cef0-b31c-4112-b7ff-51780a7bd0d8-GRAPH_ID:0-PID:19372-HOST:servclstfs02-PC_NAME:glustervolume01-client-0-RECON_NO:-0 (version: 11.1) with subvol /data/glusterfs/glustervolume
[2024-05-27 11:07:49.014105 +0000] W [socket.c:754:__socket_rwv] 0-tcp.glustervolume01-server: readv on 10.14.81.72:49142 failed (No data available)
[2024-05-27 11:07:49.014159 +0000] I [MSGID: 115036] [server.c:495:server_rpc_notify] 0-glustervolume01-server: disconnecting connection [{client-uid=CTX_ID:c3f7cef0-b31c-4112-b7ff-51780a7bd0d8-GRAPH_ID:0-PID:19372-HOST:servclstfs02 -PC_NAME:glustervolume01-client-0-RECON_NO:-0}]
[2024-05-27 11:07:49.014343 +0000] I [MSGID: 101054] [client_t.c:375:gf_client_unref] 0-glustervolume01-server: Shutting down connection CTX_ID:c3f7cef0-b31c-4112-b7ff-51780a7bd0d8-GRAPH_ID:0-PID:19372-HOST:servclstfs02 -PC_NAME:glustervolume01-client-0-RECON_NO:-0
[2024-05-27 11:08:41.865887 +0000] I [addr.c:54:compare_addr_and_update] 0-/data/glusterfs/glustervolume: allowed = "*", received addr = "10.14.81.73"
[2024-05-27 11:08:41.865944 +0000] I [login.c:110:gf_auth] 0-auth/login: allowed user names: 8cf1ac60-2527-466b-aefa-b92c1a0280a3
[2024-05-27 11:08:41.865952 +0000] I [MSGID: 115029] [server-handshake.c:647:server_setvolume] 0-glustervolume01-server: accepted client from CTX_ID:7ee9e7cf-d958-4cc3-bdfa-78d0522a4a0c-GRAPH_ID:0-PID:14920-HOST:servclstarb01 -PC_NAME:glustervolume01-client-0-RECON_NO:-0 (version: 11.1) with subvol /data/glusterfs/glustervolume
[2024-05-27 11:08:41.879735 +0000] W [socket.c:754:__socket_rwv] 0-tcp.glustervolume01-server: readv on 10.14.81.73:49144 failed (No data available)
[2024-05-27 11:08:41.879800 +0000] I [MSGID: 115036] [server.c:495:server_rpc_notify] 0-glustervolume01-server: disconnecting connection [{client-uid=CTX_ID:7ee9e7cf-d958-4cc3-bdfa-78d0522a4a0c-GRAPH_ID:0-PID:14920-HOST:servclstarb01 -PC_NAME:glustervolume01-client-0-RECON_NO:-0}]
[2024-05-27 11:08:41.879954 +0000] I [MSGID: 101054] [client_t.c:375:gf_client_unref] 0-glustervolume01-server: Shutting down connection CTX_ID:7ee9e7cf-d958-4cc3-bdfa-78d0522a4a0c-GRAPH_ID:0-PID:14920-HOST:servclstarb01 -PC_NAME:glustervolume01-client-0-RECON_NO:-0
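Each cycle in the log above is the same triplet: a client is accepted (server_setvolume), the socket read then fails, and the connection is torn down. A minimal sketch for tallying how often each peer goes through this cycle, assuming the brick log path from the report:

```shell
#!/bin/sh
# Sketch: count disconnect events per client host in a brick log.
# The log path is the one from this report; adjust for your brick.
LOG=/var/log/glusterfs/bricks/file.log

# Every "disconnecting connection" line carries a HOST:<name> token;
# tallying it shows which peer is reconnecting most often.
grep 'disconnecting connection' "$LOG" \
  | grep -o 'HOST:[^ -]*' \
  | sort | uniq -c | sort -rn
```

If one host dominates the count, the periodic reconnects are coming from a process on that node rather than from the volume itself.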

Mandatory info:

- The output of the gluster volume info command:

Volume Name: glustervolume01
Type: Replicate
Volume ID: 3f7d3e1e-4403-4448-b081-40acfb52f05a
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: servclstfs01:/data/glusterfs/glustervolume
Brick2: servclstfs02:/data/glusterfs/glustervolume
Brick3: servclstarb01:/data/glusterfs/glustervolume (arbiter)
Options Reconfigured:
server.event-threads: 4
client.event-threads: 4
performance.readdir-ahead: off
performance.nl-cache-timeout: 600
performance.nl-cache: on
network.inode-lru-limit: 200000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
performance.cache-samba-metadata: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
storage.batch-fsync-delay-usec: 0
performance.parallel-readdir: off
features.inode-quota: off
features.quota: off
network.ping-timeout: 8
transport.address-family: inet
performance.client-io-threads: off
performance.io-thread-count: 8
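The options above already show both quota features off. As a sanity check (a sketch; `volinfo.txt` is a hypothetical saved copy of this `gluster volume info` output), one can grep for the two quota options and also look for a leftover quota daemon:

```shell
#!/bin/sh
# Sketch: verify both quota options read "off" in saved volume info.
# volinfo.txt is a hypothetical saved copy of `gluster volume info`.
grep -E 'features\.(inode-quota|quota):' volinfo.txt

# A stale quota daemon could keep opening brick connections even
# after quota is disabled; this should print nothing if none runs.
pgrep -a quotad || true
```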

- The output of the gluster volume status command:

Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick servclstfs01:/data/glusterfs/glustervolume 55623 0 Y 5080
Brick servclstfs02:/data/glusterfs/glustervolume 53658 0 Y 5555
Brick servclstarb01:/data/glusterfs/glustervolume 52826 0 Y 5060
Self-heal Daemon on localhost N/A N/A Y 5099
Self-heal Daemon on servclstarb01 N/A N/A Y 5089
Self-heal Daemon on servclstfs02 N/A N/A Y 5584

Task Status of Volume glustervolume01
------------------------------------------------------------------------------
There are no active volume tasks

- The output of the gluster volume heal command:

Brick servclstfs01:/data/glusterfs/glustervolume
Status: Connected
Number of entries: 0

Brick servclstfs02:/data/glusterfs/glustervolume
Status: Connected
Number of entries: 0

Brick servclstarb01:/data/glusterfs/glustervolume
Status: Connected
Number of entries: 0

- Provide logs present on following locations of client and server nodes: /var/log/glusterfs/

[2024-05-26 23:01:03.774591 +0000] I [MSGID: 106496] [glusterd-handshake.c:923:__server_getspec] 0-management: Received mount request for volume glustervolume01
[2024-05-26 23:01:15.058975 +0000] I [MSGID: 106499] [glusterd-handler.c:4536:__glusterd_handle_status_volume] 0-management: Received status volume req for volume glustervolume01

- Is there any crash? Provide the backtrace and coredump:

Additional info:

- The operating system / glusterfs version: Astra Linux 1.7.5, kernel 6.1.50

Captd65 avatar May 27 '24 11:05 Captd65

Was quota running on those nodes?

pranithk avatar Jan 22 '25 00:01 pranithk

Yes, it was initially turned on for tests, but then it was turned off.

Captd65 avatar Nov 17 '25 14:11 Captd65
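For what it's worth, this accept → failed readv → disconnect triplet is also typical of short-lived clients, such as monitoring scripts or periodic `gluster volume status`/heal checks, that open a brick connection and exit. One hedged way to test that theory is to look at the PID recorded in each accepted-client line: a different PID on every cycle points to a short-lived periodic process rather than a long-running mount. A sketch, again assuming the brick log path from the report:

```shell
#!/bin/sh
# Sketch: tally the PIDs of accepted clients; a new PID on every
# cycle suggests a short-lived periodic client (cron, monitoring).
LOG=/var/log/glusterfs/bricks/file.log

grep 'accepted client from' "$LOG" \
  | grep -o 'PID:[0-9]*' \
  | sort | uniq -c | sort -rn
```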