
glusterfs client repeatedly tries to establish a connection with glusterfsd

Open · jifengzhou opened this issue 2 years ago · 1 comment

I have a 3-replica volume and found that, while all bricks are online, the number of file entries that need to be healed keeps increasing.

Volume information and status are as follows:

```
[root@node1 ~]# gluster volume info

Volume Name: ctest
Type: Replicate
Volume ID: 6cb54a68-785f-443b-b32c-e3d68a95b263
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: node1:/root/ctest
Brick2: node2:/root/ctest2
Brick3: node3:/root/ctest2
Options Reconfigured:
cluster.granular-entry-heal: enable
cluster.quorum-type: auto
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet
storage.fips-mode-rchecksum: on

[root@node1 ~]# gluster volume status
Status of volume: ctest
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick node1:/root/ctest                     49152     0          Y       9369
Brick node2:/root/ctest2                    49152     0          Y       9412
Brick node3:/root/ctest2                    49152     0          Y       9501
Self-heal Daemon on localhost               N/A       N/A        Y       9380
Self-heal Daemon on node2                   N/A       N/A        Y       9431
Self-heal Daemon on node3                   N/A       N/A        Y       9530

Task Status of Volume ctest
------------------------------------------------------------------------------
There are no active volume tasks
```

The volume status above looks normal. However, querying with "gluster volume heal ctest info" shows that, every so often, new file entries appear that need to be healed.
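For reference, a minimal way to watch this backlog grow is to poll heal info periodically and log it with a timestamp; this is just a sketch, and the 60-second interval is arbitrary:

```sh
# Poll the heal backlog of the ctest volume every 60 seconds so that the
# growth of "Number of entries" can later be correlated with the client logs.
while true; do
    echo "=== $(date) ==="
    gluster volume heal ctest info | grep -E '^(Brick|Number of entries)'
    sleep 60
done
```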

Inspecting the extended attributes of the files pending heal shows that the copies on node1 and node2 accuse the copy on node3, so I suspect a fault between the client and the brick process (glusterfsd) on node3.
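As an illustration of how that check is typically done with getfattr on the bricks: the relative file path below is a placeholder taken from the heal info output, and the AFR changelog xattrs are normally named trusted.afr.&lt;volume&gt;-client-&lt;index&gt;, so non-zero counters in trusted.afr.ctest-client-2 (index 2 corresponding to the third brick, node3) on node1 and node2 would mean those copies blame node3's copy:

```sh
# Placeholder: relative path of one entry reported by "gluster volume heal ctest info".
FILE="dir/pending-file"

# On node1 (brick path /root/ctest):
getfattr -d -m trusted.afr -e hex "/root/ctest/$FILE"

# On node2 and node3 (brick path /root/ctest2):
#   getfattr -d -m trusted.afr -e hex "/root/ctest2/$FILE"
```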

Use the "lsof -i" command to query the connection status of each host client and node3's glusterfsd: node1: [root@node1 ~]# ps -ef | grep glusterfs | grep fuse root 25812 1 0 23:12 ? 00:00:00 /usr/sbin/glusterfs --process-name fuse --volfile-server=node1 --volfile-id=/ctest /mnt [root@node1 ~]# lsof -i | grep 25812 glusterfs 25812 root 9u IPv4 265702 0t0 TCP node1:1019->node1:24007 (ESTABLISHED) glusterfs 25812 root 10u IPv4 269554 0t0 TCP node1:1014->node2:49152 (ESTABLISHED) glusterfs 25812 root 14u IPv4 270343 0t0 TCP node1:1016->node1:49152 (ESTABLISHED)

Note: the connection to node3's glusterfsd (port 49152) is missing.

node2:

```
[root@node2 ~]# ps -ef | grep glusterfs | grep fuse
root     25869     1  0 23:12 ?        00:00:00 /usr/sbin/glusterfs --process-name fuse --volfile-server=node1 --volfile-id=/ctest /mnt
[root@node2 ~]# lsof -i | grep 25869
glusterfs 25869 root    9u  IPv4 231458  0t0  TCP node2:1023->node1:24007 (ESTABLISHED)
glusterfs 25869 root   10u  IPv4 227221  0t0  TCP node2:1014->node3:49152 (ESTABLISHED)
glusterfs 25869 root   11u  IPv4 227219  0t0  TCP node2:1015->node2:49152 (ESTABLISHED)
glusterfs 25869 root   14u  IPv4 230687  0t0  TCP node2:1019->node1:49152 (ESTABLISHED)
```

node3:

```
[root@node3 ~]# ps -ef | grep glusterfs | grep fuse
root     25919     1  0 23:12 ?        00:00:00 /usr/sbin/glusterfs --process-name fuse --volfile-server=node1 --volfile-id=/ctest /mnt
[root@node3 ~]# lsof -i | grep 25919
glusterfs 25919 root    9u  IPv4 228268  0t0  TCP node3:1019->node1:24007 (ESTABLISHED)
glusterfs 25919 root   10u  IPv4 226218  0t0  TCP node3:1013->node3:49152 (ESTABLISHED)
glusterfs 25919 root   11u  IPv4 228819  0t0  TCP node3:1014->node2:49152 (ESTABLISHED)
glusterfs 25919 root   13u  IPv4 228282  0t0  TCP node3:1016->node1:49152 (ESTABLISHED)
```

From the above it can be seen that the client on node1 has not established a socket connection with the glusterfsd on node3, while the clients on node2 and node3 have established connections with the glusterfsd processes on all hosts.
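A small sketch of how the same check can be repeated from one node, assuming passwordless ssh between the nodes and using ss (from iproute2) in place of lsof:

```sh
# For each node, find the fuse client PID and list its connections to brick
# port 49152; a node whose output has no line with node3's address is missing
# the connection to node3's glusterfsd.
for host in node1 node2 node3; do
    echo "--- $host ---"
    ssh "$host" '
        pid=$(pgrep -f -- "--process-name fuse")
        ss -tnp | grep "pid=$pid," | grep ":49152"
    '
done
```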

After setting the volume option diagnostics.client-log-level to TRACE and watching the client log on node1, I found a large number of "attempting reconnect" and "connect () called on transport already connected" messages. Tracing further with gdb, I set a breakpoint at the following code:

```c
if (priv->sock >= 0) {
    gf_log_callingfn(this->name, GF_LOG_TRACE,
                     "connect () called on transport "
                     "already connected");
    errno = EINPROGRESS;
    ret = -1;
    goto unlock;
}
```

Printing the contents of the priv pointer at that point gives:

```
(gdb) print *priv
$3 = {
  { ioq = { next = 0x7f40a80ad0a0, prev = 0x7f40a80ad0a0 },
    { ioq_next = 0x7f40a80ad0a0, ioq_prev = 0x7f40a80ad0a0 } },
  out_lock = { __data = { __lock = 1, __count = 0, __owner = 25994, __nusers = 1, __kind = 0,
      __spins = 0, __elision = 0, __list = { __prev = 0x0, __next = 0x0 } },
    __size = "\001\000\000\000\000\000\000\000\212e\000\000\001", '\000' <repeats 26 times>, __align = 1 },
  cond_lock = { __data = { __lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0,
      __spins = 0, __elision = 0, __list = { __prev = 0x0, __next = 0x0 } },
    __size = '\000' <repeats 39 times>, __align = 0 },
  cond = { __data = { __lock = 0, __futex = 0, __total_seq = 0, __wakeup_seq = 0, __woken_seq = 0,
      __mutex = 0x0, __nwaiters = 0, __broadcast_seq = 0 },
    __size = '\000' <repeats 47 times>, __align = 0 },
  thread = 0, windowsize = 0, keepalive = 1, keepaliveidle = 20, keepaliveintvl = 2,
  keepalivecnt = 9, timeout = 0, log_ctr = 0, shutdown_log_ctr = 0, ssl_error_required = 0,
  _ref = { cnt = 0, release = 0x0, data = 0x0 },
  notify = { lock = { __data = { __lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0,
        __spins = 0, __elision = 0, __list = { __prev = 0x0, __next = 0x0 } },
      __size = '\000' <repeats 39 times>, __align = 0 },
    cond = { __data = { __lock = 0, __futex = 0, __total_seq = 0, __wakeup_seq = 0, __woken_seq = 0,
        __mutex = 0x0, __nwaiters = 0, __broadcast_seq = 0 },
      __size = '\000' <repeats 47 times>, __align = 0 },
    in_progress = 0 },
  sock = 5, idx = 1, gen = 340618, backlog = 1024,
  ssl_meth = 0x0, ssl_ctx = 0x0, ssl_session_id = 0, ssl_sbio = 0x0, ssl_ssl = 0x0,
  ssl_own_cert = 0x0, ssl_private_key = 0x0, ssl_ca_list = 0x0, crl_path = 0x0,
  pipe = {0, 0},
  incoming = { proghdr_base_addr = 0x0, iobuf = 0x0, iobuf_size = 0, count = 0,
    frag = { fragcurrent = 0x0, bytes_read = 0, remaining_size = 0,
      vector = { iov_base = 0x0, iov_len = 0 }, pending_vector = 0x0,
      call_body = { request = { header_state = SP_STATE_REQUEST_HEADER_INIT,
          vector_state = SP_STATE_VECTORED_REQUEST_INIT, vector_sizer_state = 0 },
        reply = { accept_status = 0, status_state = SP_STATE_VECTORED_REPLY_STATUS_INIT,
          accepted_state = SP_STATE_ACCEPTED_REPLY_INIT,
          accepted_success_state = SP_STATE_ACCEPTED_SUCCESS_REPLY_INIT } },
      simple_state = SP_STATE_SIMPLE_MSG_INIT, state = SP_STATE_RPCFRAG_INIT },
    vector = {{ iov_base = 0x0, iov_len = 0 }, { iov_base = 0x0, iov_len = 0 }},
    payload_vector = { iov_base = 0x0, iov_len = 0 },
    iobref = 0x0, request_info = 0x0, pending_vector = 0x0, pending_count = 0,
    total_bytes_read = 0, ra_read = 0, ra_max = 0, ra_served = 0, ra_buf = 0x0,
    fraghdr = 0, complete_record = 0 '\000', msg_type = CALL, record_state = SP_STATE_NADA },
  connected = 0 '\000', connect_failed = 0 '\000', bio = 0 '\000', connect_finish_log = 0 '\000',
  submit_log = 0 '\000', lowlat = 0 '\000', nodelay = 1 '\001', srvr_ssl = MGMT_SSL_NEVER,
  read_fail_log = true, ssl_enabled = false, mgmt_ssl = false, is_server = false,
  use_ssl = false, ssl_accepted = false, ssl_connected = false,
  ssl_context_created = false, accepted = false }
```

From this memory dump we can see the state of priv at the point where socket_connect() bails out. My guess is that the socket_private_t of the rpc_transport_t was not cleaned up after the previous unsuccessful attempt to connect to the server: because priv->sock >= 0, every subsequent reconnect attempt takes the branch above and fails.
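If the problem ever reappears, two low-impact ways to capture more evidence are raising the client log level (as above) and taking a statedump of the stuck fuse client; this is only a sketch, and the paths below are the usual defaults, which may differ per distribution:

```sh
# Make reconnect attempts visible in the fuse client log
# (for a mount on /mnt this is typically /var/log/glusterfs/mnt.log).
gluster volume set ctest diagnostics.client-log-level TRACE

# SIGUSR1 makes a gluster process write a statedump, including per-transport
# connection state, usually under /var/run/gluster/.
kill -USR1 25812   # PID of the glusterfs fuse process on node1
```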

jifengzhou avatar Jun 27 '22 03:06 jifengzhou

@jifengzhou Did you get a chance to check the maximum time taken by the client to establish a connection with the brick process? From the gdb output we can see that the connected flag is 0, which means the scenario arises during the initial connection. The client should log a message as soon as it gets an epoll event from the kernel.
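One way to estimate that time after the fact is to compare the timestamps of the disconnect/reconnect messages for the node3 subvolume in the fuse client log. A rough sketch follows; the subvolume name ctest-client-2 assumes node3 is the third brick, and the exact message text varies between Gluster versions:

```sh
# Pull connection-related messages for the node3 subvolume out of the fuse
# client log and look at the gap between "attempting reconnect" and any
# later "Connected to" message.
grep -E 'ctest-client-2|attempting reconnect|Connected to|disconnected' \
    /var/log/glusterfs/mnt.log | tail -n 100
```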

mohit84 avatar Sep 19 '22 11:09 mohit84

This phenomenon occurred only once in our environment and has not recurred since. We can no longer access the state of that environment.

jifengzhou avatar Sep 23 '22 07:09 jifengzhou

Thank you for your contributions. This issue has not had any activity in the last ~6 months, so we are marking it as stale. It will be closed in 2 weeks if no one responds with a comment here.

stale[bot] avatar May 21 '23 16:05 stale[bot]

Closing this issue as there has been no update since the last update on the issue. If this issue is still valid, feel free to reopen it.

stale[bot] avatar Jun 11 '23 09:06 stale[bot]