couchdb
cluster couchdb unstable (fabric_worker_timeout)
Hi everyone. We are randomly running into a problem on a 3-node CouchDB cluster. One of the nodes (always the same one), for no apparent reason, logs countless "fabric_worker_timeout" errors and is eventually excluded from the cluster until it is restarted. The logs and configuration are below. Thanks to everyone who wants to contribute.
configuration:
3-node cluster:
host440 --> OS: Ubuntu 18.04.5 LTS (Bionic Beaver) (kernel 5.4.0-1040-azure)
host430 --> OS: Ubuntu 18.04.5 LTS (Bionic Beaver) (kernel 5.4.0-1064-azure)
host410 --> OS: Ubuntu 18.04.4 LTS (Bionic Beaver) (kernel 5.4.0-1095-azure) (the node that crashes)
[vendor]
name = The Apache Software Foundation
[couchdb]
uuid =
database_dir = ./data
view_index_dir = ./data
max_dbs_open = 500
file_compression = snappy
attachment_stream_buffer_size = 4096
default_security = admin_only
changes_doc_ids_optimization_threshold = 100
default_engine = couch
[purge]
users_db_security_editable = false
[couchdb_engines]
couch = couch_bt_engine
[process_priority]
[cluster]
q=2
n=3
[chttpd]
port = 5984
bind_address = 127.0.0.1
backlog = 512
socket_options = [{sndbuf, 262144}, {nodelay, true}]
server_options = [{recbuf, undefined}]
require_valid_user = false
prefer_minimal = Cache-Control, Content-Length, Content-Range, Content-Type, ETag, Server, Transfer-Encoding, Vary
max_db_number_for_dbs_info_req = 100
[couch_peruser]
enable = false
delete_dbs = false
database_prefix = userdb-
[httpd]
port = 5986
bind_address = 127.0.0.1
authentication_handlers = {couch_httpd_auth, cookie_authentication_handler}, {couch_httpd_auth, default_authentication_handler}
secure_rewrites = true
allow_jsonp = false
socket_options = [{sndbuf, 262144}]
enable_cors = false
enable_xframe_options = false
[ssl]
port = 6984
[couch_httpd_auth]
authentication_db = _users
authentication_redirect = /_utils/session.html
require_valid_user = false
[csp]
enable = true
[cors]
credentials = false
[x_frame_options]
[native_query_servers]
[query_server_config]
reduce_limit = true
os_process_limit = 100
[mango]
[indexers]
couch_mrview = true
[feature_flags]
partitioned||* = true
[uuids]
algorithm = sequential
utc_id_suffix =
# Maximum number of UUIDs retrievable from /_uuids in a single request
max_count = 1000
[attachments]
compressible_types = text/*, application/javascript, application/json, application/xml
[replicator]
startup_jitter = 5000
max_jobs = 500
interval = 60000
max_churn = 20
worker_processes = 4
worker_batch_size = 500
http_connections = 20
connection_timeout = 30000
retries_per_request = 5
socket_options = [{keepalive, true}, {nodelay, false}]
verify_ssl_certificates = false
ssl_certificate_max_depth = 3
[log]
level = info
writer = stderr
[stats]
[smoosh.ratio_dbs]
min_priority = 2.0
[smoosh.ratio_views]
min_priority = 2.0
[ioq]
concurrency = 10
ratio = 0.01
[ioq.bypass]
os_process = true
read = true
write = true
view_update = true
shard_sync = false
compaction = false
[dreyfus]
[reshard]
--------------------------------------------------------------------
local.ini:
[couchdb]
[couch_peruser]
[chttpd]
[httpd]
[couch_httpd_auth]
[ssl]
[vhosts]
[admins]
[log]
file = /couchdblog/couchdb.log
level = info
-----------------------------------------------
logs host410:
[notice] 2023-09-22T07:27:37.992397Z [email protected] <0.28252.6837> 4d4c63b407 XXXXXXXX.xxxxx.com:5984 000.000.158.66 JJJJJJJJJJ GET /xxxxxxxxxxx_xxxxxxxxxxx-store/_design/sync/_view/pull?start_key=%222023-09-21T11%3A43%3A10.0900665Z%22&limit=2&include_docs=true 200 ok 6
[notice] 2023-09-22T07:27:38.108163Z [email protected] <0.32169.6869> 9d742bba86 XXXXXXXX.xxxxx.com:5984 000.000.158.66 JJJJJJJJJJ GET /xxxxxxxxxxx-YYYYYYYYY-data/_changes?include_docs=true&since=90688-g1AAAADteJzLYWBgYMpgTmFQSc4vTc5ISXJITqwqLUotLiwqMDEx0EvOy0jOSUnM1cvJT07MyQGpTmRIqv___39WBnMSA8OW37lAMXbLtMREQ0tTIo0hzb48FiDJ0ACk_sOtXesKtjbJIi0x1TKNSNOyAErHTyo&limit=1 200 ok 15
[info] 2023-09-22T07:27:38.132250Z [email protected] <0.23148.6894> -------- Starting index update for db: shards/80000000-ffffffff/xxxxxxxxxxx-YYYYYYYYY-data.1663606339 idx: _design/keys
[info] 2023-09-22T07:27:38.132420Z [email protected] <0.10215.6866> -------- Starting index update for db: shards/80000000-ffffffff/xxxxxxxxxxx-YYYYYYYYY-data.1663606339 idx: _design/crm365
[info] 2023-09-22T07:27:38.132533Z [email protected] <0.10286.6888> -------- Starting index update for db: shards/80000000-ffffffff/xxxxxxxxxxx-YYYYYYYYY-data.1663606339 idx: _design/qlikViewsCommonData
[notice] 2023-09-22T07:27:38.165317Z [email protected] <0.4683.6913> 2d9d36ea72 XXXXXXXX.xxxxx.com:5984 000.000.158.66 JJJJJJJJJJ GET /xxxxxxxxxxx-YYYYYYYYY-store/_design/sync/_view/pullCount?start_key=%222023-09-19T10%3A20%3A23.6650000Z%22 200 ok 4
[notice] 2023-09-22T07:27:38.166509Z [email protected] <0.28839.6862> 30a88e797e XXXXXXXX.xxxxx.com:5984 000.000.158.66 JJJJJJJJJJ GET /xxxxxxxxxxx-YYYYYYYYY-store/_design/sync/_view/pull?start_key=%222023-09-19T10%3A20%3A23.6650000Z%22&limit=2&include_docs=true 200 ok 5
[info] 2023-09-22T07:27:38.280964Z [email protected] <0.14685.6879> -------- Starting index update for db: shards/00000000-7fffffff/xxxxxxxxxxx-YYYYYYYYY-data.1663606339 idx: _design/keys
[info] 2023-09-22T07:27:38.281173Z [email protected] <0.5183.6871> -------- Starting index update for db: shards/00000000-7fffffff/xxxxxxxxxxx-YYYYYYYYY-data.1663606339 idx: _design/crm365
[info] 2023-09-22T07:27:38.281236Z [email protected] <0.9142.6883> -------- Starting index update for db: shards/00000000-7fffffff/xxxxxxxxxxx-YYYYYYYYY-data.1663606339 idx: _design/qlikViewsCommonData
[notice] 2023-09-22T07:27:38.420903Z [email protected] <0.2955.6877> 7135637ce5 XXXXXXXX.xxxxx.com:5984 000.000.158.66 JJJJJJJJJJ GET /xxxxxxxxxxl-YYYYYYYYY-store/_design/sync/_view/pullCount?start_key=%222023-09-22T06%3A35%3A46.5070000Z%22 200 ok 19
[info] 2023-09-22T07:27:38.672348Z [email protected] <0.10286.6888> -------- Index update finished for db: shards/80000000-ffffffff/xxxxxxxxxxx-YYYYYYYYY-data.1663606339 idx: _design/qlikViewsCommonData
[info] 2023-09-22T07:27:38.730669Z [email protected] <0.23148.6894> -------- Index update finished for db: shards/80000000-ffffffff/xxxxxxxxxxx-YYYYYYYYY-data.1663606339 idx: _design/keys
[notice] 2023-09-22T07:27:38.745247Z [email protected] <0.28451.6848> 0fb95ccdf2 XXXXXXXX.xxxxx.com:5984 000.000.158.4 JJJJJJJJJJ POST /xxxxxxxxxx-YYYYYYYYY/_all_docs?include_docs=true 200 ok 13
[info] 2023-09-22T07:27:38.755594Z [email protected] <0.10215.6866> -------- Index update finished for db: shards/80000000-ffffffff/xxxxxxxxxxx-YYYYYYYYY-data.1663606339 idx: _design/crm365
[info] 2023-09-22T07:27:38.922129Z [email protected] <0.9142.6883> -------- Index update finished for db: shards/00000000-7fffffff/xxxxxxxxxxx-YYYYYYYYY-data.1663606339 idx: _design/qlikViewsCommonData
[info] 2023-09-22T07:27:38.957962Z [email protected] <0.14685.6879> -------- Index update finished for db: shards/00000000-7fffffff/xxxxxxxxxxx-YYYYYYYYY-data.1663606339 idx: _design/keys
[info] 2023-09-22T07:27:38.970011Z [email protected] <0.5183.6871> -------- Index update finished for db: shards/00000000-7fffffff/xxxxxxxxxxx-YYYYYYYYY-data.1663606339 idx: _design/crm365
[notice] 2023-09-22T07:27:39.410205Z [email protected] <0.4969.6869> 880d8c7a31 XXXXXXXX.xxxxx.com:5984 000.000.158.66 JJJJJJJJJJ GET /XXXXXXXXX-YYYYYYYYY-data/_changes?include_docs=true&since=137250-g1AAAADteJzLYWBgYMpgTmFQSc4vTc5ISXJITqwqLUotLiwqMDE20EvOy0jOSUnM1cvJT07MyQGpTmRIqv___39WBnMSA6P0z1ygGHtyckpiqqk5kcaQZl8eC5BkaABS_2HWMvzRBFtrZmiSlGhpRKRpWQALtE5_&limit=1 200 ok 11
[notice] 2023-09-22T07:27:39.426000Z [email protected] <0.6036.6891> e5b57b4db7 XXXXXXXX.xxxxx.com:5984 000.000.158.66 JJJJJJJJJJ GET /XXXXXXXX-YYYYYYYYY-data/_changes?since=137250-g1AAAADteJzLYWBgYMpgTmFQSc4vTc5ISXJITqwqLUotLiwqMDE20EvOy0jOSUnM1cvJT07MyQGpTmRIqv___39WBnMSA6P0z1ygGHtyckpiqqk5kcaQZl8eC5BkaABS_2HWMvzRBFtrZmiSlGhpRKRpWQALtE5_ 200 ok 9
[notice] 2023-09-22T07:27:39.469679Z [email protected] <0.19949.6869> 85f9d1aa9d XXXXXXXX.xxxxx.com:5984 000.000.158.66 JJJJJJJJJJ GET /xxxxxxxx_xxxxxxxxxxx-store/_design/sync/_view/pull?start_key=%222023-09-21T17%3A42%3A46.0454235Z%22&limit=2&include_docs=true 200 ok 7
[notice] 2023-09-22T07:27:39.667451Z [email protected] <0.3686.6892> 61172e29d1 XXXXXXXX.xxxxx.com:5984 000.000.158.4 jjjjjjjjjj GET /xxxxxxxxx-YYYYYYYYY/c3f34b2d-f821-43e7-813c-5b9f7f3eab8e 200 ok 11
[error] 2023-09-22T07:28:39.683094Z [email protected] <0.23080.6890> a3de97af73 fabric_worker_timeout open_doc,'[email protected]',<<"shards/80000000-ffffffff/_users.1606227611">>
[error] 2023-09-22T07:28:39.683423Z [email protected] <0.23080.6890> a3de97af73 fabric_worker_timeout open_doc,'[email protected]',<<"shards/80000000-ffffffff/_users.1606227611">>
[error] 2023-09-22T07:28:40.114013Z [email protected] <0.32575.6885> -------- fabric_worker_timeout open_revs,'[email protected]',<<"shards/80000000-ffffffff/xxxxxxxxxxx_xxxxxxxxxxx-store.1624883181">>
[error] 2023-09-22T07:28:40.114018Z [email protected] <0.22781.6879> -------- fabric_worker_timeout open_doc,'[email protected]',<<"shards/80000000-ffffffff/xxxxxxxxxxx_xxxxxxxxxxx-store.1624883181">>
[error] 2023-09-22T07:28:40.114090Z [email protected] <0.22781.6879> -------- fabric_worker_timeout open_doc,'[email protected]',<<"shards/80000000-ffffffff/xxxxxxxxxxx_xxxxxxxxxxx-store.1624883181">>
[error] 2023-09-22T07:28:40.114101Z [email protected] <0.32575.6885> -------- fabric_worker_timeout open_revs,'[email protected]',<<"shards/80000000-ffffffff/xxxxxxxxxxx_xxxxxxxxxxx-store.1624883181">>
[error] 2023-09-22T07:28:41.085053Z [email protected] <0.16232.6867> -------- fabric_worker_timeout open_doc,'[email protected]',<<"shards/80000000-ffffffff/xxxxxxxxx_xxxxxxxxxxx-store.1661792183">>
[error] 2023-09-22T07:28:41.085144Z [email protected] <0.16232.6867> -------- fabric_worker_timeout open_doc,'[email protected]',<<"shards/80000000-ffffffff/xxxxxxxxx_xxxxxxxxxxx-store.1661792183">>
[error] 2023-09-22T07:28:41.085961Z [email protected] <0.9368.6911> -------- fabric_worker_timeout open_revs,'[email protected]',<<"shards/80000000-ffffffff/xxxxxxxxx_xxxxxxxxxxx-store.1661792183">>
[error] 2023-09-22T07:28:41.086001Z [email protected] <0.9368.6911> -------- fabric_worker_timeout open_revs,'[email protected]',<<"shards/80000000-ffffffff/xxxxxxxxx_xxxxxxxxxxx-store.1661792183">>
[error] 2023-09-22T07:28:41.310101Z [email protected] <0.31151.6895> -------- fabric_worker_timeout open_doc,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxx-YYYYYYYYY-store.1610742106">>
[error] 2023-09-22T07:28:41.310188Z [email protected] <0.31151.6895> -------- fabric_worker_timeout open_doc,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxx-YYYYYYYYY-store.1610742106">>
[error] 2023-09-22T07:28:41.311104Z [email protected] <0.5060.6887> -------- fabric_worker_timeout open_revs,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxx-YYYYYYYYY-store.1610742106">>
[error] 2023-09-22T07:28:41.311149Z [email protected] <0.5060.6887> -------- fabric_worker_timeout open_revs,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxx-YYYYYYYYY-store.1610742106">>
[error] 2023-09-22T07:28:41.610017Z [email protected] <0.27352.6879> -------- fabric_worker_timeout open_revs,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxxxx-YYYYYYYYY-data.1676992264">>
[error] 2023-09-22T07:28:41.610021Z [email protected] <0.22928.6893> -------- fabric_worker_timeout open_doc,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxxxx-YYYYYYYYY-data.1676992264">>
[error] 2023-09-22T07:28:41.610087Z [email protected] <0.22928.6893> -------- fabric_worker_timeout open_doc,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxxxx-YYYYYYYYY-data.1676992264">>
[error] 2023-09-22T07:28:41.610086Z [email protected] <0.27352.6879> -------- fabric_worker_timeout open_revs,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxxxx-YYYYYYYYY-data.1676992264">>
[error] 2023-09-22T07:28:42.052079Z [email protected] <0.22197.6887> -------- fabric_worker_timeout open_doc,'[email protected]',<<"shards/80000000-ffffffff/xxxxxxxxx-YYYYYYYYY.1610806297">>
[error] 2023-09-22T07:28:42.052164Z [email protected] <0.22197.6887> -------- fabric_worker_timeout open_doc,'[email protected]',<<"shards/80000000-ffffffff/xxxxxxxxx-YYYYYYYYY.1610806297">>
[warning] 2023-09-22T07:28:42.052435Z [email protected] <0.19983.6900> -------- Failed to get group_pid for "upgrade_views" <<"shards/00000000-7fffffff/xxxxxxxxx-YYYYYYYYY.1610806297">> <<"_design/QlikViews">>: timeout
[error] 2023-09-22T07:28:42.570144Z [email protected] <0.28181.6921> -------- fabric_worker_timeout open_doc,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxx-YYYYYYYYY-store.1610764356">>
[error] 2023-09-22T07:28:42.570236Z [email protected] <0.28181.6921> -------- fabric_worker_timeout open_doc,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxx-YYYYYYYYY-store.1610764356">>
[warning] 2023-09-22T07:28:42.570499Z [email protected] <0.26664.6885> -------- Failed to get group_pid for "upgrade_views" <<"shards/00000000-7fffffff/xxxxxxxxx-YYYYYYYYY-store.1610764356">> <<"_design/rmi">>: timeout
[error] 2023-09-22T07:28:42.660038Z [email protected] <0.1969.6843> -------- fabric_worker_timeout open_revs,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxx-YYYYYYYYY-data.1610752889">>
[error] 2023-09-22T07:28:42.660116Z [email protected] <0.1969.6843> -------- fabric_worker_timeout open_revs,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxx-YYYYYYYYY-data.1610752889">>
[error] 2023-09-22T07:28:42.679132Z [email protected] <0.18090.6868> -------- fabric_worker_timeout open_doc,'[email protected]',<<"shards/80000000-ffffffff/xxxxxxxxx-YYYYYYYYY-store.1610764356">>
[error] 2023-09-22T07:28:42.679193Z [email protected] <0.18090.6868> -------- fabric_worker_timeout open_doc,'[email protected]',<<"shards/80000000-ffffffff/xxxxxxxxx-YYYYYYYYY-store.1610764356">>
[warning] 2023-09-22T07:28:42.679331Z [email protected] <0.9142.6895> -------- Failed to get group_pid for "upgrade_views" <<"shards/00000000-7fffffff/xxxxxxxxx-YYYYYYYYY-store.1610764356">> <<"_design/eventTypeIndex">>: timeout
[error] 2023-09-22T07:28:42.875018Z [email protected] <0.8427.6884> -------- fabric_worker_timeout get_all_security,'[email protected]',<<"shards/80000000-ffffffff/xxxxxxxxx-YYYYYYYYY.1610766612">>
[error] 2023-09-22T07:28:42.875105Z [email protected] <0.8427.6884> -------- Error checking security objects for xxxxxxxxx-YYYYYYYYY :: {error,timeout}
[error] 2023-09-22T07:28:42.875222Z [email protected] <0.7152.6891> -------- fabric_worker_timeout get_all_security,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxx-YYYYYYYYY-store.1610764356">>
[error] 2023-09-22T07:28:42.875263Z [email protected] <0.7152.6891> -------- Error checking security objects for xxxxxxxxx-YYYYYYYYY-store :: {error,timeout}
[error] 2023-09-22T07:28:42.875921Z [email protected] <0.7143.6855> -------- fabric_worker_timeout get_all_security,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxx-YYYYYYYYY-store.1610764356">>
[error] 2023-09-22T07:28:42.875988Z [email protected] <0.7143.6855> -------- Error checking security objects for xxxxxxxxx-YYYYYYYYY-store :: {error,timeout}
[error] 2023-09-22T07:28:42.876188Z [email protected] <0.12119.6899> -------- fabric_worker_timeout get_all_security,'[email protected]',<<"shards/80000000-ffffffff/xxxxxxxxx-YYYYYYYYY.1610766612">>
[error] 2023-09-22T07:28:42.876243Z [email protected] <0.12119.6899> -------- Error checking security objects for xxxxxxxxx-YYYYYYYYY :: {error,timeout}
[error] 2023-09-22T07:28:43.202106Z [email protected] <0.19541.6913> -------- fabric_worker_timeout open_revs,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxx-YYYYYYYYY-data.1663603160">>
[error] 2023-09-22T07:28:43.202176Z [email protected] <0.19541.6913> -------- fabric_worker_timeout open_revs,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxx-YYYYYYYYY-data.1663603160">>
[error] 2023-09-22T07:28:43.377020Z [email protected] <0.1421.6852> -------- fabric_worker_timeout get_all_security,'[email protected]',<<"shards/80000000-ffffffff/xxxxxxxxxxx-YYYYYYYYY-data.1663606339">>
[error] 2023-09-22T07:28:43.377019Z [email protected] <0.6036.6893> -------- fabric_worker_timeout get_all_security,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxxxx-YYYYYYYYY-data.1663606339">>
[error] 2023-09-22T07:28:43.377019Z [email protected] <0.6516.6893> -------- fabric_worker_timeout get_all_security,'[email protected]',<<"shards/80000000-ffffffff/xxxxxxxxxxx-YYYYYYYYY-data.1663606339">>
[error] 2023-09-22T07:28:43.377087Z [email protected] <0.6516.6893> -------- Error checking security objects for xxxxxxxxxxx-YYYYYYYYY-data :: {error,timeout}
[error] 2023-09-22T07:28:43.377089Z [email protected] <0.1421.6852> -------- Error checking security objects for xxxxxxxxxxx-YYYYYYYYY-data :: {error,timeout}
[error] 2023-09-22T07:28:43.377088Z [email protected] <0.6036.6893> -------- Error checking security objects for xxxxxxxxxxx-YYYYYYYYY-data :: {error,timeout}
[error] 2023-09-22T07:28:43.377165Z [email protected] <0.11217.6895> -------- fabric_worker_timeout get_all_security,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxxxx-YYYYYYYYY-data.1663606339">>
[error] 2023-09-22T07:28:43.377214Z [email protected] <0.11217.6895> -------- Error checking security objects for xxxxxxxxxxx-YYYYYYYYY-data :: {error,timeout}
[error] 2023-09-22T07:28:43.553055Z [email protected] <0.3215.6870> -------- fabric_worker_timeout open_doc,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxx-YYYYYYYYY-data.1611910875">>
[error] 2023-09-22T07:28:43.553136Z [email protected] <0.3215.6870> -------- fabric_worker_timeout open_doc,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxx-YYYYYYYYY-data.1611910875">>
[error] 2023-09-22T07:28:43.553957Z [email protected] <0.23209.6871> -------- fabric_worker_timeout open_revs,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxx-YYYYYYYYY-data.1611910875">>
[error] 2023-09-22T07:28:43.554019Z [email protected] <0.23209.6871> -------- fabric_worker_timeout open_revs,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxx-YYYYYYYYY-data.1611910875">>
[error] 2023-09-22T07:28:43.678035Z [email protected] <0.29740.6885> -------- fabric_worker_timeout open_doc,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxxxx-YYYYYYYYY-data.1663606339">>
[error] 2023-09-22T07:28:43.678123Z [email protected] <0.29740.6885> -------- fabric_worker_timeout open_doc,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxxxx-YYYYYYYYY-data.1663606339">>
[warning] 2023-09-22T07:28:43.678293Z [email protected] <0.9559.6904> -------- Failed to get group_pid for "upgrade_views" <<"shards/80000000-ffffffff/xxxxxxxxxxx-YYYYYYYYY-data.1663606339">> <<"_design/qlikViewsCommonData">>: timeout
[error] 2023-09-22T07:28:43.732077Z [email protected] <0.32314.6900> -------- fabric_worker_timeout open_doc,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxxxx-YYYYYYYYY-data.1663606339">>
[error] 2023-09-22T07:28:43.732171Z [email protected] <0.32314.6900> -------- fabric_worker_timeout open_doc,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxxxx-YYYYYYYYY-data.1663606339">>
[warning] 2023-09-22T07:28:43.732337Z [email protected] <0.8810.6892> -------- Failed to get group_pid for "upgrade_views" <<"shards/80000000-ffffffff/xxxxxxxxxxx-YYYYYYYYY-data.1663606339">> <<"_design/keys">>: timeout
[error] 2023-09-22T07:28:43.758042Z [email protected] <0.5407.6881> -------- fabric_worker_timeout open_doc,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxxxx-YYYYYYYYY-data.1663606339">>
[error] 2023-09-22T07:28:43.758117Z [email protected] <0.5407.6881> -------- fabric_worker_timeout open_doc,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxxxx-YYYYYYYYY-data.1663606339">>
[warning] 2023-09-22T07:28:43.758194Z [email protected] <0.16313.6907> -------- Failed to get group_pid for "upgrade_views" <<"shards/80000000-ffffffff/xxxxxxxxxxx-YYYYYYYYY-data.1663606339">> <<"_design/crm365">>: timeout
[error] 2023-09-22T07:28:43.924122Z [email protected] <0.16596.6891> -------- fabric_worker_timeout open_doc,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxxxx-YYYYYYYYY-data.1663606339">>
[error] 2023-09-22T07:28:43.924215Z [email protected] <0.16596.6891> -------- fabric_worker_timeout open_doc,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxxxx-YYYYYYYYY-data.1663606339">>
[warning] 2023-09-22T07:28:43.924314Z [email protected] <0.7358.6885> -------- Failed to get group_pid for "upgrade_views" <<"shards/00000000-7fffffff/xxxxxxxxxxx-YYYYYYYYY-data.1663606339">> <<"_design/qlikViewsCommonData">>: timeout
[error] 2023-09-22T07:28:43.960068Z [email protected] <0.6482.6895> -------- fabric_worker_timeout open_doc,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxxxx-YYYYYYYYY-data.1663606339">>
- - - - -
[error] 2023-09-22T07:29:12.840359Z [email protected] <0.109.0> -------- ** Node '[email protected]' not responding **
** Removing (timedout) connection **
[error] 2023-09-22T07:29:12.840960Z [email protected] <0.111.0> -------- ** Node '[email protected]' not responding **
** Removing (timedout) connection **
[notice] 2023-09-22T07:29:12.843431Z [email protected] <0.23080.6890> a3de97af73 XXXXXXXX.xxxxx.com:5984 000.000.158.4 kkkkkkkkkkk GET /xxxxxxxxx_xxxxxxxxxxx-store/- 404 ok 93161
[notice] 2023-09-22T07:29:12.844397Z [email protected] <0.275.0> -------- rexi_server_mon : cluster unstable
[notice] 2023-09-22T07:29:12.844450Z [email protected] <0.275.0> -------- rexi_server_mon : cluster unstable
[notice] 2023-09-22T07:29:12.844491Z [email protected] <0.281.0> -------- rexi_server_mon : cluster unstable
[notice] 2023-09-22T07:29:12.844635Z [email protected] <0.281.0> -------- rexi_server_mon : cluster unstable
[error] 2023-09-22T07:29:12.847877Z [email protected] emulator -------- Error in process <0.16294.6892> on node '[email protected]' with exit value:
{{rexi_DOWN,{'[email protected]',noconnection}},[{mem3_rpc,rexi_call,3,[{file,"src/mem3_rpc.erl"},{line,394}]},{mem3_rep,calculate_start_seq,3,[{file,"src/mem3_rep.erl"},{line,402}]},{maps,'-map/2-lc$^0/1-0-',2,[{file,"maps.erl"},{line,232}]},{maps,map,2,[{file,"maps.erl"},{line,232}]},{mem3_rep,calculate_start_seq_multi,1,[{file,"src/mem3_rep.erl"},{line,390}]},{mem3_rep,repl,1,[{file,"src/mem3_rep.erl"},{line,292}]},{mem3_rep,go,1,[{file,"src/mem3_rep.erl"},{line,111}]},{mem3_sync,'-start_push_replication/1-fun-0-',2,[{file,"src/mem3_sync.erl"},{line,212}]}]}
[warning] 2023-09-22T07:29:12.848269Z [email protected] <0.344.0> -------- mem3_sync shards/80000000-ffffffff/xxxxxxxxxxx-YYYYYYYYY-data.1663606339 [email protected] {{rexi_DOWN,{'[email protected]',noconnection}},[{mem3_rpc,rexi_call,3,[{file,[115,114,99,47,109,101,109,51,95,114,112,99,46,101,114,108]},{line,394}]},{mem3_rep,calculate_start_seq,3,[{file,[115,114,99,47,109,101,109,51,95,114,101,112,46,101,114,108]},{line,402}]},{maps,'-map/2-lc$^0/1-0-',2,[{file,[109,97,112,115,46,101,114,108]},{line,232}]},{maps,map,2,[{file,[109,97,112,115,46,101,114,108]},{line,232}]},{mem3_rep,calculate_start_seq_multi,1,[{file,[115,114,99,47,109,101,109,51,95,114,101,112,46,101,114,108]},{line,390}]},{mem3_rep,repl,1,[{file,[115,114,99,47,109,101,109,51,95,114,101,112,46,101,114,108]},{line,292}]},{mem3_rep,go,1,[{file,[115,114,99,47,109,101,109,51,95,114,101,112,46,101,114,108]},{line,111}]},{mem3_sync,'-start_push_replication/1-fun-0-',2,[{file,[115,114,99,47,109,101,109,51,95,115,121,110,99,46,101,114,108]},{line,212}]}]}
[error] 2023-09-22T07:29:12.848335Z [email protected] emulator -------- Error in process <0.32653.6885> on node '[email protected]' with exit value:
{{rexi_DOWN,{'[email protected]',noconnection}},[{mem3_rpc,rexi_call,3,[{file,"src/mem3_rpc.erl"},{line,394}]},{mem3_rep,calculate_start_seq,3,[{file,"src/mem3_rep.erl"},{line,402}]},{maps,'-map/2-lc$^0/1-0-',2,[{file,"maps.erl"},{line,232}]},{maps,map,2,[{file,"maps.erl"},{line,232}]},{mem3_rep,calculate_start_seq_multi,1,[{file,"src/mem3_rep.erl"},{line,390}]},{mem3_rep,repl,1,[{file,"src/mem3_rep.erl"},{line,292}]},{mem3_rep,go,1,[{file,"src/mem3_rep.erl"},{line,111}]},{mem3_sync,'-start_push_replication/1-fun-0-',2,[{file,"src/mem3_sync.erl"},{line,212}]}]}
[error] 2023-09-22T07:29:12.848484Z [email protected] emulator -------- Error in process <0.7845.6894> on node '[email protected]' with exit value:
{{rexi_DOWN,{'[email protected]',noconnection}},[{mem3_rpc,rexi_call,3,[{file,"src/mem3_rpc.erl"},{line,394}]},{mem3_rep,calculate_start_seq,3,[{file,"src/mem3_rep.erl"},{line,402}]},{maps,'-map/2-lc$^0/1-0-',2,[{file,"maps.erl"},{line,232}]},{maps,map,2,[{file,"maps.erl"},{line,232}]},{mem3_rep,calculate_start_seq_multi,1,[{file,"src/mem3_rep.erl"},{line,390}]},{mem3_rep,repl,1,[{file,"src/mem3_rep.erl"},{line,292}]},{mem3_rep,go,1,[{file,"src/mem3_rep.erl"},{line,111}]},{mem3_sync,'-start_push_replication/1-fun-0-',2,[{file,"src/mem3_sync.erl"},{line,212}]}]}
[error] 2023-09-22T07:29:12.848643Z [email protected] emulator -------- Error in process <0.20330.6882> on node '[email protected]' with exit value:
{{rexi_DOWN,{'[email protected]',noconnection}},[{mem3_rpc,rexi_call,3,[{file,"src/mem3_rpc.erl"},{line,394}]},{mem3_rep,calculate_start_seq,3,[{file,"src/mem3_rep.erl"},{line,402}]},{maps,'-map/2-lc$^0/1-0-',2,[{file,"maps.erl"},{line,232}]},{maps,map,2,[{file,"maps.erl"},{line,232}]},{mem3_rep,calculate_start_seq_multi,1,[{file,"src/mem3_rep.erl"},{line,390}]},{mem3_rep,repl,1,[{file,"src/mem3_rep.erl"},{line,292}]},{mem3_rep,go,1,[{file,"src/mem3_rep.erl"},{line,111}]},{mem3_sync,'-start_push_replication/1-fun-0-',2,[{file,"src/mem3_sync.erl"},{line,212}]}]}
[error] 2023-09-22T07:29:12.848845Z [email protected] emulator -------- Error in process <0.26222.6868> on node '[email protected]' with exit value:
{{rexi_DOWN,{'[email protected]',noconnection}},[{mem3_rpc,rexi_call,3,[{file,"src/mem3_rpc.erl"},{line,394}]},{mem3_rep,calculate_start_seq,3,[{file,"src/mem3_rep.erl"},{line,402}]},{maps,'-map/2-lc$^0/1-0-',2,[{file,"maps.erl"},{line,232}]},{maps,map,2,[{file,"maps.erl"},{line,232}]},{mem3_rep,calculate_start_seq_multi,1,[{file,"src/mem3_rep.erl"},{line,390}]},{mem3_rep,repl,1,[{file,"src/mem3_rep.erl"},{line,292}]},{mem3_rep,go,1,[{file,"src/mem3_rep.erl"},{line,111}]},{mem3_sync,'-start_push_replication/1-fun-0-',2,[{file,"src/mem3_sync.erl"},{line,212}]}]}
[error] 2023-09-22T07:29:12.849003Z [email protected] emulator -------- Error in process <0.27307.6863> on node '[email protected]' with exit value:
{{rexi_DOWN,{'[email protected]',noconnection}},[{mem3_rpc,rexi_call,3,[{file,"src/mem3_rpc.erl"},{line,394}]},{mem3_rep,calculate_start_seq,3,[{file,"src/mem3_rep.erl"},{line,402}]},{maps,'-map/2-lc$^0/1-0-',2,[{file,"maps.erl"},{line,232}]},{maps,map,2,[{file,"maps.erl"},{line,232}]},{mem3_rep,calculate_start_seq_multi,1,[{file,"src/mem3_rep.erl"},{line,390}]},{mem3_rep,repl,1,[{file,"src/mem3_rep.erl"},{line,292}]},{mem3_rep,go,1,[{file,"src/mem3_rep.erl"},{line,111}]},{mem3_sync,'-start_push_replication/1-fun-0-',2,[{file,"src/mem3_sync.erl"},{line,212}]}]}
[error] 2023-09-22T07:29:12.849206Z [email protected] emulator -------- Error in process <0.8719.6878> on node '[email protected]' with exit value:
{{rexi_DOWN,{'[email protected]',noconnection}},[{mem3_rpc,rexi_call,3,[{file,"src/mem3_rpc.erl"},{line,394}]},{mem3_rep,calculate_start_seq,3,[{file,"src/mem3_rep.erl"},{line,402}]},{maps,'-map/2-lc$^0/1-0-',2,[{file,"maps.erl"},{line,232}]},{maps,map,2,[{file,"maps.erl"},{line,232}]},{mem3_rep,calculate_start_seq_multi,1,[{file,"src/mem3_rep.erl"},{line,390}]},{mem3_rep,repl,1,[{file,"src/mem3_rep.erl"},{line,292}]},{mem3_rep,go,1,[{file,"src/mem3_rep.erl"},{line,111}]},{mem3_sync,'-start_push_replication/1-fun-0-',2,[{file,"src/mem3_sync.erl"},{line,212}]}]}
[error] 2023-09-22T07:29:12.849328Z [email protected] emulator -------- Error in process <0.2467.6889> on node '[email protected]' with exit value:
{{rexi_DOWN,{'[email protected]',noconnection}},[{mem3_rpc,rexi_call,3,[{file,"src/mem3_rpc.erl"},{line,394}]},{mem3_rep,calculate_start_seq,3,[{file,"src/mem3_rep.erl"},{line,402}]},{maps,'-map/2-lc$^0/1-0-',2,[{file,"maps.erl"},{line,232}]},{maps,map,2,[{file,"maps.erl"},{line,232}]},{mem3_rep,calculate_start_seq_multi,1,[{file,"src/mem3_rep.erl"},{line,390}]},{mem3_rep,repl,1,[{file,"src/mem3_rep.erl"},{line,292}]},{mem3_rep,go,1,[{file,"src/mem3_rep.erl"},{line,111}]},{mem3_sync,'-start_push_replication/1-fun-0-',2,[{file,"src/mem3_sync.erl"},{line,212}]}]}
[notice] 2023-09-22T07:29:12.850012Z [email protected] <0.274.0> -------- rexi_server : cluster unstable
[notice] 2023-09-22T07:29:12.850014Z [email protected] <0.280.0> -------- rexi_buffer : cluster unstable
[warning] 2023-09-22T07:29:12.854631Z [email protected] <0.344.0> -------- mem3_sync shards/00000000-7fffffff/xxxxxxxxx-YYYYYYYYY-store.1610764356 [email protected] {{rexi_DOWN,{'[email protected]',noconnection}},[{mem3_rpc,rexi_call,3,[{file,[115,114,99,47,109,101,109,51,95,114,112,99,46,101,114,108]},{line,394}]},{mem3_rep,calculate_start_seq,3,[{file,[115,114,99,47,109,101,109,51,95,114,101,112,46,101,114,108]},{line,402}]},{maps,'-map/2-lc$^0/1-0-',2,[{file,[109,97,112,115,46,101,114,108]},{line,232}]},{maps,map,2,[{file,[109,97,112,115,46,101,114,108]},{line,232}]},{mem3_rep,calculate_start_seq_multi,1,[{file,[115,114,99,47,109,101,109,51,95,114,101,112,46,101,114,108]},{line,390}]},{mem3_rep,repl,1,[{file,[115,114,99,47,109,101,109,51,95,114,101,112,46,101,114,108]},{line,292}]},{mem3_rep,go,1,[{file,[115,114,99,47,109,101,109,51,95,114,101,112,46,101,114,108]},{line,111}]},{mem3_sync,'-start_push_replication/1-fun-0-',2,[{file,[115,114,99,47,109,101,109,51,95,115,121,110,99,46,101,114,108]},{line,212}]}]}
[notice] 2023-09-22T07:29:12.856634Z [email protected] <0.280.0> -------- rexi_buffer : cluster unstable
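One way to confirm from the outside which node has dropped out is to compare the two lists returned by CouchDB's `/_membership` endpoint on a node that is still healthy. The sketch below is only an illustration under assumptions: the helper names `fetch_membership` and `membership_gap` are made up for this example, the credentials are placeholders, and it relies on my understanding that `all_nodes` reflects the nodes currently connected while `cluster_nodes` reflects the configured cluster members.

```python
import base64
import json
import urllib.request


def fetch_membership(base_url, user, password):
    """GET /_membership from one node and return the decoded JSON body.

    base_url is e.g. "http://127.0.0.1:5984" (port and bind address as in
    the [chttpd] section above); user/password are placeholder admin
    credentials, not values from this post.
    """
    request = urllib.request.Request(base_url.rstrip("/") + "/_membership")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", "Basic " + token)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def membership_gap(membership):
    """Return nodes listed in cluster_nodes but absent from all_nodes.

    Assumption: a node that is configured as a cluster member but is not
    in all_nodes is currently not connected to the node that answered.
    """
    connected = set(membership.get("all_nodes", []))
    expected = membership.get("cluster_nodes", [])
    return sorted(node for node in expected if node not in connected)
```

Run on host430 or host440 during an incident, something like `membership_gap(fetch_membership("http://127.0.0.1:5984", "admin", "secret"))` would be expected to list the node on host410 if it really has been disconnected; an empty list would suggest the nodes are still connected and the timeouts have another cause.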
--------------------------------------------------------------------------------
On the other nodes, timeouts toward the faulty node are logged:
(these logs were collected during another, identical incident)
logs host430:
[error] 2023-09-18T14:12:54.048086Z [email protected] <0.27095.6476> -------- fabric_worker_timeout get_all_security,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxx_ie-YYYYYYYYY-data.1632235447">>
[error] 2023-09-18T14:12:54.048143Z [email protected] <0.27095.6476> -------- Error checking security objects for xxxxxxxxx_ie-YYYYYYYYY-data :: {error,timeout}
[notice] 2023-09-18T14:12:54.069795Z [email protected] <0.4394.6527> 7889fcfd4f XXXXXXXX.xxxxx.com:5984 000.000.158.4 kkkkkkkkkkk GET /xxxxxxxxx-YYYYYYYYY/6a422c42-8f33-46bc-8b78-e4dbc6b7bad7 200 ok 11
[notice] 2023-09-18T14:12:54.083987Z [email protected] <0.10204.6526> 13a99de03a XXXXXXXX.xxxxx.com:5984 000.000.158.129 kkkkkkkkkkk GET /xxxxxxxxx-YYYYYYYYY-data/_design/keys/_view/key_%5Bdoctype_code%5D?include_docs=true&reduce=false&group=false&key=%5B%22Owner%22%2C%22fk981%40kkkkkkkk%22%5D 200 ok 3
[notice] 2023-09-18T14:12:54.120732Z [email protected] <0.10204.6526> fd649996ef XXXXXXXX.xxxxx.com:5984 000.000.158.129 kkkkkkkkkkk GET /xxxxxxxxx-YYYYYYYYY-data/_design/keys/_view/key_%5Bdoctype_market%5D?include_docs=true&reduce=false&group=false&key=%5B%22ApplicationSettings%22%2C%22cs_ag_fr%22%5D 200 ok 6
[notice] 2023-09-18T14:12:54.137062Z [email protected] <0.10204.6526> 50852240e0 XXXXXXXX.xxxxx.com:5984 000.000.158.129 kkkkkkkkkkk GET /xxxxxxxxx-YYYYYYYYY-data/_design/keys/_view/key_%5Bdoctype_code%5D?include_docs=true&reduce=false&group=false&key=%5B%22Owner%22%2C%22fk981%40kkkkkkkk%22%5D 200 ok 4
[notice] 2023-09-18T14:12:54.142260Z [email protected] <0.10204.6526> 45b9ffec1d XXXXXXXX.xxxxx.com:5984 000.000.158.129 kkkkkkkkkkk GET /xxxxxxxxx-YYYYYYYYY-data/_design/keys/_view/key_%5Bdoctype_market%5D?include_docs=true&reduce=false&group=false&key=%5B%22DealershipDefaults%22%2C%22cs_ag_fr%22%5D 200 ok 3
[error] 2023-09-18T14:12:54.560244Z [email protected] <0.1421.6484> -------- fabric_worker_timeout get_all_security,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxx-YYYYYYYYY-store.1663607213">>
[error] 2023-09-18T14:12:54.560250Z [email protected] <0.4464.6531> -------- fabric_worker_timeout get_all_security,'[email protected]',<<"shards/80000000-ffffffff/xxxxxxxxx-YYYYYYYYY-store.1615278664">>
[error] 2023-09-18T14:12:54.560314Z [email protected] <0.4464.6531> -------- Error checking security objects for xxxxxxxxx-YYYYYYYYY-store :: {error,timeout}
[error] 2023-09-18T14:12:54.560318Z [email protected] <0.1421.6484> -------- Error checking security objects for xxxxxxxxx-YYYYYYYYY-store :: {error,timeout}
[error] 2023-09-18T14:12:55.061137Z [email protected] <0.14474.6506> -------- fabric_worker_timeout get_all_security,'[email protected]',<<"shards/80000000-ffffffff/xxxxxxxxx-YYYYYYYYY.1663607217">>
[error] 2023-09-18T14:12:55.061197Z [email protected] <0.14474.6506> -------- Error checking security objects for xxxxxxxxx-YYYYYYYYY :: {error,timeout}
[error] 2023-09-18T14:12:55.562152Z [email protected] <0.9122.6522> -------- fabric_worker_timeout get_all_security,'[email protected]',<<"shards/80000000-ffffffff/xxxxxxxxx-YYYYYYYYY-store.1610805549">>
[error] 2023-09-18T14:12:55.562213Z [email protected] <0.9122.6522> -------- Error checking security objects for xxxxxxxxx-YYYYYYYYY-store :: {error,timeout}
[error] 2023-09-18T14:12:55.616163Z [email protected] <0.2246.6530> -------- fabric_worker_timeout get_all_security,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxx-YYYYYYYYY.1610763551">>
[error] 2023-09-18T14:12:55.616222Z [email protected] <0.2246.6530> -------- Error checking security objects for xxxxxxxxx-YYYYYYYYY :: {error,timeout}
[error] 2023-09-18T14:12:56.753099Z [email protected] <0.10599.6516> 9c4d71f25f fabric_worker_timeout changes,'[email protected]',<<"shards/00000000-7fffffff/m_internal_xxxx_be-YYYYYYYYY-data.1676298639">>
[error] 2023-09-18T14:12:56.753173Z [email protected] <0.10599.6516> 9c4d71f25f fabric_worker_timeout changes,'[email protected]',<<"shards/80000000-ffffffff/m_internal_xxxx_be-YYYYYYYYY-data.1676298639">>
-------------------------------------------------------------------------------------
Logs from host440:
[notice] 2023-09-18T14:12:53.584893Z [email protected] <0.26553.6568> de51555331 XXXXXXXX.xxxxx.com:5984 000.000.158.129 kkkkkkkkkkk GET /xxxxxxxxx-YYYYYYYYY-data/_design/keys/_view/key_%5Bdoctype_code%5D?include_docs=true&reduce=false&group=false&key=%5B%22Owner%22%2C%22fk981%40kkkkkkkk%22%5D 200 ok 4
[error] 2023-09-18T14:12:53.989756Z [email protected] <0.12478.6534> -------- fabric_worker_timeout get_all_security,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxx_ie-YYYYYYYYY-data.1632235447">>
[error] 2023-09-18T14:12:53.989855Z [email protected] <0.12478.6534> -------- Error checking security objects for xxxxxxxxx_ie-YYYYYYYYY-data :: {error,timeout}
[notice] 2023-09-18T14:12:54.064578Z [email protected] <0.15164.6581> 9b4bf04f7b XXXXXXXX.xxxxx.com:5984 000.000.158.4 kkkkkkkkkkk GET /XXXXXXXXX-YYYYYYYYY-data/_design/keys/_view/key_%5Bdoctype_code%5D?include_docs=true&reduce=false&group=false&key=%5B%22Owner%22%2C%22xw388%40kkkkkkkk%22%5D 200 ok 5
[error] 2023-09-18T14:12:54.492923Z [email protected] <0.24825.6553> -------- fabric_worker_timeout get_all_security,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxx-YYYYYYYYY-store.1663607213">>
[error] 2023-09-18T14:12:54.493021Z [email protected] <0.24825.6553> -------- Error checking security objects for xxxxxxxxx-YYYYYYYYY-store :: {error,timeout}
[error] 2023-09-18T14:12:54.493568Z [email protected] <0.19846.6541> -------- fabric_worker_timeout get_all_security,'[email protected]',<<"shards/80000000-ffffffff/xxxxxxxxx-YYYYYYYYY-store.1615278664">>
[error] 2023-09-18T14:12:54.493635Z [email protected] <0.19846.6541> -------- Error checking security objects for xxxxxxxxx-YYYYYYYYY-store :: {error,timeout}
[error] 2023-09-18T14:12:54.994886Z [email protected] <0.28372.6543> -------- fabric_worker_timeout get_all_security,'[email protected]',<<"shards/80000000-ffffffff/xxxxxxxxx-YYYYYYYYY.1663607217">>
[error] 2023-09-18T14:12:54.995001Z [email protected] <0.28372.6543> -------- Error checking security objects for xxxxxxxxx-YYYYYYYYY :: {error,timeout}
[error] 2023-09-18T14:12:55.495646Z [email protected] <0.29680.6524> -------- fabric_worker_timeout get_all_security,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxx-YYYYYYYYY.1610763551">>
[error] 2023-09-18T14:12:55.495662Z [email protected] <0.15438.6523> -------- fabric_worker_timeout get_all_security,'[email protected]',<<"shards/80000000-ffffffff/xxxxxxxxx-YYYYYYYYY-store.1610805549">>
[error] 2023-09-18T14:12:55.495718Z [email protected] <0.15438.6523> -------- Error checking security objects for xxxxxxxxx-YYYYYYYYY-store :: {error,timeout}
[error] 2023-09-18T14:12:55.495718Z [email protected] <0.29680.6524> -------- Error checking security objects for xxxxxxxxx-YYYYYYYYY :: {error,timeout}
[error] 2023-09-18T14:12:56.998821Z [email protected] <0.25052.6527> -------- fabric_worker_timeout get_all_security,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxx-YYYYYYYYY-data.1610744998">>
[error] 2023-09-18T14:12:56.999006Z [email protected] <0.25052.6527> -------- Error checking security objects for xxxxxxxxx-YYYYYYYYY-data :: {error,timeout}
[error] 2023-09-18T14:12:56.999705Z [email protected] <0.9010.6590> -------- fabric_worker_timeout get_all_security,'[email protected]',<<"shards/80000000-ffffffff/xxxxxxxxx-YYYYYYYYY-data.1610744998">>
[error] 2023-09-18T14:12:56.999788Z [email protected] <0.9010.6590> -------- Error checking security objects for xxxxxxxxx-YYYYYYYYY-data :: {error,timeout}
[error] 2023-09-18T14:12:58.219781Z [email protected] <0.4871.6510> 6410b44cd3 fabric_worker_timeout changes,'[email protected]',<<"shards/00000000-7fffffff/xxxxxxxxx-YYYYYYYYY-data.1610752364">>
[error] 2023-09-18T14:12:58.219873Z [email protected] <0.4871.6510> 6410b44cd3 fabric_worker_timeout changes,'[email protected]',<<"shards/80000000-ffffffff/xxxxxxxxx-YYYYYYYYY-data.1610752364">>
[notice] 2023-09-18T14:12:58.220467Z [email protected] <0.4871.6510> 6410b44cd3 XXXXXXXX.xxxxx.com:5984 000.000.158.66 kkkkkkkkkkk GET /xxxxxxxxx-YYYYYYYYY-data/_changes?include_docs=true&since=586856-g1AAAADteJzLYWBgYMpgTmFQSc4vTc5ISXJITqwqLUotLiwqMDE00EvOy0jOSUnM1cvJT07MyQGpTmRI```
It seems like the nodes are having a hard time staying connected. Is network connectivity between the cluster nodes fairly stable?
Try upgrading to the latest Apache CouchDB, as it includes a module that periodically forces re-connection between nodes. The setting is `[cluster] reconnect_interval_sec = 37`. You could lower it, for instance to 5 seconds or so.
There is also `[fabric] request_timeout = 60000` (in milliseconds). You could try increasing or decreasing the value to see what effect it has.
Hint: you can use three backticks to enclose logs so they are rendered as pre-formatted text.
```
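As a rough sketch, the two settings mentioned in this reply could be placed in `local.ini` as shown below. The values `5` and `120000` are illustrative assumptions for experimentation only, not recommendations, and `reconnect_interval_sec` only has an effect on a CouchDB release that ships the periodic re-connection module.

```ini
; local.ini (sketch; values are illustrative assumptions, not recommendations)

[cluster]
; Existing values from the posted configuration stay as they are.
q = 2
n = 3
; Interval of the periodic forced re-connection, in seconds.
; 37 is the value quoted in the reply; 5 is the suggested lower value.
reconnect_interval_sec = 5

[fabric]
; Timeout for fabric requests, in milliseconds.
; 60000 is the value quoted in the reply; 120000 is an arbitrary higher value to test with.
request_timeout = 120000
```

It probably makes sense to apply the same values on all three nodes and change one setting at a time, so that any effect on the `fabric_worker_timeout` errors can be attributed to a specific change.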
Hi Nickva, thanks for replying. The tip about the three backticks is excellent. 👍 How can you explain the fact that it's always the same node that crashes?
@alessio-congedo
> How can you explain the fact that it's always the same node that crashes?
Not sure I can explain that from the logs. It seems like other nodes also have connectivity issues and timeouts. I see some disconnects between 440 <-> 410:
[email protected] <0.32575.6885> --------
fabric_worker_timeout open_revs,'[email protected]',
<<"shards/80000000-ffffffff/xxxxxxxxxxx_xxxxxxxxxxx-store.1624883181">>
See if it helps to also add an ioq bypass for `shard_sync = false` and to increase the fabric request timeouts: https://github.com/apache/couchdb/blob/main/rel/overlay/etc/default.ini#L374-L376
If you're running in a Kubernetes environment, check that every container has the same, and sufficient, CPU and disk I/O resources/bandwidth available.
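A minimal sketch of how these two suggestions might look in `local.ini`, assuming the `[ioq.bypass]` section and the `[fabric]` timeout keys as they appear in `default.ini`. The key names `all_docs_timeout` and `attachments_timeout` and all numeric values are assumptions here; check them against the `default.ini` of the installed version before using them.

```ini
; local.ini (sketch; verify key names against your version's default.ini)

[ioq.bypass]
; The ioq bypass setting named in the reply above.
shard_sync = false

[fabric]
; Timeouts in milliseconds; the values are illustrative assumptions only.
request_timeout = 120000
all_docs_timeout = 20000
attachments_timeout = 120000
```

Raising timeouts may only hide an underlying resource or network problem on the affected node, so it is worth comparing CPU and disk I/O on host410 with the other two nodes while testing.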