
Replication issue while sharding with Buddy

Open donhardman opened this issue 1 year ago • 8 comments

When we use 5 nodes and create a sharded table with rf=3 and shards=60, we hit a race-condition crash at a random point in time.

Here are the logs I captured from an actual occurrence:

logs.zip

To reproduce it, we can run with CLT

clt/clt test -d -t test/clt-tests/sharding/replication/create-table-5-cluster-60-shards.rec ghcr.io/manticoresoftware/manticoresearch:test-kit-latest; echo $?

I also added some test cases for replication but failed to reproduce the issue with them. Maybe the actual logs will give more information about why the cluster becomes non-primary.

donhardman avatar Oct 05 '23 07:10 donhardman

I used a bash script to set it up manually:

#!/usr/bin/env bash
# Stop any running daemons and wipe the previous state
pgrep -f searchd | xargs kill -9
sleep 2

rm -fr /var/{run,lib,log}/manticore-{1,2,3,4,5}
mkdir -p /var/{run,lib,log}/manticore-{1,2,3,4,5}

# Start five daemons, one per config, with replication logging enabled
for i in 1 2 3 4 5; do
  searchd -c test/clt-tests/sharding/base/config/searchd-$i.conf --logreplication
done

# Create the cluster on node 1 and join the remaining four nodes to it
mysql -h0 -P19306 -e "create cluster c"
for i in 2 3 4 5; do
  mysql -h0 -P${i}9306 -e "join cluster c at '127.0.0.1:19312'"
done
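
To make sure all five nodes actually joined the cluster before creating the sharded table, a quick check could look like this (a sketch based on the port layout above, not part of the original script):

for i in 1 2 3 4 5; do
  mysql -h0 -P${i}9306 -e "show status like 'cluster_c%'"
done

Each node should report cluster_c_status as primary and cluster_c_node_state as synced, as in the healthy outputs shown later in this thread.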

After that, when we create a table with sharding, we hit the issue:

create table c:t shards=60 rf=3

Sample config:

common {
	plugin_dir = /usr/local/lib/manticore
	lemmatizer_base = /usr/share/manticore/morph/
}
searchd {
	listen = 19306:mysql41
	listen = 19312
	listen = 19308:http
	log = /var/log/manticore-1/searchd.log
	query_log = /var/log/manticore-1/query.log
	pid_file = /var/run/manticore-1/searchd.pid
	data_dir = /var/lib/manticore-1
	query_log_format = sphinxql
	query_log_commands = 1
	buddy_path = manticore-executor /workdir/src/main.php
}
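
The config above is for node 1; the other four nodes presumably follow the same pattern with the node index substituted into the ports and paths. For example, node 2's searchd section would look roughly like this (a sketch reconstructed from the log output below, not copied from the actual test configs):

searchd {
    listen = 29306:mysql41
    listen = 29312
    listen = 29308:http
    log = /var/log/manticore-2/searchd.log
    query_log = /var/log/manticore-2/query.log
    pid_file = /var/run/manticore-2/searchd.pid
    data_dir = /var/lib/manticore-2
    query_log_format = sphinxql
    query_log_commands = 1
    buddy_path = manticore-executor /workdir/src/main.php
}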

Buddy must be used from the fix/sharding branch and installed manually with all Composer packages.

donhardman avatar Oct 05 '23 07:10 donhardman

I just discovered that this case is probably covered by the following test:

clt/clt test -d -t test/clt-tests/sharding/cluster/attach-table-while-joining-cluster.rec ghcr.io/manticoresoftware/manticoresearch:test-kit-latest; echo $?

https://github.com/manticoresoftware/manticoresearch/actions/runs/6416223732/job/17420158212

P.S. It doesn't fail 100% of the time, but it fails often.

- 0
+ ERROR 2013 (HY000) at line 1: Lost connection to MySQL server during query
+ 0
––– input –––
for i in 1 2 3; do tail -n 100 /var/log/manticore-${i}/searchd.log; done
––– output –––
+ [Thu Oct  5 08:28:58.167 2023] [565] watchdog: main process 566 forked ok
+ [Thu Oct  5 08:28:58.168 2023] [566] starting daemon version '6.2.13 fc0b860f3@231002 dev (columnar 2.2.5 b8be4eb@230928) (secondary 2.2.5 b8be4eb@230928)' ...
+ [Thu Oct  5 08:28:58.169 2023] [566] listening on all interfaces for mysql, port=19306
+ [Thu Oct  5 08:28:58.169 2023] [566] listening on all interfaces for sphinx and http(s), port=19312
+ [Thu Oct  5 08:28:58.169 2023] [566] listening on all interfaces for sphinx and http(s), port=19308
+ [Thu Oct  5 08:28:58.191 2023] [571] prereading 0 tables
+ [Thu Oct  5 08:28:58.191 2023] [571] preread 0 tables in 0.000 sec
+ [Thu Oct  5 08:28:58.193 2023] [566] accepting connections
+ [Thu Oct  5 08:28:58.207 2023] [577] [BUDDY] started v1.1.1 'manticore-executor /workdir/src/main.php --listen=http://0.0.0.0:19312  --threads=12' at http://127.0.0.1:38375
+ [Thu Oct  5 08:28:58.263 2023] [573] [BUDDY] Loaded plugins:
+ [Thu Oct  5 08:28:58.263 2023] [573] [BUDDY]   core: empty-string, backup, emulate-elastic, distributed-insert, insert, select, show, cli-table, plugin, test, insert-mva, create-table
+ [Thu Oct  5 08:28:58.263 2023] [573] [BUDDY]   local:
+ [Thu Oct  5 08:28:58.263 2023] [573] [BUDDY]   extra:
+ [Thu Oct  5 08:29:02.066 2023] [692] WARNING: 3c9f4089 sending install message failed: Resource temporarily unavailable
+ [Thu Oct  5 08:29:02.566 2023] [693] WARNING: Member 2.0 (node_127.0.0.1_a_658) requested state transfer from '*any*', but it is impossible to select State Transfer donor: Resource temporarily unavailable
+ [Thu Oct  5 08:28:59.281 2023] [607] watchdog: main process 608 forked ok
+ [Thu Oct  5 08:28:59.282 2023] [608] starting daemon version '6.2.13 fc0b860f3@231002 dev (columnar 2.2.5 b8be4eb@230928) (secondary 2.2.5 b8be4eb@230928)' ...
+ [Thu Oct  5 08:28:59.282 2023] [608] listening on all interfaces for mysql, port=29306
+ [Thu Oct  5 08:28:59.282 2023] [608] listening on all interfaces for sphinx and http(s), port=29312
+ [Thu Oct  5 08:28:59.282 2023] [608] listening on all interfaces for sphinx and http(s), port=29308
+ [Thu Oct  5 08:28:59.304 2023] [613] prereading 0 tables
+ [Thu Oct  5 08:28:59.304 2023] [613] preread 0 tables in 0.000 sec
+ [Thu Oct  5 08:28:59.306 2023] [608] accepting connections
+ [Thu Oct  5 08:28:59.320 2023] [619] [BUDDY] started v1.1.1 'manticore-executor /workdir/src/main.php --listen=http://0.0.0.0:29312  --threads=12' at http://127.0.0.1:36047
+ [Thu Oct  5 08:28:59.377 2023] [611] [BUDDY] Loaded plugins:
+ [Thu Oct  5 08:28:59.377 2023] [611] [BUDDY]   core: empty-string, backup, emulate-elastic, distributed-insert, insert, select, show, cli-table, plugin, test, insert-mva, create-table
+ [Thu Oct  5 08:28:59.377 2023] [611] [BUDDY]   local:
+ [Thu Oct  5 08:28:59.377 2023] [611] [BUDDY]   extra:
+ [Thu Oct  5 08:29:02.574 2023] [710] WARNING: Member 2.0 (node_127.0.0.1_a_658) requested state transfer from '*any*', but it is impossible to select State Transfer donor: Resource temporarily unavailable
+ [Thu Oct  5 08:29:02.576 2023] [711] WARNING: replication error unknown table 't3', command 3
+ [Thu Oct  5 08:29:02.576 2023] [711] FATAL: Failed to apply trx: source: 3c9f4089-6359-11ee-86a1-c26d8a5020c7 version: 4 local: 0 state: COMMITTING flags: 65 conn_id: 3 trx_id: -1 seqnos (l: 3, g: 3, s: 2, d: 2, ts: 26959116430503837)
+ [Thu Oct  5 08:29:02.576 2023] [711] FATAL: Commit failed. Trx: 0x7febec000c20 (FATAL)
+ at /__w/manticoresearch/manticoresearch/build/galera-build/galera_populate-prefix/src/galera_populate/galera/src/replicator_smm.cpp:apply_trx():516
+ [Thu Oct  5 08:29:02.576 2023] [711] FATAL: Node consistency compromised, aborting...
+ [Thu Oct  5 08:29:07.577 2023] [711] WARNING: abort from cluster 'a'
+ [Thu Oct  5 08:29:07.577 2023] [610] FATAL: 'a' cluster after join error: , nodes '127.0.0.1:19322'
+ [Thu Oct  5 08:29:07.578 2023] [607] watchdog: main process 608 killed dirtily with signal 6, will be restarted
+ [Thu Oct  5 08:29:07.578 2023] [607] watchdog: main process 713 forked ok
+ [Thu Oct  5 08:29:07.579 2023] [713] starting daemon version '6.2.13 fc0b860f3@231002 dev (columnar 2.2.5 b8be4eb@230928) (secondary 2.2.5 b8be4eb@230928)' ...
+ [Thu Oct  5 08:29:07.579 2023] [713] listening on all interfaces for mysql, port=29306
+ [Thu Oct  5 08:29:07.579 2023] [713] listening on all interfaces for sphinx and http(s), port=29312
+ [Thu Oct  5 08:29:07.579 2023] [713] listening on all interfaces for sphinx and http(s), port=29308
+ [Thu Oct  5 08:29:07.601 2023] [714] WARNING: table 't1': disabled at the JSON config
+ [Thu Oct  5 08:29:07.601 2023] [714] WARNING: table 't1': prealloc: failed to open /var/lib/manticore-2/t1/t1.lock: No such file or directory - NOT SERVING
+ [Thu Oct  5 08:29:07.601 2023] [714] WARNING: table 't2': disabled at the JSON config
+ [Thu Oct  5 08:29:07.601 2023] [714] WARNING: table 't2': prealloc: failed to open /var/lib/manticore-2/t2/t2.lock: No such file or directory - NOT SERVING
+ [Thu Oct  5 08:29:07.601 2023] [718] binlog: replaying log /var/lib/manticore-2/binlog/binlog.001
+ [Thu Oct  5 08:29:07.601 2023] [718] binlog: replay stats: 0 commits; 0 updates, 0 reconfigure; 0 pq-add; 0 pq-delete; 0 pq-add-delete, 0 tables
+ [Thu Oct  5 08:29:07.601 2023] [718] binlog: finished replaying /var/lib/manticore-2/binlog/binlog.001; 0.0 MB in 0.000 sec
+ [Thu Oct  5 08:29:07.601 2023] [718] binlog: finished replaying total 1 in 0.000 sec
+ [Thu Oct  5 08:29:07.601 2023] [720] prereading 0 tables
+ [Thu Oct  5 08:29:07.601 2023] [720] preread 0 tables in 0.000 sec
+ [Thu Oct  5 08:29:07.604 2023] [713] accepting connections
+ [Thu Oct  5 08:29:07.618 2023] [715] [BUDDY] started v1.1.1 'manticore-executor /workdir/src/main.php --listen=http://0.0.0.0:29312  --threads=12' at http://127.0.0.1:39859
+ [Thu Oct  5 08:29:07.773 2023] [725] [BUDDY] Loaded plugins:
+ [Thu Oct  5 08:29:07.773 2023] [725] [BUDDY]   core: empty-string, backup, emulate-elastic, distributed-insert, insert, select, show, cli-table, plugin, test, insert-mva, create-table
+ [Thu Oct  5 08:29:07.773 2023] [725] [BUDDY]   local:
+ [Thu Oct  5 08:29:07.773 2023] [725] [BUDDY]   extra:
+ [Thu Oct  5 08:29:00.417 2023] [650] watchdog: main process 651 forked ok
+ [Thu Oct  5 08:29:00.417 2023] [651] starting daemon version '6.2.13 fc0b860f3@231002 dev (columnar 2.2.5 b8be4eb@230928) (secondary 2.2.5 b8be4eb@230928)' ...
+ [Thu Oct  5 08:29:00.417 2023] [651] listening on all interfaces for mysql, port=39306
+ [Thu Oct  5 08:29:00.417 2023] [651] listening on all interfaces for sphinx and http(s), port=39312
+ [Thu Oct  5 08:29:00.417 2023] [651] listening on all interfaces for sphinx and http(s), port=39308
+ [Thu Oct  5 08:29:00.438 2023] [656] prereading 0 tables
+ [Thu Oct  5 08:29:00.438 2023] [656] preread 0 tables in 0.000 sec
+ [Thu Oct  5 08:29:00.441 2023] [651] accepting connections
+ [Thu Oct  5 08:29:00.455 2023] [662] [BUDDY] started v1.1.1 'manticore-executor /workdir/src/main.php --listen=http://0.0.0.0:39312  --threads=12' at http://127.0.0.1:35263
+ [Thu Oct  5 08:29:00.511 2023] [657] [BUDDY] Loaded plugins:
+ [Thu Oct  5 08:29:00.511 2023] [657] [BUDDY]   core: empty-string, backup, emulate-elastic, distributed-insert, insert, select, show, cli-table, plugin, test, insert-mva, create-table
+ [Thu Oct  5 08:29:00.511 2023] [657] [BUDDY]   local:
+ [Thu Oct  5 08:29:00.511 2023] [657] [BUDDY]   extra:
+ [Thu Oct  5 08:29:02.566 2023] [708] WARNING: Member 2.0 (node_127.0.0.1_a_658) requested state transfer from '*any*', but it is impossible to select State Transfer donor: Resource temporarily unavailable

donhardman avatar Oct 05 '23 08:10 donhardman

blocked by https://github.com/manticoresoftware/dev/issues/416 as that branch has a lot of changes in replication

tomatolog avatar Oct 24 '23 16:10 tomatolog

Could you check the current master version of the daemon, https://github.com/manticoresoftware/manticoresearch/commit/4083d028e0cd3b3361160a99cfddd842d0e09120?

On my side I see this reply:

stas@dev2:~/bin/rpl$ mysql -h0 -P31106 -e "create table c:t shards=2 rf=3"
ERROR 1064 (42000) at line 1: Waiting timeout exceeded.

In the daemon log I see that something (Buddy?) commits data that gets replicated across the cluster. However, I cannot check the command sequence to make sure what is happening.

I use Buddy from the fix/sharding branch, installed manually with all Composer packages, and then use this config for all daemons:

query_log_commands = 1
buddy_path = manticore-executor /home/stas/buddy/src/main.php

I started 3 daemons and issued:

stas@dev2:~/bin/rpl$ mysql -h0 -P31106 -e "create cluster c"
stas@dev2:~/bin/rpl$ mysql -h0 -P31206 -e "join cluster c at '127.0.0.1:31112'"
stas@dev2:~/bin/rpl$ mysql -h0 -P31306 -e "join cluster c at '127.0.0.1:31112'"

Then create table c:t shards=2 rf=3 finished with the 'Waiting timeout exceeded' error.

tomatolog avatar Jan 03 '24 13:01 tomatolog

I need a command sequence that I can run by hand to check what causes 'Waiting timeout exceeded'. Alternatively, we could create a ticket to make query_log_commands = 1 log HTTP requests along with SphinxQL.

tomatolog avatar Jan 03 '24 13:01 tomatolog

As discussed, @tomatolog will try to reproduce on dev2 using https://github.com/manticoresoftware/manticoresearch/blob/tests/sharding/test/clt-tests/sharding/cluster/attach-table-while-joining-cluster.rec

sanikolaev avatar Jan 04 '24 09:01 sanikolaev

I used the bash script gh1485.zip, extracted from attach-table-while-joining-cluster.rec, and do not see any issues. Sometimes the cluster refuses to add indexes with an error message because a join operation is in progress, but in that case you just need to retry after the join finishes (see the retry sketch after the output below). I do not see any crashes on any node with the current master. Here is the output showing that the cluster is operable and all nodes work as usual:

stas@dev2:~/bin/rpl$ ./gh1485.sh
--------------
delete cluster a
--------------

Query OK, 0 rows affected (5.50 sec)

Bye
--------------
delete cluster a
--------------

ERROR 1064 (42000) at line 1: (null)
Bye
--------------
delete cluster a
--------------

ERROR 1064 (42000) at line 1: (null)
Bye
--------------
create cluster a
--------------

Query OK, 0 rows affected (0.44 sec)

Bye
--------------
drop table if exists t1
--------------

Query OK, 0 rows affected (0.00 sec)

--------------
drop table if exists t2
--------------

Query OK, 0 rows affected (0.00 sec)

--------------
drop table if exists t3
--------------

Query OK, 0 rows affected (0.00 sec)

Bye
--------------
create table t1
--------------

Query OK, 0 rows affected (0.00 sec)

--------------
create table t2
--------------

Query OK, 0 rows affected (0.01 sec)

--------------
create table t3
--------------

Query OK, 0 rows affected (0.00 sec)

Bye
--------------
show tables
--------------

+-------+------+
| Index | Type |
+-------+------+
| t1    | rt   |
| t2    | rt   |
| t3    | rt   |
| t_s0  | rt   |
| t_s1  | rt   |
| test2 | rt   |
+-------+------+
6 rows in set (0.00 sec)

Bye
--------------
alter cluster a add t1
--------------

--------------
join cluster a at '127.0.0.1:31112'
--------------

--------------
alter cluster a add t3
--------------

--------------
alter cluster a add t2
--------------

--------------
join cluster a at '127.0.0.1:31112'
--------------

Query OK, 0 rows affected

Bye
Query OK, 0 rows affected

Bye
Query OK, 0 rows affected

Bye
Query OK, 0 rows affected

Bye
Query OK, 0 rows affected

Bye
0
--------------
show status like '%_status'
--------------

+-------------------------------+---------+
| Counter                       | Value   |
+-------------------------------+---------+
| command_status                | 12      |
| cluster_a_status              | primary |
| cluster_a_flow_control_status | OFF     |
| cluster_a_ist_receive_status  |         |
+-------------------------------+---------+
4 rows in set (0.00 sec)

Bye
--------------
show status like '%_node_state'
--------------

+----------------------+--------+
| Counter              | Value  |
+----------------------+--------+
| cluster_a_node_state | synced |
+----------------------+--------+
1 row in set (0.00 sec)

Bye
--------------
show status like '%_local_state_comment'
--------------

+-------------------------------+--------+
| Counter                       | Value  |
+-------------------------------+--------+
| cluster_a_local_state_comment | Synced |
+-------------------------------+--------+
1 row in set (0.00 sec)

Bye
--------------
show status like '%_status'
--------------

+-------------------------------+---------+
| Counter                       | Value   |
+-------------------------------+---------+
| command_status                | 10      |
| cluster_a_status              | primary |
| cluster_a_flow_control_status | OFF     |
| cluster_a_ist_receive_status  |         |
+-------------------------------+---------+
4 rows in set (0.00 sec)

Bye
--------------
show status like '%_node_state'
--------------

+----------------------+--------+
| Counter              | Value  |
+----------------------+--------+
| cluster_a_node_state | synced |
+----------------------+--------+
1 row in set (0.00 sec)

Bye
--------------
show status like '%_local_state_comment'
--------------

+-------------------------------+--------+
| Counter                       | Value  |
+-------------------------------+--------+
| cluster_a_local_state_comment | Synced |
+-------------------------------+--------+
1 row in set (0.00 sec)

Bye
--------------
show status like '%_status'
--------------

+-------------------------------+---------+
| Counter                       | Value   |
+-------------------------------+---------+
| command_status                | 11      |
| cluster_a_status              | primary |
| cluster_a_flow_control_status | OFF     |
| cluster_a_ist_receive_status  |         |
+-------------------------------+---------+
4 rows in set (0.00 sec)

Bye
--------------
show status like '%_node_state'
--------------

+----------------------+--------+
| Counter              | Value  |
+----------------------+--------+
| cluster_a_node_state | synced |
+----------------------+--------+
1 row in set (0.00 sec)

Bye
--------------
show status like '%_local_state_comment'
--------------

+-------------------------------+--------+
| Counter                       | Value  |
+-------------------------------+--------+
| cluster_a_local_state_comment | Synced |
+-------------------------------+--------+
1 row in set (0.00 sec)

Bye
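
Regarding the "just retry after the join finishes" note above, a minimal illustration of that retry could be (not part of gh1485.zip, purely a sketch using the same ports):

# keep retrying ALTER CLUSTER ... ADD until the in-progress join completes
until mysql -h0 -P31106 -e "alter cluster a add t1"; do
  sleep 1
done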

tomatolog avatar Jan 04 '24 11:01 tomatolog

Please provide a script that reproduces this issue on the master version, or check the master version and close the issue.

tomatolog avatar Jan 04 '24 11:01 tomatolog

root@9ade6f37ac5d:/# cat /var/log/manticore/searchd2.log
[Mon Jan  8 10:53:01.968 2024] [1122] watchdog: main process 1123 forked ok
[Mon Jan  8 10:53:01.969 2024] [1123] starting daemon version '6.2.13 ea9081bb3@240108 dev (columnar 2.2.5 1d1e432@231204) (secondary 2.2.5 1d1e432@231204) (knn 2.2.5 1d1e432@231204)' ...
[Mon Jan  8 10:53:01.969 2024] [1123] listening on 127.0.0.1:29312 for sphinx and http(s)
[Mon Jan  8 10:53:01.969 2024] [1123] listening on 127.0.0.1:29306 for mysql
[Mon Jan  8 10:53:01.969 2024] [1123] listening on 127.0.0.1:29308 for sphinx and http(s)
[Mon Jan  8 10:53:02.007 2024] [1127] prereading 0 tables
[Mon Jan  8 10:53:02.007 2024] [1127] preread 0 tables in 0.000 sec
[Mon Jan  8 10:53:02.009 2024] [1123] accepting connections
[Mon Jan  8 10:53:02.037 2024] [1130] [BUDDY] started v2.0.9 '/usr/share/manticore/modules/manticore-buddy/bin/manticore-buddy --listen=http://127.0.0.1:29312  --threads=12' at http://127.0.0.1:35245
[Mon Jan  8 10:53:02.037 2024] [1130] [BUDDY] Loaded plugins:
  core: empty-string, backup, emulate-elastic, insert, alias, select, show, cli-table, plugin, test, insert-mva, create-table
  local:
  extra:

[Mon Jan  8 10:53:02.230 2024] [1135] WARNING: No persistent state found. Bootstraping with default state
[Mon Jan  8 10:53:03.243 2024] [1225] WARNING: Member 1.0 (node_127.0.0.1_a_1135) requested state transfer from '*any*', but it is impossible to select State Transfer donor: Resource temporarily unavailable
[Mon Jan  8 10:53:03.274 2024] [1128] FATAL: 'a' cluster [127.0.0.1:57928], cmd: 7, error: cluster 'a' is not ready, current state is joining
[Mon Jan  8 10:53:03.775 2024] [1132] FATAL: 'a' cluster [127.0.0.1:57946], cmd: 7, error: cluster 'a' is not ready, current state is joining
root@9ade6f37ac5d:/# cat /var/log/manticore/searchd3.log
[Mon Jan  8 10:53:02.083 2024] [1167] watchdog: main process 1168 forked ok
[Mon Jan  8 10:53:02.084 2024] [1168] starting daemon version '6.2.13 ea9081bb3@240108 dev (columnar 2.2.5 1d1e432@231204) (secondary 2.2.5 1d1e432@231204) (knn 2.2.5 1d1e432@231204)' ...
[Mon Jan  8 10:53:02.085 2024] [1168] listening on 127.0.0.1:39312 for sphinx and http(s)
[Mon Jan  8 10:53:02.085 2024] [1168] listening on 127.0.0.1:39306 for mysql
[Mon Jan  8 10:53:02.085 2024] [1168] listening on 127.0.0.1:39308 for sphinx and http(s)
[Mon Jan  8 10:53:02.123 2024] [1173] prereading 0 tables
[Mon Jan  8 10:53:02.123 2024] [1173] preread 0 tables in 0.000 sec
[Mon Jan  8 10:53:02.125 2024] [1168] accepting connections
[Mon Jan  8 10:53:02.153 2024] [1176] [BUDDY] started v2.0.9 '/usr/share/manticore/modules/manticore-buddy/bin/manticore-buddy --listen=http://127.0.0.1:39312  --threads=12' at http://127.0.0.1:33159
[Mon Jan  8 10:53:02.153 2024] [1176] [BUDDY] Loaded plugins:
  core: empty-string, backup, emulate-elastic, insert, alias, select, show, cli-table, plugin, test, insert-mva, create-table
  local:
  extra:

[Mon Jan  8 10:53:02.230 2024] [1173] WARNING: No persistent state found. Bootstraping with default state
[Mon Jan  8 10:53:03.243 2024] [1226] WARNING: Member 1.0 (node_127.0.0.1_a_1135) requested state transfer from '*any*', but it is impossible to select State Transfer donor: Resource temporarily unavailable
[Mon Jan  8 10:53:04.277 2024] [1180] WARNING: '127.0.0.1:29312': remote error: [127.0.0.1] cluster 'a' is not ready, current state is joining

I have modified the script and the issue still persists: the test fails, and I can still see FATALs in the log file, which we probably should not see at all given the blocking nature of cluster commands.

Here is the modified script to use.

# Stop any running daemons and clean up the old logs
pgrep -f searchd | xargs kill -9
rm -f /var/log/manticore/searchd*.log

# Wipe the data dirs, generate a minimal config per node, and start three daemons
for i in 1 2 3; do
	lib_dir="/var/lib/manticore$i"
	rm -fr $lib_dir
	mkdir -p $lib_dir
	cat > manticore$i.conf <<EOF
searchd {
    listen = 127.0.0.1:${i}9312
    listen = 127.0.0.1:${i}9306:mysql
    listen = 127.0.0.1:${i}9308:http
    log = /var/log/manticore/searchd${i}.log
    query_log = /var/log/manticore/query${i}.log
    pid_file = /var/run/manticore/searchd${i}.pid
    data_dir = $lib_dir
}
EOF
	searchd -c manticore$i.conf
done

# Create the cluster and a few plain tables on node 1
mysql -h0 -P19306 -vv -e 'create cluster a'
mysql -h0 -P19306 -vv -e "create table t1; create table t2; create table t3;"
mysql -h0 -P19306 -vv -e "show tables"
# Run the joins and the cluster alters concurrently to trigger the race
mysql -h0 -P29306 -vv -e "join cluster a at '127.0.0.1:19312'" &
mysql -h0 -P19306 -vv -e "alter cluster a add t1" &
mysql -h0 -P39306 -vv -e "join cluster a at '127.0.0.1:19312'" &
mysql -h0 -P19306 -vv -e "alter cluster a add t2" &
mysql -h0 -P19306 -vv -e "alter cluster a add t3" &
# Wait for all background commands to finish
wait 1>/dev/null 2>&1; echo $?
mysql -h0 -P19306 -vv -e "show status like '%_status'"
mysql -h0 -P19306 -vv -e "show status like '%_node_state'"
mysql -h0 -P19306 -vv -e "show status like '%_local_state_comment'"
mysql -h0 -P29306 -vv -e "show status like '%_status'"
mysql -h0 -P29306 -vv -e "show status like '%_node_state'"
mysql -h0 -P29306 -vv -e "show status like '%_local_state_comment'"
mysql -h0 -P39306 -vv -e "show status like '%_status'"
mysql -h0 -P39306 -vv -e "show status like '%_node_state'"
mysql -h0 -P39306 -vv -e "show status like '%_local_state_comment'"

# Any FATAL entries indicate the crash described above
cat /var/log/manticore/searchd*log | grep FATAL

donhardman avatar Jan 08 '24 10:01 donhardman

I'm going to change the replication FATAL into a WARNING that keeps the same message with the error description but without the FATAL verbosity.

tomatolog avatar Jan 10 '24 09:01 tomatolog

OK, I see that multiple simultaneous joins prevent the cluster from updating its node list correctly. Investigating the fix.

tomatolog avatar Jan 10 '24 10:01 tomatolog

Should be fixed in the current master: https://github.com/manticoresoftware/manticoresearch/commit/a2af06ca36abcbf893e43c8803853130b397c54b. I fixed the warning and the failure that occurred when multiple nodes join the cluster at the same time.

tomatolog avatar Jan 10 '24 14:01 tomatolog

I attempted to use the most recent version of the test kit to run some tests, but it seems like it's still failing—not all the time, but quite frequently.

(screenshot attached)

I've also noticed that it's now showing a different error. It looks like it can't connect to execute the MySQL query, which could mean the daemon is crashing.

donhardman avatar Jan 15 '24 08:01 donhardman

I tried to run the clt test on the dev box from manticore/test/clt-tests, but it failed as there is no clt folder or executable:

stas@dev2:~/manticore/test/clt-tests$ clt/clt test -d -t test/clt-tests/sharding/replication/create-table-5-cluster-60-shards.rec ghcr.io/manticoresoftware/manticoresearch:test-kit-latest; echo $?
-bash: clt/clt: No such file or directory
127

I need instructions on how to run the test on the dev box and how to run it with the proper daemon binary.

tomatolog avatar Jan 15 '24 14:01 tomatolog

there is no clt folder or executable

Have you tried https://github.com/manticoresoftware/clt?tab=readme-ov-file#installation ?

sanikolaev avatar Jan 15 '24 14:01 sanikolaev

no I will try it now

tomatolog avatar Jan 15 '24 14:01 tomatolog

After following the CLT installation instructions, I can run clt fine. However, it is still unclear where create-table-5-cluster-60-shards.rec is located.

The master branch does not have that test:

stas@dev2:~/build/clt$ ./clt test -d -t ~/manticore/test/clt-tests/sharding/replication/create-table-5-cluster-60-shards.rec ghcr.io/manticoresoftware/manticoresearch:test-kit-latest; echo $?
The record file does not exist: /home/stas/manticore/test/clt-tests/sharding/replication/create-table-5-cluster-60-shards.rec
1

There are a lot of different clt and test branches; I'm not sure which one I should check out.

It is also not clear how to change the daemon CLI during the test run and how to use my own daemon build to investigate the issue.

tomatolog avatar Jan 15 '24 15:01 tomatolog

However, it is still unclear where create-table-5-cluster-60-shards.rec is located

https://github.com/manticoresoftware/manticoresearch/blob/tests/sharding/test/clt-tests/sharding/replication/create-table-5-cluster-60-shards.rec

sanikolaev avatar Jan 15 '24 15:01 sanikolaev

It's in the tests/sharding branch, as was mentioned earlier in this comment: https://github.com/manticoresoftware/manticoresearch/issues/1485#issuecomment-1876757281

donhardman avatar Jan 16 '24 05:01 donhardman

I downloaded https://github.com/manticoresoftware/manticoresearch/blob/tests/sharding/test/clt-tests/sharding/replication/create-table-5-cluster-60-shards.rec into manticore/test/clt-tests/sharding/replication and still see the error about the missing .rec file:

stas@dev2:~/build/clt$ ./clt test -d -t ~/manticore/test/clt-tests/sharding/replication/create-table-5-cluster-60-shards.rec ghcr.io/manticoresoftware/manticoresearch:test-kit-latest; echo $?
Replaying data from the file: /home/stas/manticore/test/clt-tests/sharding/replication/create-table-5-cluster-60-shards.rec
The replay result will be stored to the file: /home/stas/manticore/test/clt-tests/sharding/replication/create-table-5-cluster-60-shards.rep
rec: No such file or directory (os error 2)
1

However, the file exists:

stas@dev2:~/build/clt$ ls -lh /home/stas/manticore/test/clt-tests/sharding/replication/
total 28K
-rw-rw-r-- 1 stas stas  12K Jan 16 08:57 create-table-5-cluster-60-shards.rec
-rw-rw-r-- 1 stas stas    2 Jan 16 08:59 create-table-5-cluster-60-shards.rep
-rw-rw-r-- 1 stas stas  584 Jan  9 10:58 init.recb
-rw-rw-r-- 1 stas stas 7.8K Jan  9 10:58 replication-test.rec

tomatolog avatar Jan 16 '24 09:01 tomatolog

We need to clone the whole branch, because the test requires extra files to run (reusable blocks). Try to check out the tests/sharding branch and run the test with CLT the same way.

donhardman avatar Jan 17 '24 11:01 donhardman

I switched to the tests/sharding branch in ~/manticore/:

stas@dev2:~/manticore$ git rev-parse --abbrev-ref HEAD
sharding
stas@dev2:~/manticore$ git log -n1
commit a993c920fefe8beb2dd4011b69c3950ee7097909 (HEAD -> sharding)
Author: Don Hardman <[email protected]>
Date:   Mon Jan 15 14:05:06 2024 +0700

    Fix regex to validate versions with KNN included

Then I went into the clt folder I had checked out, ran the command ./clt test -d -t ~/manticore/test/clt-tests/sharding/replication/create-table-5-cluster-60-shards.rec ghcr.io/manticoresoftware/manticoresearch:test-kit-latest; echo $? there, and got the same error:

stas@dev2:~/build/clt$ ./clt test -d -t ~/manticore/test/clt-tests/sharding/replication/create-table-5-cluster-60-shards.rec ghcr.io/manticoresoftware/manticoresearch:test-kit-latest; echo $?
Replaying data from the file: /home/stas/manticore/test/clt-tests/sharding/replication/create-table-5-cluster-60-shards.rec
The replay result will be stored to the file: /home/stas/manticore/test/clt-tests/sharding/replication/create-table-5-cluster-60-shards.rep
rec: No such file or directory (os error 2)
1

tomatolog avatar Jan 18 '24 09:01 tomatolog

I also tried to check out clt into the manticoresearch directory and use it from there:

stas@dev2:~/manticore$ git clone https://github.com/manticoresoftware/clt.git
Cloning into 'clt'...

I also used full paths, but it returns another error:

stas@dev2:~/manticore$ clt/clt test -d -t /home/stas/manticore/test/clt-tests/sharding/replication/create-table-5-cluster-60-shards.rec ghcr.io/manticoresoftware/manticoresearch:test-kit-latest; echo $?
Replaying data from the file: /home/stas/manticore/test/clt-tests/sharding/replication/create-table-5-cluster-60-shards.rec
The replay result will be stored to the file: /home/stas/manticore/test/clt-tests/sharding/replication/create-table-5-cluster-60-shards.rep
Directory with tests does not exist: tests
1

tomatolog avatar Jan 18 '24 09:01 tomatolog

The test should be run with exactly the command specified in the body of this task, from the manticoresearch directory. Detailed instructions to run it from scratch:

git clone https://github.com/manticoresoftware/manticoresearch.git
cd manticoresearch
git checkout tests/sharding
git clone https://github.com/manticoresoftware/clt.git
clt/clt test -d -t test/clt-tests/sharding/replication/create-table-5-cluster-60-shards.rec ghcr.io/manticoresoftware/manticoresearch:test-kit-latest; echo $?

donhardman avatar Jan 18 '24 09:01 donhardman

The initial error was that Buddy created multiple clusters and one of them had a duplicated name and path, or maybe Buddy retried and created the same cluster while the same cluster operation was still in progress. That was fixed in master: the daemon now handles every create / delete / alter cluster operation one by one.

However, the test create-table-5-cluster-60-shards.rec still produces an error:

create table c:t(id bigint) shards=60 rf=3 # error=Waiting timeout exceeded.

I'm not quite sure how to investigate this further, as I already tried:

  • setting different timeouts in searchd, which does not help

I also do not see any cluster-related errors in the daemon log.

tomatolog avatar Jan 23 '24 16:01 tomatolog

I've resolved the major issue with the Buddy race condition that sometimes caused duplicate queries. However, I've noticed there's still a problem with some queries when we run tests now.

/var/log/manticore-1/searchd.log:[Mon Feb  5 17:50:27.376 2024] [471] [BUDDY] [127.0.0.1:19312] Queue query result [error]: {"error":"error at ToExecuteStart, code 3 (transaction aborted, server can continue), seqno -1"}

This issue occurs quite frequently, leading to sharding failures due to the error.

When validating the issue, we can grep for the word "Queue" to check the full log while Buddy runs in debug mode (see the sketch after the commands below); the reproduction script is the same:

git clone https://github.com/manticoresoftware/manticoresearch.git
cd manticoresearch
git checkout tests/sharding
git clone https://github.com/manticoresoftware/clt.git
clt/clt test -d -t test/clt-tests/sharding/replication/create-table-5-cluster-60-shards.rec ghcr.io/manticoresoftware/manticoresearch:test-kit-latest; echo $?
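
For reference, a minimal version of the grep mentioned above, using the per-node log layout from this test:

grep 'Queue' /var/log/manticore-{1,2,3,4,5}/searchd.log

This surfaces Buddy's "Queue query result" entries across all five nodes, such as the one quoted above.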

donhardman avatar Feb 05 '24 17:02 donhardman

blocked by https://github.com/manticoresoftware/manticoresearch/issues/1802

tomatolog avatar Feb 06 '24 16:02 tomatolog

Should be fixed at https://github.com/manticoresoftware/manticoresearch/commit/b5fee2128154e506089e8d1e2200fe5038088785; need to make sure the clt test uses the latest daemon binary.

tomatolog avatar Feb 15 '24 17:02 tomatolog

The issue continues and has become more frequent. Now, every time I start a new run, the same error happens.

––– input –––
mysql -h0 -P19306 -e "create table c:t(id bigint) shards=60 rf=3"
––– output –––
+ ERROR 1064 (42000) at line 1: Waiting timeout exceeded.
––– input –––
grep '\[error\]' /var/log/manticore-{1,2,3,4,5}/searchd.log
––– output –––
+ /var/log/manticore-1/searchd.log:[Mon Feb 19 03:44:09.335 2024] [467] [BUDDY] [127.0.0.1:19312] Queue query result [error]: {"error":"error at ToExecuteStart, code 3 (transaction aborted, server can continue), seqno -1"}

Is this a common error that we can handle, or is there still something wrong with concurrency?

Version used:

starting daemon version '6.2.13 56aaf1f55@24021713 dev (columnar 2.2.5 8c90c1f@240217) (secondary 2.2.5 8c90c1f@240217) (knn 2.2.5 8c90c1f@240217)' ...

donhardman avatar Feb 19 '24 03:02 donhardman

Should be fixed at https://github.com/manticoresoftware/manticoresearch/commit/da1d7dc198b8bff80ff62c020ebd0130d941a605; need to make sure the clt test uses the latest daemon binary. After the fix I do not see the errors any more. However, the test itself still shows differences.

tomatolog avatar Feb 23 '24 15:02 tomatolog