rebalance never starts
With Citus 11.3 I've added a node and triggered a rebalance. The rebalance was scheduled correctly but never starts running, despite having 1 runnable task (and 10 blocked ones).
I'm using the docker image citusdata/citus:11.3 on all nodes. The connection between the nodes works (primary is at 10.132.0.2):
SELECT * FROM citus_get_active_worker_nodes();
node_name | node_port
------------+-----------
10.132.0.4 | 5432
10.132.0.5 | 5432
(2 rows)
Command history:
staging=# SELECT * from citus_add_node('10.132.0.5', 5432);
citus_add_node
----------------
10
(1 row)
Time: 623.522 ms
staging=# SELECT citus_rebalance_start();
NOTICE: Scheduled 10 moves as job 1
DETAIL: Rebalance scheduled as background job
HINT: To monitor progress, run: SELECT * FROM citus_rebalance_status();
citus_rebalance_start
-----------------------
1
(1 row)
Time: 26.101 ms
staging=# SELECT * FROM citus_rebalance_status();
job_id | state | job_type | description | started_at | finished_at | details
--------+-----------+-----------+---------------------------------+------------+-------------+--------------------------------------------------------------------
1 | scheduled | rebalance | Rebalance all colocation groups | | | {"tasks": [], "task_state_counts": {"blocked": 10, "runnable": 1}}
(1 row)
Time: 3.200 ms
staging=# SELECT pg_terminate_backend(pg_stat_activity.pid)
FROM pg_stat_activity
WHERE pg_stat_activity.datname = 'staging'
AND pid <> pg_backend_pid();
pg_terminate_backend
----------------------
t
t
t
t
t
t
t
t
(8 rows)
staging=# SELECT get_rebalance_table_shards_plan();
get_rebalance_table_shards_plan
-------------------------------------------------------------
(sensor_datapoint,102183,0,10.132.0.4,5432,10.132.0.5,5432)
(sensor_datapoint,102182,0,10.132.0.2,5432,10.132.0.5,5432)
(sensor_datapoint,102185,0,10.132.0.4,5432,10.132.0.5,5432)
(sensor_datapoint,102184,0,10.132.0.2,5432,10.132.0.5,5432)
(sensor_datapoint,102187,0,10.132.0.4,5432,10.132.0.5,5432)
(sensor_datapoint,102186,0,10.132.0.2,5432,10.132.0.5,5432)
(sensor_datapoint,102189,0,10.132.0.4,5432,10.132.0.5,5432)
(sensor_datapoint,102188,0,10.132.0.2,5432,10.132.0.5,5432)
(sensor_datapoint,102191,0,10.132.0.4,5432,10.132.0.5,5432)
(sensor_datapoint,102190,0,10.132.0.2,5432,10.132.0.5,5432)
(10 rows)
Time: 4.475 ms
staging=# SELECT * from pg_dist_node;
nodeid | groupid | nodename | nodeport | noderack | hasmetadata | isactive | noderole | nodecluster | metadatasynced | shouldhaveshards
--------+---------+------------+----------+----------+-------------+----------+----------+-------------+----------------+------------------
1 | 0 | 10.132.0.2 | 5432 | default | t | t | primary | default | t | t
6 | 5 | 10.132.0.4 | 5432 | default | t | t | primary | default | t | t
10 | 9 | 10.132.0.5 | 5432 | default | t | t | primary | default | t | t
staging=# ALTER SYSTEM SET citus.max_background_task_executors_per_node = 2;
ALTER SYSTEM
Time: 9.613 ms
staging=# SELECT pg_reload_conf();
pg_reload_conf
----------------
t
(1 row)
Time: 1.585 ms
staging=# SELECT * FROM citus_rebalance_status() \gx
-[ RECORD 1 ]-------------------------------------------------------------------
job_id | 1
state | scheduled
job_type | rebalance
description | Rebalance all colocation groups
started_at |
finished_at |
details | {"tasks": [], "task_state_counts": {"blocked": 10, "runnable": 1}}
Time: 3.033 ms
I've been waiting for a long time and nothing changes.
Did you check the progress reporting? https://www.citusdata.com/updates/v11-2/#rebalance_lsn
Nothing there:
staging=# SELECT * FROM get_rebalance_progress();
sessionid | table_name | shardid | shard_size | sourcename | sourceport | targetname | targetport | progress | source_shard_size | target_shard_size | operation_type | source_lsn | target_lsn | status
-----------+------------+---------+------------+------------+------------+------------+------------+----------+-------------------+-------------------+----------------+------------+------------+--------
(0 rows)
Time: 0.507 ms
staging=# SELECT pg_terminate_backend(pg_stat_activity.pid) FROM pg_stat_activity WHERE pg_stat_activity.datname = 'staging' AND pid <> pg_backend_pid();
Btw, terminating the backends also terminates the background rebalance tasks/jobs, since they run in the same database.
Normally, if a rebalance task fails, Citus retries it. But maybe Citus does not restart tasks that were explicitly terminated?
Also, maybe check the logs to see if something is preventing the rebalancer from starting/progressing? There might be ERRORs there.
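For more detail than citus_rebalance_status() exposes, the scheduler's own metadata can also be inspected directly; this is just a sketch, assuming Citus 11.1+ where the rebalancer runs on the background job tables:
staging=# SELECT * FROM pg_dist_background_job;
staging=# SELECT * FROM pg_dist_background_task WHERE job_id = 1;
If the monitor recorded a failure or a retry for a task, it should be visible there.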
First of all, thanks for your help. Here are some suspicious logs:
2023-08-04 05:52:07.924 UTC [315203] WARNING: unable to start background worker for background task execution
2023-08-04 05:52:07.924 UTC [315203] DETAIL: Current number of task executors: 0/4
2023-08-04 05:52:07.924 UTC [315203] CONTEXT: Citus Background Task Queue Monitor: staging
unable to start background worker for background task execution
Found the error message defined here: https://webcache.googleusercontent.com/search?q=cache:qHGF3I05MvAJ:https://github.com/citusdata/citus/blob/master/src/backend/distributed/utils/maintenanced.c&cd=9&hl=es&ct=clnk&gl=es
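To check whether the server is simply out of background worker slots, a quick sanity check could look like this (a sketch; backend_type has been part of pg_stat_activity since PostgreSQL 10):
staging=# SHOW max_worker_processes;
staging=# SELECT backend_type, count(*) FROM pg_stat_activity GROUP BY backend_type;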
Can you try increasing max_worker_processes in your postgresql.conf and restart postgres? The error suggests that it cannot start any. If that indeed helps, we should add a hint to that error to suggest that approach.
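For reference, a minimal sketch of that change (the value 16 is only illustrative; pick something that leaves headroom for Citus' task executors on top of your other background workers):
# postgresql.conf on the coordinator, where the background task executors run
max_worker_processes = 16
followed by a full restart of PostgreSQL, since max_worker_processes cannot be changed with a reload.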
That worked! Now there's an error in the rebalance status output:
ERROR: ERROR:  logical decoding requires wal_level >= logical
CONTEXT:  while executing command on 10.132.0.2:5432
while executing command on localhost:5432
Citus Background Task Queue Executor: staging/postgres for (1/1)
Fixed by adding the following lines to the postgresql.conf of all nodes:
# include enough information in WAL for logical decoding
wal_level = logical
# we need at least one replication slot, but leave headroom
max_replication_slots = 10
# enable Citus' logical decoding improvements for dist tables
citus.enable_change_data_capture = 'on'
As explained in https://docs.citusdata.com/en/stable/develop/integrations.html#logical-replication-of-distributed-tables-to-postgresql-tables
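To confirm the settings took effect on every node after the restart (wal_level also requires a restart, not just a reload), something like this can be run from the coordinator; a sketch using Citus' run_command_on_workers() helper:
staging=# SHOW wal_level;
staging=# SELECT * FROM run_command_on_workers('SHOW wal_level');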
I would not have expected to have to make these configuration changes manually, given that I'm using the official Docker image for Citus.
Thanks a lot for your help and the amazing project!
Okay. Seems like two fixes are in order then on our side:
- Add a HINT to the error suggesting to increase max_worker_processes
- Configure settings in our docker image so that the rebalancer (and probably CDC) works by default
ccnsapp=# SELECT * FROM get_rebalance_progress();
sessionid | table_name | shardid | shard_size | sourcename | sourceport | targetname | targetport | progress | source_shard_size | target_shard_size | operation_type | source_lsn | target_lsn | status
-----------+------------------------------------+---------+------------+------------+------------+-------------+------------+----------+-------------------+-------------------+----------------+--------------+------------+------------
129334 | companies | 102026 | 212992 | 10.104.0.5 | 5432 | 10.104.0.10 | 5432 | 1 | 212992 | 8192 | move | 8FE/60D49CD0 | | Setting Up
129334 | agents | 102058 | 36380672 | 10.104.0.5 | 5432 | 10.104.0.10 | 5432 | 1 | 36380672 | 8192 | move | 8FE/60D49CD0 | | Setting Up
129334 | assets | 102090 | 196091904 | 10.104.0.5 | 5432 | 10.104.0.10 | 5432 | 1 | 196091904 | 8192 | move | 8FE/60D49CD0 | | Setting Up
129334 | deprecation_status | 102122 | 32768 | 10.104.0.5 | 5432 | 10.104.0.10 | 5432 | 1 | 32768 | 0 | move | 8FE/60D49CD0 | | Setting Up
129334 | windows_specifics | 102154 | 32768 | 10.104.0.5 | 5432 | 10.104.0.10 | 5432 | 1 | 32768 | 0 | move | 8FE/60D49CD0 | | Setting Up
129334 | mac_specifics | 102186 | 40960 | 10.104.0.5 | 5432 | 10.104.0.10 | 5432 | 1 | 40960 | 8192 | move | 8FE/60D49CD0 | | Setting Up
It says "Setting Up"... When will it actually start and make progress while it's active? Why is it taking so long?
@shankarmn94 did you take a look at the postgres logs for some hints?
Also did you try increasing max_worker_processes, because that helped for @senscarlos
@JelteF @onderkalaci
Yes, I increased it to 4 and initiated the rebalance again, but it was still the same. It turned out to be a problem with the node: it was failing to transfer because of a network issue. SELECT queries now work for the existing shards.
We forcefully removed those shard IDs in the cluster and removed the worker node that was giving the issue.
Now, when we try to create distributed tables:
ccnsapp=# SELECT create_reference_table('service_ports');
ERROR:  table "service_ports" is already distributed
ccnsapp=# SELECT create_distributed_table('asset_stats', 'tenantid', colocate_with => 'companies');
ERROR:  reference table "service_ports" does not have a shard
How can we resolve this?
We forcefully removed those shard IDs in the cluster
That's really not recommended. Manually removing shards is bound to cause all sorts of issues, including data loss.
If there is no important data in the service_ports table, then I think the easiest way to "solve" your current problem is by dropping the service_ports table.
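If the data really is expendable, a rough sketch of that approach (the CREATE TABLE columns below are placeholders, not your actual schema):
ccnsapp=# DROP TABLE service_ports;
ccnsapp=# CREATE TABLE service_ports (port int PRIMARY KEY, name text);  -- recreate with your real schema
ccnsapp=# SELECT create_reference_table('service_ports');
ccnsapp=# SELECT create_distributed_table('asset_stats', 'tenantid', colocate_with => 'companies');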