docker
Worker node not being added
docker-compose.yml
version: "2.1"
services:
  master:
    build: ./docker/postgres
    image: 'citus:local'
    container_name: "${COMPOSE_PROJECT_NAME:-citus}_master"
    labels: ['com.citusdata.role=Master']
    restart: unless-stopped
    ports: ["${MASTER_EXTERNAL_PORT:-5432}:5432"]
    volumes:
      - ./docker/postgres/data/master:/var/lib/postgresql/data
    env_file:
      - ./.env
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER} -d ${POSTGRES_DB}"]
      interval: 10s
      timeout: 5s
      retries: 5
  worker:
    build: ./docker/postgres
    image: 'citus:local'
    labels: ['com.citusdata.role=Worker']
    restart: unless-stopped
    depends_on: { manager: { condition: service_healthy } }
    volumes:
      - ./docker/postgres/data/worker:/var/lib/postgresql/data
    env_file:
      - ./.env
  manager:
    image: 'citusdata/membership-manager:0.2.0'
    container_name: "${COMPOSE_PROJECT_NAME:-citus}_manager"
    volumes: ['/var/run/docker.sock:/var/run/docker.sock']
    depends_on: { master: { condition: service_healthy } }
    restart: unless-stopped
    env_file:
      - ./.env
Everything seems to work, except no active nodes are returned when running SELECT master_get_active_worker_nodes();
This happened to me: the master service was taking time to start up and the manager service would fail to connect (from docker-compose logs, the master server restarts multiple times due to configuration changes). Setting the manager service to restart: on-failure fixed this. The manager is able to connect after 2 or 3 tries.
And now SELECT master_get_active_worker_nodes(); returns a worker node too.
I created two PRs that aim to resolve this issue:
- https://github.com/citusdata/membership-manager/pull/9 implements a polling mechanism in Membership Manager, so that it (a) detects the readiness of the coordinator node and (b) properly reports that it is ready to accept new Citus worker services.
- https://github.com/citusdata/docker/pull/187 (a) updates the Docker images to properly detect dependencies and (b) introduces Compose V3 definitions.
Once they are merged, this issue should be resolved.
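Until those PRs land, the polling idea can be approximated from the outside. The following is only a minimal sketch, not the code from either PR: retry is a hypothetical helper, and MASTER_HOST and the attempt/delay values in the commented example are assumptions. It simply re-runs a command (for example pg_isready against the coordinator) until it succeeds or the attempts run out.

```shell
#!/bin/sh
# Sketch only: a generic retry helper (hypothetical, not part of Citus
# or the membership manager).

# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds. Returns 0 on the first success,
# or 1 once COMMAND has failed MAX_ATTEMPTS times.
retry() {
  local max_attempts=$1 delay=$2 attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 0
}

# Example (commented out; MASTER_HOST is an assumed variable name):
# wait up to 30 attempts, 2 seconds apart, for the coordinator.
# retry 30 2 pg_isready --host "${MASTER_HOST:-master}" --port 5432 \
#   --username "${POSTGRES_USER}" --dbname "${POSTGRES_DB}"
```

This has the same effect as the restart: on-failure workaround above, except that the waiting happens inside one process instead of relying on Docker to restart the container.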
I am currently using a similar method to wait for the db, i.e. using pg_healthcheck. In addition to that, I was also waiting for the worker node to be added to the manager before I run any schema scripts. But now I think this is redundant. Can you confirm that if a new worker joins the master, it will be automatically updated (schema and data) by the master?
^ Got the answer. I need to wait for worker nodes before running reference or distributed table definitions. Does this mean it would be a better idea to wait for at least one worker node too?
Currently I am using this script to check:
# Wait for worker nodes to be added by the Citus membership manager.
# --no-align strips the padding psql puts around the count, so the
# quoted comparison with "0" works.
while [ "$(psql --username postgres --dbname "${POSTGRES_DB}" --tuples-only --no-align --command "SELECT count(*) FROM master_get_active_worker_nodes();")" = "0" ]; do
  sleep 3
done
Is there a better way?
I think it is better to wait until all your worker nodes are registered and ready to accept connections. If you distribute a table when only one worker node is active, your queries may be slower than expected due to the uneven distribution of your data.
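One way to wait for all workers rather than just the first one is sketched below. It is only an illustration under assumptions: count_workers, wait_for_workers and POLL_INTERVAL are hypothetical names, the expected worker count has to be supplied by the caller, and the query is the same master_get_active_worker_nodes() call used earlier in this thread. A timeout is added so the script cannot hang forever if a worker never registers.

```shell
#!/bin/sh
# Sketch only: wait until an expected number of workers is active.
# Helper names and POLL_INTERVAL are assumptions, not Citus tooling.

POLL_INTERVAL=${POLL_INTERVAL:-3}

# Prints the number of active worker nodes as a bare integer.
count_workers() {
  psql --username "${POSTGRES_USER:-postgres}" --dbname "${POSTGRES_DB}" \
    --tuples-only --no-align \
    --command "SELECT count(*) FROM master_get_active_worker_nodes();"
}

# wait_for_workers EXPECTED TIMEOUT_SECONDS
# Returns 0 once at least EXPECTED workers are active, or 1 after
# roughly TIMEOUT_SECONDS of waiting.
wait_for_workers() {
  local expected=$1 timeout=$2 waited=0 active
  while :; do
    active=$(count_workers 2>/dev/null) || active=0
    # Treat empty or non-numeric output (e.g. psql failed) as zero.
    case $active in
      '' | *[!0-9]*) active=0 ;;
    esac
    if [ "$active" -ge "$expected" ]; then
      return 0
    fi
    if [ "$waited" -ge "$timeout" ]; then
      echo "timed out: $active of $expected workers active" >&2
      return 1
    fi
    sleep "$POLL_INTERVAL"
    waited=$((waited + POLL_INTERVAL))
  done
}

# Example (commented out): wait up to 120s for 2 workers, then run
# the schema scripts.
# wait_for_workers 2 120 && psql --dbname "${POSTGRES_DB}" --file schema.sql
```

The expected count would typically match whatever value is passed to docker-compose scale for the worker service; schema.sql in the example is a placeholder file name.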
Ok Cool. Thanks
Any future nodes will be added by membership-manager, but we need to run the following command to redistribute data for optimal performance:
SELECT rebalance_table_shards();
I want to remind you that shard rebalancing is an enterprise feature and is not available in the docker setup.
Well, surprised, did not know that.
You can see https://www.citusdata.com/product/comparison for a comparison of features between Citus community, Citus enterprise and Citus on Azure
Hi! Could I rebalance shards manually for playing with docker images without a citus enterprise license?