persist data with a citus cluster

Open sentient opened this issue 6 years ago • 13 comments

I'm pretty new to Citus, so go easy on me ;)

The documentation mentions that the citus_standalone image stores PGDATA in a mounted volume (so it persists between restarts of the container).

I am trying to start the Citus cluster with data persistence. Do I have to do anything extra for this?

First of all I had to change the docker-compose.yml file a little bit. I'm running docker-compose version 1.17.0, build ac53b73

version: '2.3'

services:

  master:
    container_name: "${COMPOSE_PROJECT_NAME:-citus}_master"
    image: citusdata/citus:7.1.1
    ports:
      - "5432:5432"
    labels:
      com.citusdata.role: Master
    
  worker:
    container_name: "${COMPOSE_PROJECT_NAME:-citus}_worker"
    image: citusdata/citus:7.1.1
    labels:
      com.citusdata.role: Worker
    depends_on: { manager: { condition: service_healthy } }
  manager:
    container_name: "${COMPOSE_PROJECT_NAME:-citus}_manager"
    image: 'citusdata/membership-manager:0.2.0'
    volumes: ['/var/run/docker.sock:/var/run/docker.sock']
    depends_on: { master: { condition: service_healthy } }

I created two directories on my host environment to store the pgdata for the Master and the Worker. The docker-compose file now looks like:

version: '2.3'

services:
  master:
    container_name: "${COMPOSE_PROJECT_NAME:-citus}_master"
    image: citusdata/citus:7.1.1
    ports:
      - "5432:5432"
    labels:
      com.citusdata.role: Master
    volumes:
      - /media/marco/DataDisk/citus-master:/var/lib/postgresql/data

  worker:
    container_name: "${COMPOSE_PROJECT_NAME:-citus}_worker"
    image: citusdata/citus:7.1.1
    labels:
      com.citusdata.role: Worker
    volumes:
      - /media/marco/DataDisk/citus-worker:/var/lib/postgresql/data
    depends_on: { manager: { condition: service_healthy } }
  manager:
    container_name: "${COMPOSE_PROJECT_NAME:-citus}_manager"
    image: 'citusdata/membership-manager:0.2.0'
    volumes: ['/var/run/docker.sock:/var/run/docker.sock']
    depends_on: { master: { condition: service_healthy } }

docker-compose up works fine. And I can create a database etc.

However, after a docker-compose down followed by a fresh docker-compose up, it has the tables but it cannot find the shards.

ERRO[2017-12-11T04:31:45Z] pq: could not find length of shard 102072     req_id=ZXvdeL1Mh2-1

I'm pretty new here, so I'm not sure what is going on.

  • Can I define my own volumes?
  • Should I do an 'up' and 'down', or only 'start' and 'stop'?
  • Where is the shard information persisted?

Thanks for any help

Marco

sentient avatar Dec 11 '17 17:12 sentient

Some additional info from the docker-compose logs:

... 
                    PostgreSQL init process complete; ready for start up.
master_1   | BEGIN
worker_1   | 
worker_1   | 2017-12-11 04:47:47.366 UTC [1] WARNING:  citus.enable_deadlock_prevention is deprecated and it has no effect. The flag will be removed in the next release.
worker_1   | 2017-12-11 04:47:47.369 UTC [1] LOG:  number of prepared transactions has not been configured, overriding
master_1   | 2017-12-11 04:47:40.775 UTC [68] LOG:  starting maintenance daemon on database 12994 user 10
worker_1   | 2017-12-11 04:47:47.369 UTC [1] DETAIL:  max_prepared_transactions is now set to 200
worker_1   | 2017-12-11 04:47:47.369 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
worker_1   | 2017-12-11 04:47:47.370 UTC [1] LOG:  listening on IPv6 address "::", port 5432
worker_1   | 2017-12-11 04:47:47.375 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
worker_1   | 2017-12-11 04:47:47.405 UTC [68] LOG:  database system was shut down at 2017-12-11 04:47:47 UTC
worker_1   | 2017-12-11 04:47:47.415 UTC [1] LOG:  database system is ready to accept connections
worker_1   | 2017-12-11 04:49:45.329 UTC [456] LOG:  starting maintenance daemon on database 12994 user 10
worker_1   | 2017-12-11 04:49:45.329 UTC [456] CONTEXT:  Citus maintenance daemon for database 12994 user 10
worker_1   | 2017-12-11 18:00:54.298 UTC [1] WARNING:  citus.enable_deadlock_prevention is deprecated and it has no effect. The flag will be removed in the next release.
worker_1   | 2017-12-11 18:00:54.301 UTC [1] LOG:  number of prepared transactions has not been configured, overriding
worker_1   | 2017-12-11 18:00:54.301 UTC [1] DETAIL:  max_prepared_transactions is now set to 200
worker_1   | 2017-12-11 18:00:54.301 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
worker_1   | 2017-12-11 18:00:54.301 UTC [1] LOG:  listening on IPv6 address "::", port 5432
worker_1   | 2017-12-11 18:00:54.316 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
worker_1   | 2017-12-11 18:00:54.344 UTC [24] LOG:  database system was interrupted; last known up at 2017-12-11 17:33:24 UTC
worker_1   | 2017-12-11 18:00:54.803 UTC [24] LOG:  database system was not properly shut down; automatic recovery in progress
master_1   | 2017-12-11 04:47:40.775 UTC [68] CONTEXT:  Citus maintenance daemon for database 12994 user 10
worker_1   | 2017-12-11 18:00:54.819 UTC [24] LOG:  redo starts at 0/19EFCA0
master_1   | CREATE EXTENSION
master_1   | UPDATE 1
worker_1   | 2017-12-11 18:00:54.819 UTC [24] LOG:  invalid record length at 0/19EFD80: wanted 24, got 0
master_1   | COMMIT
master_1   | 
worker_1   | 2017-12-11 18:00:54.819 UTC [24] LOG:  redo done at 0/19EFD48
master_1   | 
master_1   | 2017-12-11 04:47:41.165 UTC [41] LOG:  received fast shutdown request
master_1   | waiting for server to shut down....2017-12-11 04:47:41.168 UTC [41] LOG:  aborting any active transactions
master_1   | 2017-12-11 04:47:41.169 UTC [41] LOG:  worker process: logical replication launcher (PID 49) exited with exit code 1
master_1   | 2017-12-11 04:47:41.169 UTC [68] FATAL:  terminating connection due to administrator command
master_1   | 2017-12-11 04:47:41.169 UTC [68] CONTEXT:  Citus maintenance daemon for database 12994 user 10
master_1   | 2017-12-11 04:47:41.171 UTC [41] LOG:  worker process: Citus Maintenance Daemon: 12994/10 (PID 68) exited with exit code 1
master_1   | 2017-12-11 04:47:41.171 UTC [43] LOG:  shutting down
master_1   | 2017-12-11 04:47:41.262 UTC [41] LOG:  database system is shut down
master_1   |  done
master_1   | server stopped
worker_1   | 2017-12-11 18:00:54.846 UTC [1] LOG:  database system is ready to accept connections
master_1   | 
master_1   | PostgreSQL init process complete; ready for start up.
master_1   | 
worker_1   | 2017-12-11 18:57:57.889 UTC [10639] LOG:  starting maintenance daemon on database 12994 user 10
worker_1   | 2017-12-11 18:57:57.889 UTC [10639] CONTEXT:  Citus maintenance daemon for database 12994 user 10
master_1   | 2017-12-11 04:47:41.288 UTC [1] WARNING:  citus.enable_deadlock_prevention is deprecated and it has no effect. The flag will be removed in the next release.
master_1   | 2017-12-11 04:47:41.290 UTC [1] LOG:  number of prepared transactions has not been configured, overriding
master_1   | 2017-12-11 04:47:41.290 UTC [1] DETAIL:  max_prepared_transactions is now set to 200
master_1   | 2017-12-11 04:47:41.291 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
worker_1   | 2017-12-11 19:53:44.820 UTC [1] LOG:  received smart shutdown request
worker_1   | 2017-12-11 19:53:44.825 UTC [10639] FATAL:  terminating connection due to administrator command
worker_1   | 2017-12-11 19:53:44.825 UTC [10639] CONTEXT:  Citus maintenance daemon for database 12994 user 10
worker_1   | 2017-12-11 19:53:44.826 UTC [1] LOG:  worker process: logical replication launcher (PID 31) exited with exit code 1
master_1   | 2017-12-11 04:47:41.291 UTC [1] LOG:  listening on IPv6 address "::", port 5432
worker_1   | 2017-12-11 19:53:44.827 UTC [1] LOG:  worker process: Citus Maintenance Daemon: 12994/10 (PID 10639) exited with exit code 1
worker_1   | 2017-12-11 19:53:44.827 UTC [25] LOG:  shutting down
master_1   | 2017-12-11 04:47:41.294 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
master_1   | 2017-12-11 04:47:41.314 UTC [70] LOG:  database system was shut down at 2017-12-11 04:47:41 UTC
master_1   | 2017-12-11 04:47:41.326 UTC [1] LOG:  database system is ready to accept connections
master_1   | 2017-12-11 04:47:49.253 UTC [101] LOG:  starting maintenance daemon on database 12994 user 10
master_1   | 2017-12-11 04:47:49.253 UTC [101] CONTEXT:  Citus maintenance daemon for database 12994 user 10
master_1   | 2017-12-11 18:00:48.522 UTC [1] WARNING:  citus.enable_deadlock_prevention is deprecated and it has no effect. The flag will be removed in the next release.
master_1   | 2017-12-11 18:00:48.525 UTC [1] LOG:  number of prepared transactions has not been configured, overriding
master_1   | 2017-12-11 18:00:48.525 UTC [1] DETAIL:  max_prepared_transactions is now set to 200
master_1   | 2017-12-11 18:00:48.525 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
master_1   | 2017-12-11 18:00:48.525 UTC [1] LOG:  listening on IPv6 address "::", port 5432
worker_1   | 2017-12-11 19:53:44.854 UTC [1] LOG:  database system is shut down
worker_1   | 2017-12-11 20:04:37.265 UTC [1] WARNING:  citus.enable_deadlock_prevention is deprecated and it has no effect. The flag will be removed in the next release.
master_1   | 2017-12-11 18:00:48.529 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
master_1   | 2017-12-11 18:00:48.549 UTC [23] LOG:  database system was interrupted; last known up at 2017-12-11 17:33:03 UTC
master_1   | 2017-12-11 18:00:48.857 UTC [23] LOG:  database system was not properly shut down; automatic recovery in progress
master_1   | 2017-12-11 18:00:48.868 UTC [23] LOG:  redo starts at 0/17D9D38
master_1   | 2017-12-11 18:00:48.868 UTC [23] LOG:  invalid record length at 0/17D9E18: wanted 24, got 0
worker_1   | 2017-12-11 20:04:37.269 UTC [1] LOG:  number of prepared transactions has not been configured, overriding
worker_1   | 2017-12-11 20:04:37.269 UTC [1] DETAIL:  max_prepared_transactions is now set to 200
worker_1   | 2017-12-11 20:04:37.269 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
master_1   | 2017-12-11 18:00:48.868 UTC [23] LOG:  redo done at 0/17D9DE0
master_1   | 2017-12-11 18:00:48.898 UTC [1] LOG:  database system is ready to accept connections
master_1   | 2017-12-11 18:00:58.314 UTC [53] LOG:  starting maintenance daemon on database 12994 user 10
master_1   | 2017-12-11 18:00:58.314 UTC [53] CONTEXT:  Citus maintenance daemon for database 12994 user 10
master_1   | 2017-12-11 18:19:48.622 UTC [3019] FATAL:  role "root" does not exist
master_1   | 2017-12-11 19:41:42.747 UTC [15816] LOG:  incomplete startup packet
master_1   | 2017-12-11 19:44:40.192 UTC [16288] LOG:  incomplete startup packet
master_1   | 2017-12-11 19:47:14.874 UTC [16692] LOG:  incomplete startup packet
worker_1   | 2017-12-11 20:04:37.269 UTC [1] LOG:  listening on IPv6 address "::", port 5432
master_1   | 2017-12-11 19:53:45.429 UTC [1] LOG:  received smart shutdown request
worker_1   | 2017-12-11 20:04:37.274 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
worker_1   | 2017-12-11 20:04:37.293 UTC [24] LOG:  database system was shut down at 2017-12-11 19:53:44 UTC
worker_1   | 2017-12-11 20:04:37.312 UTC [1] LOG:  database system is ready to accept connections
master_1   | 2017-12-11 19:53:45.433 UTC [53] FATAL:  terminating connection due to administrator command
master_1   | 2017-12-11 19:53:45.433 UTC [53] CONTEXT:  Citus maintenance daemon for database 12994 user 10
master_1   | 2017-12-11 19:53:45.434 UTC [1] LOG:  worker process: logical replication launcher (PID 30) exited with exit code 1
master_1   | 2017-12-11 19:53:45.435 UTC [1] LOG:  worker process: Citus Maintenance Daemon: 12994/10 (PID 53) exited with exit code 1
master_1   | 2017-12-11 19:53:45.435 UTC [24] LOG:  shutting down
master_1   | 2017-12-11 19:53:45.475 UTC [1] LOG:  database system is shut down
master_1   | 2017-12-11 20:04:31.283 UTC [1] WARNING:  citus.enable_deadlock_prevention is deprecated and it has no effect. The flag will be removed in the next release.
master_1   | 2017-12-11 20:04:31.287 UTC [1] LOG:  number of prepared transactions has not been configured, overriding
master_1   | 2017-12-11 20:04:31.287 UTC [1] DETAIL:  max_prepared_transactions is now set to 200
master_1   | 2017-12-11 20:04:31.287 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
master_1   | 2017-12-11 20:04:31.287 UTC [1] LOG:  listening on IPv6 address "::", port 5432
master_1   | 2017-12-11 20:04:31.301 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
master_1   | 2017-12-11 20:04:31.320 UTC [24] LOG:  database system was shut down at 2017-12-11 19:53:45 UTC
master_1   | 2017-12-11 20:04:31.387 UTC [1] LOG:  database system is ready to accept connections
master_1   | 2017-12-11 20:04:41.344 UTC [54] LOG:  starting maintenance daemon on database 12994 user 10
master_1   | 2017-12-11 20:04:41.344 UTC [54] CONTEXT:  Citus maintenance daemon for database 12994 user 10
master_1   | 2017-12-11 20:05:41.622 UTC [54] WARNING:  could not find any shard placements for shardId 102008
master_1   | 2017-12-11 20:05:41.622 UTC [54] CONTEXT:  Citus maintenance daemon for database 12994 user 10
master_1   | 2017-12-11 20:05:41.622 UTC [54] WARNING:  could not find any shard placements for shardId 102040
master_1   | 2017-12-11 20:05:41.622 UTC [54] CONTEXT:  Citus maintenance daemon for database 12994 user 10
master_1   | 2017-12-11 20:05:41.622 UTC [54] WARNING:  could not find any shard placements for shardId 102072
master_1   | 2017-12-11 20:05:41.622 UTC [54] CONTEXT:  Citus maintenance daemon for database 12994 user 10
master_1   | 2017-12-11 20:06:35.982 UTC [325] WARNING:  could not find any shard placements for shardId 102072
master_1   | 2017-12-11 20:06:35.982 UTC [325] ERROR:  could not find length of shard 102072
master_1   | 2017-12-11 20:06:35.982 UTC [325] DETAIL:  Could not find any shard placements for the shard.
master_1   | 2017-12-11 20:06:35.982 UTC [325] STATEMENT:  SELECT principal FROM principals WHERE user_id = 'newyu:www'
master_1   | 2017-12-11 20:06:35.993 UTC [325] WARNING:  could not find any shard placements for shardId 102008
master_1   | 2017-12-11 20:06:35.993 UTC [325] ERROR:  cannot perform distributed planning for the given query
master_1   | 2017-12-11 20:06:35.993 UTC [325] DETAIL:  Select query cannot be pushed down to the worker.



sentient avatar Dec 11 '17 20:12 sentient

I spent some time looking at this, and it appears our membership-manager might be the culprit: https://github.com/citusdata/membership-manager/blob/master/manager.py#L31. A stop seems to trigger the remove_node() method, which deletes data from pg_dist_placement and calls master_remove_node. As a result, the tables and their contents in pg_dist_partition and pg_dist_shard remain, but nothing is left in pg_dist_placement. I'll open an issue in the membership-manager repo to fix this.
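
For anyone hitting this, a quick way to confirm that it is the placements (and not the shard definitions) that went missing is to compare the two catalogs on the coordinator. This is only a diagnostic sketch built around the tables named above:

-- Run on the coordinator: list shards that still have metadata in pg_dist_shard
-- but no row left in pg_dist_placement (i.e. Citus no longer knows which worker holds them)
SELECT s.logicalrelid::regclass AS table_name, s.shardid
FROM pg_dist_shard s
LEFT JOIN pg_dist_placement p ON p.shardid = s.shardid
WHERE p.shardid IS NULL
ORDER BY s.shardid;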

sumedhpathak avatar Dec 12 '17 21:12 sumedhpathak

Any update on this issue? Could you share the issue opened in the manager repo so we can follow its resolution? Thank you very much in advance.

TheTechOddBug avatar Feb 13 '18 14:02 TheTechOddBug

I have the same issue in Docker for Windows.

I've tried to set a local path as a volume in the manager service:

  manager:
    container_name: "${COMPOSE_PROJECT_NAME:-citus}_manager"
    image: 'citusdata/membership-manager:0.2.0'
    volumes: 
      - C:\docker\volumes\citus_manager_volume:/var/run/docker.sock
    depends_on: 
      master:
        condition: service_healthy

But when I execute docker-compose up -d, I get the following error:

citus_master is up-to-date
Recreating citus_manager ... done

ERROR: for worker  Container "e974ae128ef7" is unhealthy.
ERROR: Encountered errors while bringing up the project.

This is the log:

connected to master
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 601, in urlopen
    chunked=chunked)
  File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 357, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/local/lib/python3.6/http/client.py", line 1239, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/local/lib/python3.6/http/client.py", line 1285, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.6/http/client.py", line 1234, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.6/http/client.py", line 1026, in _send_output
    self.send(msg)
  File "/usr/local/lib/python3.6/http/client.py", line 964, in send
    self.connect()
  File "/usr/local/lib/python3.6/site-packages/docker/transport/unixconn.py", line 33, in connect
    sock.connect(self.unix_socket)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/requests/adapters.py", line 440, in send
    timeout=timeout
  File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 639, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/usr/local/lib/python3.6/site-packages/urllib3/util/retry.py", line 357, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/usr/local/lib/python3.6/site-packages/urllib3/packages/six.py", line 685, in reraise
    raise value.with_traceback(tb)
  File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 601, in urlopen
    chunked=chunked)
  File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 357, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/local/lib/python3.6/http/client.py", line 1239, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/local/lib/python3.6/http/client.py", line 1285, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.6/http/client.py", line 1234, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.6/http/client.py", line 1026, in _send_output
    self.send(msg)
  File "/usr/local/lib/python3.6/http/client.py", line 964, in send
    self.connect()
  File "/usr/local/lib/python3.6/site-packages/docker/transport/unixconn.py", line 33, in connect
    sock.connect(self.unix_socket)
urllib3.exceptions.ProtocolError: ('Connection aborted.', ConnectionRefusedError(111, 'Connection refused'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./manager.py", line 97, in <module>
    main()
  File "./manager.py", line 93, in main
    docker_checker()
  File "./manager.py", line 64, in docker_checker
    this_container = client.containers.get(my_hostname)
  File "/usr/local/lib/python3.6/site-packages/docker/models/containers.py", line 778, in get
    resp = self.client.api.inspect_container(container_id)
  File "/usr/local/lib/python3.6/site-packages/docker/utils/decorators.py", line 19, in wrapped
    return f(self, resource_id, *args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/docker/api/container.py", line 756, in inspect_container
    self._get(self._url("/containers/{0}/json", container)), True
  File "/usr/local/lib/python3.6/site-packages/docker/utils/decorators.py", line 46, in inner
    return f(self, *args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/docker/api/client.py", line 189, in _get
    return self.get(url, **self._set_request_timeout(kwargs))
  File "/usr/local/lib/python3.6/site-packages/requests/sessions.py", line 521, in get
    return self.request('GET', url, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/requests/sessions.py", line 508, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.6/site-packages/requests/sessions.py", line 618, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/requests/adapters.py", line 490, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionRefusedError(111, 'Connection refused'))

I also tried to set a volume on the worker and master services, but after running docker-compose down, the data is missing.

NOTE:

  • Docker version 17.12.0-ce, build c97c6d6
  • docker-compose version 1.18.0, build 8dd22a96

oscuroweb avatar Feb 13 '18 15:02 oscuroweb

I am also running into this issue on Linux and macOS.

jack0 avatar Mar 22 '18 22:03 jack0

+1. I've had a lot of issues with this on OSX.

firefox0102 avatar Apr 16 '18 18:04 firefox0102

I got this to work, but it's rather hacky... hopefully it will help someone else.

My primary goal is to be able to save the Citus container data so we don't need to re-run the pre-db init scripts that install various extensions, as well as the migrations. Here are the steps (assuming only one worker):

1. copy entire pg_dist_placement table to another table
2. stop the worker using pg_ctl stop -m fast
3. start the worker again
4. stop the worker using pg_ctl stop -m smart
5. docker commit the worker volume
6. use docker-compose up to scale the worker and the citus manager to 0 replicas
7. stop the citus coordinator using pg_ctl stop -m smart
8. docker commit coordinator volume

startup:
1. docker-compose up specifying the saved image volumes for both the worker and coordinator
2. copy over the pg_dist_placement data from the backup
3.  potentially update the pg_dist_placement groupid if the new worker id is different

Steps 2-4 are necessary because if you just do pg_ctl stop -m fast, the worker container ends up in a badly terminated state; if you save the resulting volume and try to load it, the worker has to run startup checks to verify data integrity. On the other hand, if you try pg_ctl stop -m smart on the worker initially, the worker never seems to terminate.
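
A rough SQL sketch of step 1 and the startup-time restore (startup steps 2-3); the group ids below are purely illustrative, check pg_dist_node for the real values on your cluster:

-- Step 1, on the coordinator before shutting down: keep a copy of the placements
CREATE TABLE pg_dist_placement_backup AS TABLE pg_dist_placement;

-- Startup step 2, once the cluster is back up and pg_dist_placement has been wiped:
INSERT INTO pg_dist_placement SELECT * FROM pg_dist_placement_backup;

-- Startup step 3, only if the re-added worker was assigned a different group id
UPDATE pg_dist_placement SET groupid = 2 WHERE groupid = 1;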

jack0 avatar Apr 16 '18 19:04 jack0

This is probably the root cause: https://github.com/citusdata/membership-manager/issues/4 . I've created a branch to address this and will update once it's merged.

colton-citus avatar Sep 13 '18 23:09 colton-citus

Hi,

I am facing a similar issue on Kubernetes, where I am running a 2-node cluster plus one coordinator node. When I scaled my worker and master StatefulSets down to zero and then back up to two, the cluster threw an error stating that the tbl_nme_10284 shard is missing.

Could you please confirm how we can resolve this issue, as it leads to complete data loss?

Thanks

abhi7788 avatar Apr 14 '20 19:04 abhi7788

Thanks @hanefi for redirecting me to this conversation. I have already posted my issue on this thread, and I will add my logs here too so that it is easier to debug. Also, when you mention that the Citus docker image is missing the data persistence feature, do you really mean that it does not store data between restarts? I have attached a PVC to the Citus workers and master and mounted the data directory on the PVC, so my assumption was that it can persist the data. Let me know if my understanding is wrong.

Logs of my issue on K8s:

2020-04-16 12:27:28.728 UTC [46] LOG: configuration file "/var/lib/postgresql/data/pgdata/postgresql.conf" contains errors; unaffected changes were applied
CREATE EXTENSION
2020-04-16 12:27:28.870 UTC [65] LOG: starting maintenance daemon on database 13408 user 10
2020-04-16 12:27:28.870 UTC [65] CONTEXT: Citus maintenance daemon for database 13408 user 10
UPDATE 1
COMMIT
2020-04-16 12:27:28.879 UTC [46] LOG: received fast shutdown request
waiting for server to shut down....2020-04-16 12:27:28.880 UTC [46] LOG: aborting any active transactions
2020-04-16 12:27:28.880 UTC [65] FATAL: terminating connection due to administrator command
2020-04-16 12:27:28.880 UTC [65] CONTEXT: Citus maintenance daemon for database 13408 user 10
2020-04-16 12:27:28.882 UTC [46] LOG: background worker "logical replication launcher" (PID 54) exited with exit code 1
2020-04-16 12:27:28.882 UTC [46] LOG: background worker "Citus Maintenance Daemon: 13408/10" (PID 65) exited with exit code 1
2020-04-16 12:27:28.882 UTC [48] LOG: shutting down
2020-04-16 12:27:28.903 UTC [46] LOG: database system is shut down
done
server stopped
PostgreSQL init process complete; ready for start up.
2020-04-16 12:27:29.013 UTC [1] LOG: number of prepared transactions has not been configured, overriding
2020-04-16 12:27:29.013 UTC [1] DETAIL: max_prepared_transactions is now set to 200
2020-04-16 12:27:29.013 UTC [1] LOG: starting PostgreSQL 12.2 (Debian 12.2-2.pgdg100+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit
2020-04-16 12:27:29.014 UTC [1] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2020-04-16 12:27:29.027 UTC [67] LOG: database system was shut down at 2020-04-16 12:27:28 UTC
2020-04-16 12:27:29.033 UTC [1] LOG: database system is ready to accept connections

abhi7788 avatar Apr 16 '20 16:04 abhi7788

I do not have enough experience around K8s, but the main issue is this: after a docker destroy event, we run master_remove_node, which basically removes all the sharding metadata associated with that worker. So if you downscale and then upscale, you may end up losing access to some of your shards.

You may need some manual management of adding/removing/activating/deactivating workers, and the functionality in membership-manager may not be enough for your needs.

See https://github.com/citusdata/membership-manager/blob/6514b45844c11e94f2408f137ab2b9b28b4d7eed/manager.py#57
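
For reference, the manual management mentioned above boils down to the node management functions run on the coordinator; a minimal sketch (hostname and port are examples only):

-- Register a worker with the coordinator
SELECT master_add_node('citus_worker_1', 5432);

-- Take a worker out of rotation without deleting its shard placement metadata
SELECT master_disable_node('citus_worker_1', 5432);

-- Put it back into rotation
SELECT master_activate_node('citus_worker_1', 5432);

-- Remove a worker entirely; this is what the membership manager calls on a destroy
-- event, and it is what drops the placement metadata discussed in this issue
SELECT master_remove_node('citus_worker_1', 5432);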

hanefi avatar Apr 17 '20 13:04 hanefi

Thanks @hanefi for your reply. I will look into this to see if there is a way I can customize and fix it. But just to reconfirm: is this a known bug with the Citus docker setup, and if so, is there no fix available yet? I am a little surprised, as we were hoping to use Citus on Kubernetes for some use cases, but it seems unrealistic for a database to be unable to persist data at all between restarts. I am not sure if I am missing something here from the setup perspective.

abhi7788 avatar Apr 17 '20 18:04 abhi7788

# This file is auto generated from it's template,
# see citusdata/tools/packaging_automation/templates/docker/latest/docker-compose.tmpl.yml.
version: "3"

services:
  master:
    image: "citusdata/citus:10.2.3"
    ports: ["${COORDINATOR_EXTERNAL_PORT:-5432}:5432"]
    labels: ["com.citusdata.role=Master"]
    environment: &AUTH
      POSTGRES_USER: "${POSTGRES_USER:-postgres}"
      POSTGRES_PASSWORD: "${POSTGRES_PASSWORD}"
      PGUSER: "${POSTGRES_USER:-postgres}"
      PGPASSWORD: "${POSTGRES_PASSWORD}"
      POSTGRES_HOST_AUTH_METHOD: "${POSTGRES_HOST_AUTH_METHOD:-trust}"
  manager:
    container_name: "${COMPOSE_PROJECT_NAME:-citus}_manager"
    image: "citusdata/membership-manager:0.3.0"
    volumes:
      - "${DOCKER_SOCK:-/var/run/docker.sock}:/var/run/docker.sock"
      - healthcheck-volume:/healthcheck
    depends_on: [master]
    environment: *AUTH
  worker:
    image: "citusdata/citus:10.2.3"
    labels: ["com.citusdata.role=Worker"]
    depends_on: [manager]
    environment: *AUTH
    command: "/wait-for-manager.sh"
    volumes:
      - healthcheck-volume:/healthcheck
volumes:
  healthcheck-volume:

Thank you for citus!

I am very interested in an HA psql/Citus cluster. I am new to it and don't have a clear picture yet of the HA aspect. Any links and/or input are very much welcome.

This is what I did to try to accomplish or test out HA:

  1. docker-compose up -d
  2. scale the workers to 5 in total
  3. import data -> ok
  4. query data -> ok
  5. scale the workers to 0 in total
  6. query data -> ok
  7. scale the workers to 5 in total
  8. query data -> ok
  9. recreate the master
  10. query data -> nok -> all data gone
  11. add the worker nodes
  12. query data -> nok

As I understood from the documentation, no data will or should be stored on the master, but if the master gets recreated, all data is gone. There is no volume mounted to the master in the example above, so recreating it simulates a VM node crash.
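
For what it's worth, the "data" the coordinator loses when it is recreated without a volume is its metadata (which tables are distributed and where their shards live); without that metadata the cluster cannot find the shards even if the worker containers still hold them. A minimal sketch of what would need to be persisted, using the standard Citus metadata catalogs (run on the coordinator):

-- All of this lives in the coordinator's data directory, so it disappears when the
-- master container is recreated without a mounted volume.
SELECT * FROM pg_dist_node;        -- which workers belong to the cluster
SELECT * FROM pg_dist_partition;   -- which tables are distributed
SELECT * FROM pg_dist_shard;       -- the shards of each distributed table
SELECT * FROM pg_dist_placement;   -- which worker group holds each shard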

4F2E4A2E avatar Jan 15 '22 11:01 4F2E4A2E