When the master node starts again, postgresql cannot start normally
Description: The master database and the slave are deployed on different servers, and both synchronize data normally. When the primary node shuts down, the standby node takes over as expected. But when the original master node starts again, PostgreSQL cannot start normally.
Steps to reproduce the issue:
- Master db [A] and slave [B] are deployed on different servers. Both master and slave synchronize data normally.
- A shuts down; B switches to master db.
- A starts again; PostgreSQL cannot start.
Describe the results you received: log
pam-pgsql-0_1 | postgresql-repmgr 08:43:50.93 INFO ==> ** Starting PostgreSQL with Replication Manager setup **
pam-pgsql-0_1 | postgresql-repmgr 08:43:50.96 INFO ==> Validating settings in REPMGR_* env vars...
pam-pgsql-0_1 | postgresql-repmgr 08:43:50.97 INFO ==> Validating settings in POSTGRESQL_* env vars..
pam-pgsql-0_1 | postgresql-repmgr 08:43:50.98 INFO ==> Querying all partner nodes for common upstream node...
pam-pgsql-0_1 | postgresql-repmgr 08:43:51.05 INFO ==> Auto-detected primary node: '10.47.154.107:5432'
pam-pgsql-0_1 | postgresql-repmgr 08:43:51.06 INFO ==> Preparing PostgreSQL configuration...
pam-pgsql-0_1 | postgresql-repmgr 08:43:51.07 INFO ==> postgresql.conf file not detected. Generating it...
pam-pgsql-0_1 | postgresql-repmgr 08:43:51.13 INFO ==> Preparing repmgr configuration...
pam-pgsql-0_1 | postgresql-repmgr 08:43:51.13 INFO ==> Initializing Repmgr...
pam-pgsql-0_1 | postgresql-repmgr 08:43:51.14 INFO ==> Waiting for primary node...
pam-pgsql-0_1 | postgresql-repmgr 08:43:51.17 INFO ==> Cloning data from primary node...
pam-pgsql-0_1 | postgresql-repmgr 08:43:51.88 INFO ==> Initializing PostgreSQL database...
pam-pgsql-0_1 | postgresql-repmgr 08:43:51.89 INFO ==> Cleaning stale /bitnami/postgresql/data/standby.signal file
pam-pgsql-0_1 | postgresql-repmgr 08:43:51.90 INFO ==> Custom configuration /opt/bitnami/postgresql/conf/postgresql.conf detected
pam-pgsql-0_1 | postgresql-repmgr 08:43:51.90 INFO ==> Custom configuration /opt/bitnami/postgresql/conf/pg_hba.conf detected
pam-pgsql-0_1 | postgresql-repmgr 08:43:51.92 INFO ==> Deploying PostgreSQL with persisted data...
pam-pgsql-0_1 | postgresql-repmgr 08:43:51.94 INFO ==> Configuring replication parameters
pam-pgsql-0_1 | postgresql-repmgr 08:43:51.96 INFO ==> Configuring fsync
pam-pgsql-0_1 | postgresql-repmgr 08:43:51.99 INFO ==> Setting up streaming replication slave...
pam-pgsql-0_1 | postgresql-repmgr 08:43:52.02 INFO ==> Starting PostgreSQL in background...
pam-pgsql-0_1 | postgresql-repmgr 08:43:52.16 INFO ==> Unregistering standby node...
pam-pgsql-0_1 | postgresql-repmgr 08:43:52.28 INFO ==> Registering Standby node...
pam-pgsql-0_1 | postgresql-repmgr 08:43:52.33 INFO ==> Stopping PostgreSQL...
pam-pgsql-0_1 | postgresql-repmgr 08:43:53.35 INFO ==> ** PostgreSQL with Replication Manager setup finished! **
pam-pgsql-0_1 |
pam-pgsql-0_1 | postgresql-repmgr 08:43:53.38 INFO ==> Starting PostgreSQL in background...
pam-pgsql-0_1 | postgresql-repmgr 08:43:53.52 INFO ==> ** Starting repmgrd **
pam-pgsql-0_1 | [2020-10-17 08:43:53] [NOTICE] repmgrd (repmgrd 5.1.0) starting up
pam-pgsql-0_1 | [2020-10-17 08:43:53] [ERROR] PID file "/opt/bitnami/repmgr/tmp/repmgr.pid" exists and seems to contain a valid PID
pam-pgsql-0_1 | [2020-10-17 08:43:53] [HINT] if repmgrd is no longer alive, remove the file and restart repmgrd
pam-pgsql-0_1 | postgresql-repmgr 08:43:55.70
pam-pgsql-0_1 | postgresql-repmgr 08:43:55.70 Welcome to the Bitnami postgresql-repmgr container
pam-pgsql-0_1 | postgresql-repmgr 08:43:55.70 Subscribe to project updates by watching https://github.com/bitnami/bitnami-docker-postgresql-repmgr
pam-pgsql-0_1 | postgresql-repmgr 08:43:55.70 Submit issues and feature requests at https://github.com/bitnami/bitnami-docker-postgresql-repmgr/issues
pam-pgsql-0_1 | postgresql-repmgr 08:43:55.71
pam-pgsql-0_1 | postgresql-repmgr 08:43:55.72 INFO ==> ** Starting PostgreSQL with Replication Manager setup **
pam-pgsql-0_1 | postgresql-repmgr 08:43:55.75 INFO ==> Validating settings in REPMGR_* env vars...
pam-pgsql-0_1 | postgresql-repmgr 08:43:55.76 INFO ==> Validating settings in POSTGRESQL_* env vars..
pam-pgsql-0_1 | postgresql-repmgr 08:43:55.76 INFO ==> Querying all partner nodes for common upstream node...
pam-pgsql-0_1 | postgresql-repmgr 08:43:55.83 INFO ==> Auto-detected primary node: '10.47.154.107:5432'
pam-pgsql-0_1 | postgresql-repmgr 08:43:55.84 INFO ==> Preparing PostgreSQL configuration...
pam-pgsql-0_1 | postgresql-repmgr 08:43:55.84 INFO ==> postgresql.conf file not detected. Generating it...
pam-pgsql-0_1 | postgresql-repmgr 08:43:55.91 INFO ==> Preparing repmgr configuration...
pam-pgsql-0_1 | postgresql-repmgr 08:43:55.92 INFO ==> Initializing Repmgr...
pam-pgsql-0_1 | postgresql-repmgr 08:43:55.92 INFO ==> Waiting for primary node...
pam-pgsql-0_1 | postgresql-repmgr 08:43:55.96 INFO ==> Cloning data from primary node...
pam-pgsql-0_1 | postgresql-repmgr 08:43:56.68 INFO ==> Initializing PostgreSQL database...
pam-pgsql-0_1 | postgresql-repmgr 08:43:56.68 INFO ==> Cleaning stale /bitnami/postgresql/data/standby.signal file
pam-pgsql-0_1 | postgresql-repmgr 08:43:56.69 INFO ==> Custom configuration /opt/bitnami/postgresql/conf/postgresql.conf detected
pam-pgsql-0_1 | postgresql-repmgr 08:43:56.69 INFO ==> Custom configuration /opt/bitnami/postgresql/conf/pg_hba.conf detected
pam-pgsql-0_1 | postgresql-repmgr 08:43:56.71 INFO ==> Deploying PostgreSQL with persisted data...
pam-pgsql-0_1 | postgresql-repmgr 08:43:56.73 INFO ==> Configuring replication parameters
pam-pgsql-0_1 | postgresql-repmgr 08:43:56.75 INFO ==> Configuring fsync
pam-pgsql-0_1 | postgresql-repmgr 08:43:56.77 INFO ==> Setting up streaming replication slave...
pam-pgsql-0_1 | postgresql-repmgr 08:43:56.80 INFO ==> Starting PostgreSQL in background...
pam-pgsql-0_1 | postgresql-repmgr 08:43:56.92 INFO ==> Unregistering standby node...
pam-pgsql-0_1 | postgresql-repmgr 08:43:56.98 INFO ==> Registering Standby node...
pam-pgsql-0_1 | postgresql-repmgr 08:43:57.03 INFO ==> Stopping PostgreSQL...
pam-pgsql-0_1 | postgresql-repmgr 08:43:58.05 INFO ==> ** PostgreSQL with Replication Manager setup finished! **
pam-pgsql-0_1 |
pam-pgsql-0_1 | postgresql-repmgr 08:43:58.07 INFO ==> Starting PostgreSQL in background...
pam-pgsql-0_1 | postgresql-repmgr 08:43:58.20 INFO ==> ** Starting repmgrd **
pam-pgsql-0_1 | [2020-10-17 08:43:58] [NOTICE] repmgrd (repmgrd 5.1.0) starting up
pam-pgsql-0_1 | [2020-10-17 08:43:58] [ERROR] PID file "/opt/bitnami/repmgr/tmp/repmgr.pid" exists and seems to contain a valid PID
pam-pgsql-0_1 | [2020-10-17 08:43:58] [HINT] if repmgrd is no longer alive, remove the file and restart repmgrd
pam-pgsql-0_1 | postgresql-repmgr 08:44:01.82
pam-pgsql-0_1 | postgresql-repmgr 08:44:01.82 Welcome to the Bitnami postgresql-repmgr container
Describe the results you expected: A starts again, postgresql starts and reconnects as a standby node.
Additional information you deem important (e.g. issue happens only occasionally):
When I remove /opt/bitnami/repmgr/tmp/repmgr.pid, everything is back to normal.
Hi,
This is weird
pam-pgsql-0_1 | [2020-10-17 08:43:53] [ERROR] PID file "/opt/bitnami/repmgr/tmp/repmgr.pid" exists and seems to contain a valid PID
We have dealt with race condition issues in the past, but this one does not seem to be related. Pinging @rafariossaa, as he has dealt with startup issues before.
Changing
- REPMGR_PARTNER_NODES=10.47.154.106,10.47.154.107
to
- REPMGR_PARTNER_NODES=10.47.154.106,10.47.154.107:5432
solved my problem.
Good to hear, but this cannot be a solution. The example docker-compose.yml is misleading. You are free to specify a port, but it is not mandatory to add the port to the last host entry:
- REPMGR_PARTNER_NODES=pg-0,pg-1:5432
The entries here are evaluated in librepmgr.sh; if no port is specified for a host, the default port is added for that host:
port="${port:-$REPMGR_PRIMARY_PORT}"
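The defaulting described above can be sketched roughly as follows. This is a minimal illustration, not the actual librepmgr.sh code: the function names resolve_partner_node and resolve_partner_nodes are hypothetical, and the fallback value 5432 for REPMGR_PRIMARY_PORT is an assumption.

```shell
#!/bin/sh
# Hypothetical sketch: give every REPMGR_PARTNER_NODES entry an explicit port.
# The fallback of 5432 is an assumption for illustration.
REPMGR_PRIMARY_PORT="${REPMGR_PRIMARY_PORT:-5432}"

# Hypothetical helper (not from librepmgr.sh): print "host:port" for one entry.
resolve_partner_node() {
    entry="$1"
    host="${entry%%:*}"
    case "$entry" in
        *:*) port="${entry#*:}" ;;   # a port was given, keep it
        *)   port="" ;;              # no port given
    esac
    # Same defaulting expression as quoted from librepmgr.sh above.
    port="${port:-$REPMGR_PRIMARY_PORT}"
    printf '%s:%s\n' "$host" "$port"
}

# Hypothetical helper: normalize a comma-separated partner list.
resolve_partner_nodes() {
    old_ifs="$IFS"
    IFS=','
    result=""
    for entry in $1; do
        resolved="$(resolve_partner_node "$entry")"
        result="${result:+$result,}$resolved"
    done
    IFS="$old_ifs"
    printf '%s\n' "$result"
}
```

Under these assumptions, `pg-0,pg-1:5432` and `pg-0:5432,pg-1:5432` normalize to the same list, which is why adding the port to the last entry should not be needed.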
A solution is still required to handle old repmgr PID files.
Since repmgrd 4.1 the parameter --pid-file is not required anymore: repmgrd-daemon.html. But this image here still uses it in run.sh:
readonly repmgr_flags=("--pid-file=$REPMGR_PID_FILE" "-f" "$REPMGR_CONF_FILE" "--daemonize=false")
Starting repmgrd with --pid-file pointing at a PID file left over from a previous run of the docker container forces repmgrd to simply exit 3. It does not delete the PID file (see repmgrd.c).
Approaches can be:
- check for an old PID file and remove it. This is already implemented for the psql pid file.
- remove the --pid-file parameter and let repmgr try to manage the pid file on its own. An alternative can be to use --no-pid-file in this docker environment?
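The first approach could look roughly like the sketch below. It is only an illustration under stated assumptions, not the image's actual fix: the function names pid_file_is_live and remove_stale_pid_file are hypothetical, and the default path is the one taken from the error message in the log above.

```shell
#!/bin/sh
# Hypothetical sketch: remove a repmgrd PID file when no live process owns it.
# The default path is taken from the error message in the log above.
REPMGR_PID_FILE="${REPMGR_PID_FILE:-/opt/bitnami/repmgr/tmp/repmgr.pid}"

# Hypothetical helper: return 0 if the PID file names a process we can signal.
pid_file_is_live() {
    pid_file="$1"
    [ -f "$pid_file" ] || return 1
    pid="$(cat "$pid_file" 2>/dev/null)"
    # Treat empty or non-numeric content as "not live".
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;
    esac
    # kill -0 sends no signal; it only checks that the process can be signalled.
    kill -0 "$pid" 2>/dev/null
}

# Hypothetical helper: delete the PID file only if it looks stale.
remove_stale_pid_file() {
    pid_file="$1"
    if [ -f "$pid_file" ] && ! pid_file_is_live "$pid_file"; then
        rm -f "$pid_file"
    fi
}
```

A startup script would call `remove_stale_pid_file "$REPMGR_PID_FILE"` before launching repmgrd. One caveat: after a container restart, PIDs are reused, so a leftover PID may happen to match an unrelated live process and the file would be kept. If the check runs in the entrypoint before repmgrd is started, removing the file unconditionally may be the simpler choice.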
But why does repmgr not remove the PID file in the first place, if the system is rebooted?
https://github.com/EnterpriseDB/repmgr/issues/517#issuecomment-459602213
Hi @reduakt,
Thanks for your feedback.
I have created a task so the team can review it. Unfortunately, I cannot provide you with an ETA.
Thanks for reporting this issue. Would you like to contribute by creating a PR to solve the issue? The Bitnami team will be happy to review it and provide feedback. Here you can find the contributing guidelines.
We are going to transfer this issue to bitnami/containers
In order to unify the approaches followed in Bitnami containers and Bitnami charts, we are moving some issues in bitnami/bitnami-docker-<container> repositories to bitnami/containers.
Please follow bitnami/containers to stay updated about the latest Bitnami images.
More information here: https://blog.bitnami.com/2022/07/new-source-of-truth-bitnami-containers.html
Hi, a new release removing the PID file has just been made. Could you give it a try?
I am closing this issue. If you find further errors, please don't hesitate to reopen it.