Synchronous replication support is very limited
Name and Version
bitnami/postgresql-repmgr:11.15.0-debian-10-r65
What steps will reproduce the bug?
1. :information_source: Deployed using the bitnami/postgresql-ha Helm chart, version 8.6.13.
2. :information_source: Nothing special in the config; the important part is postgresql.syncReplication=true.
3. :heavy_check_mark: A fresh installation works; synchronous replication is set up on the primary (so far so good).
4. :heavy_check_mark: On the primary, postgresql.conf contains:
   synchronous_commit = 'on'
   synchronous_standby_names = '2 ("postgresql-ha-postgresql-0","postgresql-ha-postgresql-1","postgresql-ha-postgresql-2")'
5. :heavy_check_mark: pg_stat_replication also shows sync for both deployed replicas (see the query sketch after this list).
6. :heavy_check_mark: Delete the primary pod. A new primary is elected, and the remaining standby now follows the newly promoted primary.
7. :red_circle: Synchronous replication is gone. The freshly promoted primary (former replica) is not aware of synchronous replication:
   #synchronous_commit = on                # synchronization level
   #synchronous_standby_names = ''         # standby servers that provide sync rep
8. :information_source: When the deleted pod is recreated, it joins the cluster as a new replica.
9. :red_circle: There is no remaining trace that synchronous replication was ever turned on.
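For reference, this is roughly how the state in steps 4-5 can be inspected. It is only a sketch: the pod name is an example, and psql will prompt for the postgres password.

    # Sketch only: run against whichever pod is currently the primary.
    kubectl exec -it postgresql-ha-postgresql-0 -- \
      psql -U postgres -c "SELECT application_name, state, sync_state FROM pg_stat_replication;"

    # Shows the value the server is currently using for synchronous replication.
    kubectl exec -it postgresql-ha-postgresql-0 -- \
      psql -U postgres -c "SHOW synchronous_standby_names;"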
What is the expected behavior?
- postgresql_configure_synchronous_replication must also run on replicas, so the configuration is already prepared in case of promotion to primary (a rough sketch follows this list).
- Allow configuring synchronous replication even if the data directory already exists (i.e. when POSTGRESQL_FIRST_BOOT=no).
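To illustrate the request, here is a purely hypothetical sketch, not the actual Bitnami entrypoint code; apart from postgresql_configure_synchronous_replication and POSTGRESQL_FIRST_BOOT, which are quoted from above, the names are made up.

    # Hypothetical sketch only -- not the real container script logic.
    # Idea: write the synchronous replication settings on every node and on every
    # boot, so a standby that gets promoted keeps them and the settings survive
    # restarts even when POSTGRESQL_FIRST_BOOT=no.
    postgresql_configure_synchronous_replication() {
        local conf="${PGDATA}/postgresql.conf"
        # SYNC_REPLICAS and STANDBY_NAMES are placeholders for illustration.
        if [[ "${SYNC_REPLICAS:-0}" -gt 0 ]]; then
            # Drop any previous values, then append the current ones.
            sed -i '/^synchronous_commit/d;/^synchronous_standby_names/d' "$conf"
            echo "synchronous_commit = 'on'" >>"$conf"
            echo "synchronous_standby_names = '${SYNC_REPLICAS} (${STANDBY_NAMES})'" >>"$conf"
        fi
    }

    # Called unconditionally: on primaries, on replicas, and regardless of whether
    # the data directory already exists.
    postgresql_configure_synchronous_replication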
What do you see instead?
As described above: when a new primary is elected, replication is set to async. When the original primary pod is recreated, replication is set to async and cannot be changed back to synchronous.
Additional information
One more thing that comes to my mind: the current configuration is very strict; the loss of a single replica makes all ongoing transactions hang and wait until all replicas are back online. This behavior may not be desired in rapidly changing environments like Kubernetes, where pods may fail, be evicted, and so on. It is possible to relax the synchronous_standby_names value by lowering the number of replicas that need to confirm a transaction, but the Helm chart hard-codes this value to .Values.postgresql.replicaCount - 1. Also, it is impossible to choose between FIRST and ANY.
Reference: postgresql.org
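For comparison, a relaxed quorum is plain PostgreSQL syntax and could be applied by hand on the current primary. This is only a sketch; the node names are the chart's default pod names used above, and nothing like this is exposed by the chart today.

    # Sketch: wait for ANY one of the listed standbys instead of all of them.
    psql -U postgres -c "ALTER SYSTEM SET synchronous_standby_names = 'ANY 1 (\"postgresql-ha-postgresql-1\",\"postgresql-ha-postgresql-2\")';"
    # synchronous_standby_names is reloadable, so a config reload is enough.
    psql -U postgres -c "SELECT pg_reload_conf();"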
Hi @mouchar, would you like to send a PR changing the configuration to reflect what you described?
I can try to send a PR if I manage to comprehend the whole flow of these shell scripts. Are there any existing test suites I could use to be sure I didn't break anything else?
In the Helm chart repository, once the PR is created, we can add the verify label to check it. While developing, you will need to perform some manual tests. After the PR is merged, our pipelines properly test several cases and usages.
This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.
Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Issue. Do not hesitate to reopen it later if necessary.
I would just add that this is a very problematic scenario that leads to data loss in production.
Hi @donburgess, would you elaborate a bit more?
Yes. Currently, if a cluster is configured for synchronous mode and the master fails, then after the failover the cluster will permanently operate in asynchronous mode without intervention. Depending on the replication lag in the set, this can lead to unmitigated data loss during the next failover. From an administrator's point of view this could be considered catastrophic, given the belief that the data was better protected by synchronous-mode operation.
Do you have any suggestions on how to improve this?
I don't know the project at a low enough level to give very specific guidance. I suspect two areas could be the main problem. One is that when a replica takes over, it has no configuration incentive to list other nodes as synchronous streams, since it starts with a blank configuration. While I have not tested this scenario, I suspect another cause could be that nodes are added to the cluster as async replicas.
We are going to transfer this issue to bitnami/containers.
In order to unify the approaches followed in Bitnami containers and Bitnami charts, we are moving some issues in bitnami/bitnami-docker-<container> repositories to bitnami/containers.
Please follow bitnami/containers to stay updated about the latest Bitnami images.
More information here: https://blog.bitnami.com/2022/07/new-source-of-truth-bitnami-containers.html
Hi, sorry for the delay. I guess this is more of a PostgreSQL "issue", or an "issue" that can appear when running PostgreSQL in a container scenario, than something that could be fixed in this container image. Maybe reporting this in a PostgreSQL forum could bring some advice on whether something could be done in the image.
This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.
We have the same issue. Is there any information about a roadmap or workarounds?
I am creating an internal task to relax the settings mentioned in the Additional information section of the issue. We will come back as soon as we have news.
Any updates on this thread?
Hi,
Just a quick note to let you know that the latest version of our Helm chart for postgresql-ha includes a new postgresql.syncReplicationMode setting to configure the synchronous replication mode. You can get more information about this in both the README file and values.yaml.
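For example, an upgrade could look roughly like this; the release name, the bitnami repo alias, and the ANY value are only assumptions here, so please check the chart's README and values.yaml for the exact options:

    # Sketch: enable synchronous replication and choose a mode via the new setting.
    helm upgrade --install my-release bitnami/postgresql-ha \
      --set postgresql.syncReplication=true \
      --set postgresql.syncReplicationMode=ANY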
Hope it helps!
This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.
Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Issue. Do not hesitate to reopen it later if necessary.