postgres-operator
FATAL: password authentication failed for user "standby"
Please answer some short questions which should help us understand your problem / question better.
- Which image of the operator are you using?
- v1.8.2
- Where do you run it - cloud or metal? Kubernetes or OpenShift? [AWS K8s | GCP ... | Bare Metal K8s]
- Kubernetes
- Are you running Postgres Operator in production? [yes | no]
- no
- Type of issue? [Bug report, question, feature request, etc.]
- Question

I want to set up a standby cluster, so I used the demo manifests. First, I created the source cluster:

kubectl create -f manifests/minimal-postgres-manifest.yaml

Then I created the standby cluster:

kubectl create -f manifests/standby-manifest.yaml

The standby fails with an error, but I don't understand what happened or how to fix it. The log looks like this:
2022-11-10 13:37:57,484 INFO: Selected new K8s API server endpoint https://172.16.3.44:6443
2022-11-10 13:37:57,544 INFO: No PostgreSQL configuration items changed, nothing to reload.
2022-11-10 13:37:57,551 INFO: Lock owner: None; I am acid-standby-cluster-0
2022-11-10 13:37:57,680 INFO: trying to bootstrap a new standby leader
pg_basebackup: error: connection to server at "acid-minimal-cluster.default" (192.168.233.209), port 5432 failed: FATAL: password authentication failed for user "standby"
password retrieved from file "/run/postgresql/pgpass"
connection to server at "acid-minimal-cluster.default" (192.168.233.209), port 5432 failed: FATAL: no pg_hba.conf entry for replication connection from host "192.168.234.99", user "standby", no encryption
pg_basebackup: error: connection to server at "acid-minimal-cluster.default" (192.168.233.209), port 5432 failed: FATAL: password authentication failed for user "standby"
password retrieved from file "/run/postgresql/pgpass"
connection to server at "acid-minimal-cluster.default" (192.168.233.209), port 5432 failed: FATAL: no pg_hba.conf entry for replication connection from host "192.168.234.99", user "standby", no encryption
2022-11-10 13:38:08,070 INFO: Lock owner: None; I am acid-standby-cluster-0
2022-11-10 13:38:08,070 INFO: not healthy enough for leader race
2022-11-10 13:38:08,108 INFO: bootstrap_standby_leader in progress
2022-11-10 13:38:18,063 INFO: Lock owner: None; I am acid-standby-cluster-0
2022-11-10 13:38:18,063 INFO: not healthy enough for leader race
2022-11-10 13:38:18,064 INFO: bootstrap_standby_leader in progress
pg_basebackup: error: connection to server at "acid-minimal-cluster.default" (192.168.233.209), port 5432 failed: FATAL: password authentication failed for user "standby"
password retrieved from file "/run/postgresql/pgpass"
connection to server at "acid-minimal-cluster.default" (192.168.233.209), port 5432 failed: FATAL: no pg_hba.conf entry for replication connection from host "192.168.234.99", user "standby", no encryption
2022-11-10 13:38:28,052 ERROR: Error creating replica using method basebackup_fast_xlog: /scripts/basebackup.sh exited with code=1
2022-11-10 13:38:28,052 ERROR: failed to bootstrap clone from remote master postgresql://acid-minimal-cluster.default:5432
2022-11-10 13:38:28,053 INFO: Removing data directory: /home/postgres/pgdata/pgroot/data
2022-11-10 13:38:28,065 INFO: Lock owner: None; I am acid-standby-cluster-0
2022-11-10 13:38:28,065 INFO: not healthy enough for leader race
2022-11-10 13:38:28,143 INFO: bootstrap_standby_leader in progress
2022-11-10 13:38:38,073 INFO: removing initialize key after failed attempt to bootstrap the cluster
Traceback (most recent call last):
  File "/usr/local/bin/patroni", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/patroni/__main__.py", line 143, in main
    return patroni_main()
  File "/usr/local/lib/python3.6/dist-packages/patroni/__main__.py", line 135, in patroni_main
    abstract_main(Patroni, schema)
  File "/usr/local/lib/python3.6/dist-packages/patroni/daemon.py", line 100, in abstract_main
    controller.run()
  File "/usr/local/lib/python3.6/dist-packages/patroni/__main__.py", line 105, in run
    super(Patroni, self).run()
  File "/usr/local/lib/python3.6/dist-packages/patroni/daemon.py", line 59, in run
    self._run_cycle()
  File "/usr/local/lib/python3.6/dist-packages/patroni/__main__.py", line 108, in _run_cycle
    logger.info(self.ha.run_cycle())
  File "/usr/local/lib/python3.6/dist-packages/patroni/ha.py", line 1514, in run_cycle
    info = self._run_cycle()
  File "/usr/local/lib/python3.6/dist-packages/patroni/ha.py", line 1388, in _run_cycle
    return self.post_bootstrap()
  File "/usr/local/lib/python3.6/dist-packages/patroni/ha.py", line 1280, in post_bootstrap
    self.cancel_initialization()
  File "/usr/local/lib/python3.6/dist-packages/patroni/ha.py", line 1273, in cancel_initialization
    raise PatroniFatalException('Failed to bootstrap cluster')
patroni.exceptions.PatroniFatalException: 'Failed to bootstrap cluster'
/etc/runit/runsvdir/default/patroni: finished with code=1 signal=0
/etc/runit/runsvdir/default/patroni: sleeping 30 seconds
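For context, the standby section of manifests/standby-manifest.yaml is what points the new cluster at the source. A simplified sketch (trimmed to the relevant fields; the exact demo manifest in the repo may differ slightly) looks roughly like this:

apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
  name: acid-standby-cluster
spec:
  numberOfInstances: 1
  # stream changes from the source cluster's primary service
  standby:
    standby_host: "acid-minimal-cluster.default"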
Thank you for your response.
I found the error. The password of the standby user in the standby cluster is different from the password of the standby user in the source cluster. How do I set the same password for these users?
You need to define a secret to set the password:
apiVersion: v1
kind: Secret
metadata:
  name: standby.${db_cluster}.credentials.postgresql.acid.zalan.do
  namespace: postgres
  labels:
    application: spilo
    cluster-name: ${db_cluster}
    team: audienti
stringData:
  username: standby
  password: ${source_password}
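As a rough sketch of how to wire this up (assuming the source cluster is acid-minimal-cluster in the default namespace and the standby cluster is acid-standby-cluster; adjust names and namespaces to your setup), you can copy the standby password from the source cluster's secret and create the matching secret for the standby cluster before applying its manifest:

# read the standby user's password from the source cluster's secret
SOURCE_PASSWORD=$(kubectl get secret standby.acid-minimal-cluster.credentials.postgresql.acid.zalan.do \
  -n default -o jsonpath='{.data.password}' | base64 -d)

# create the standby cluster's secret with the same password so pg_basebackup can authenticate
kubectl create secret generic standby.acid-standby-cluster.credentials.postgresql.acid.zalan.do \
  -n default \
  --from-literal=username=standby \
  --from-literal=password="$SOURCE_PASSWORD"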
This is still happening to me on v1.12.2. Since it is a FATAL error, it brings the cluster down consistently every time I deploy (bare metal). Any tips?
$ kubectl exec -it -n postgres-operator postgres-cluster-0 -- patronictl show-config
failsafe_mode: false
loop_wait: 10
maximum_lag_on_failover: 33554432
postgresql:
  parameters:
    archive_mode: 'on'
    archive_timeout: 1800s
    autovacuum_analyze_scale_factor: 0.02
    autovacuum_max_workers: 5
    autovacuum_vacuum_scale_factor: 0.05
    checkpoint_completion_target: 0.9
    hot_standby: 'on'
    log_autovacuum_min_duration: 0
    log_checkpoints: 'on'
    log_connections: 'on'
    log_disconnections: 'on'
    log_line_prefix: '%t [%p]: [%l-1] %c %x %d %u %a %h '
    log_lock_waits: 'on'
    log_min_duration_statement: 500
    log_statement: ddl
    log_temp_files: 0
    max_connections: '640'
    max_replication_slots: 10
    max_wal_senders: 10
    tcp_keepalives_idle: 900
    tcp_keepalives_interval: 100
    track_functions: all
    wal_compression: 'on'
    wal_level: hot_standby
    wal_log_hints: 'on'
  use_pg_rewind: true
  use_slots: true
retry_timeout: 10
ttl: 30
And logs:
# kubectl logs -n postgres-operator postgres-cluster-2 | grep FATAL
psycopg2.OperationalError: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: FATAL: role "standby" does not exist
psycopg2.OperationalError: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: FATAL: role "standby" does not exist
psycopg2.OperationalError: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: FATAL: role "standby" does not exist
...
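The role "standby" does not exist errors suggest the replication user is missing on the instance being queried. As a quick diagnostic (a sketch assuming the pod name postgres-cluster-0 from above and that local socket access as the postgres user works in your spilo image), you can check on a given pod whether the role exists and has the REPLICATION attribute:

kubectl exec -it -n postgres-operator postgres-cluster-0 -- \
  psql -U postgres -c "SELECT rolname, rolreplication FROM pg_roles WHERE rolname = 'standby';"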
I follow a strategy where I don't depend on the Postgres Operator to create the secrets for the standby, pooler, or super user. I create the secrets myself before deploying the CRDs, making sure I follow the secret-name convention required by the operator. That way the operator does not create new secrets for these users on each CRD installation. This solved a lot of issues for me, because the operator generates new passwords for these secrets every time while the PVs still hold the old ones, so the new secret values would otherwise have to be updated on the primary for the super user and the standby user every time. I hope this helps. @Danieloni1
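A rough sketch of that approach (assuming a cluster named acid-standby-cluster in the postgres namespace and the operator's default secret-name template; the password value is a placeholder you fill in yourself):

# pre-create the credential secrets so the operator reuses them instead of generating new passwords
for user in postgres standby; do
  kubectl create secret generic "${user}.acid-standby-cluster.credentials.postgresql.acid.zalan.do" \
    -n postgres \
    --from-literal=username="${user}" \
    --from-literal=password="<same-password-as-on-the-source-cluster>"
done

Then apply the postgresql CRD manifest as usual; the operator keeps the existing secrets rather than creating new ones.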