[bitnami/postgresql-ha] chart, second slave gets stuck waiting for primary node and CrashLoopBackOff
Hi Bitnami team :)
I am facing an issue with the postgresql-ha chart: one of the two slave nodes fails to start because it gets stuck waiting for the primary node.
You can find more details below.
Thank you in advance for your help :)
Cheers!
Name and Version
bitnami/postgresql-ha latest (also reproduced with version 13.2.4)
What architecture are you using?
Kubernetes Kind on Ubuntu virtual machines
What steps will reproduce the bug?
- In this environment: Kind v0.10.0, Kubernetes 4-node cluster hosted on Ubuntu 22.04.2 LTS VMs
- With this config: no particular configuration for my test
- Run: execute the postgresql-ha Helm chart with default values to deploy a redundant PostgreSQL with 1 primary node and 2 slaves:
helm install bitnami-redounded oci://registry-1.docker.io/bitnamicharts/postgresql-ha --namespace test-bitnami
- Issue: after a while, the following artifacts are running, but only one PostgreSQL slave node is up and synchronized with the primary node. The second slave seems unable to connect to the primary, since kubelet automatically restarts it after a timeout.
kubectl -n test-bitnami get all
NAME READY STATUS RESTARTS AGE
pod/bitnami-tsdb-redounded-postgresql-ha-pgpool-6464fdf9f6-kd5dd 1/1 Running 0 11m
pod/bitnami-tsdb-redounded-postgresql-ha-postgresql-0 1/1 Running 0 11m
pod/bitnami-tsdb-redounded-postgresql-ha-postgresql-1 0/1 CrashLoopBackOff 5 (101s ago) 11m
pod/bitnami-tsdb-redounded-postgresql-ha-postgresql-2 1/1 Running 0 11m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/bitnami-tsdb-redounded-postgresql-ha-pgpool ClusterIP 10.96.133.175 <none> 5432/TCP 11m
service/bitnami-tsdb-redounded-postgresql-ha-postgresql ClusterIP 10.96.30.24 <none> 5432/TCP 11m
service/bitnami-tsdb-redounded-postgresql-ha-postgresql-headless ClusterIP None <none> 5432/TCP 11m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/bitnami-tsdb-redounded-postgresql-ha-pgpool 1/1 1 1 11m
NAME DESIRED CURRENT READY AGE
replicaset.apps/bitnami-tsdb-redounded-postgresql-ha-pgpool-6464fdf9f6 1 1 1 11m
NAME READY AGE
statefulset.apps/bitnami-tsdb-redounded-postgresql-ha-postgresql 2/3 11m
Below are the logs of the second slave, which fails to run. It gets stuck waiting for the primary node until a timeout, when kubelet restarts the container:
kubectl -n test-bitnami logs -f bitnami-tsdb-redounded-postgresql-ha-postgresql-1
postgresql-repmgr 23:17:58.62 INFO ==>
postgresql-repmgr 23:17:58.62 INFO ==> Welcome to the Bitnami postgresql-repmgr container
postgresql-repmgr 23:17:58.62 INFO ==> Subscribe to project updates by watching https://github.com/bitnami/containers
postgresql-repmgr 23:17:58.62 INFO ==> Submit issues and feature requests at https://github.com/bitnami/containers/issues
postgresql-repmgr 23:17:58.62 INFO ==>
postgresql-repmgr 23:17:58.63 INFO ==> ** Starting PostgreSQL with Replication Manager setup **
postgresql-repmgr 23:17:58.65 INFO ==> Validating settings in REPMGR_* env vars...
postgresql-repmgr 23:17:58.65 INFO ==> Validating settings in POSTGRESQL_* env vars..
postgresql-repmgr 23:17:58.65 INFO ==> Querying all partner nodes for common upstream node...
postgresql-repmgr 23:18:03.69 INFO ==> Node configured as standby
postgresql-repmgr 23:18:03.70 INFO ==> Preparing PostgreSQL configuration...
postgresql-repmgr 23:18:03.70 INFO ==> postgresql.conf file not detected. Generating it...
postgresql-repmgr 23:18:03.76 INFO ==> Preparing repmgr configuration...
postgresql-repmgr 23:18:03.77 INFO ==> Initializing Repmgr...
postgresql-repmgr 23:18:03.77 INFO ==> Waiting for primary node...
Below are the logs of the slave that started up successfully:
kubectl -n test-bitnami logs -f bitnami-tsdb-redounded-postgresql-ha-postgresql-2
postgresql-repmgr 23:09:33.76 INFO ==>
postgresql-repmgr 23:09:33.76 INFO ==> Welcome to the Bitnami postgresql-repmgr container
postgresql-repmgr 23:09:33.77 INFO ==> Subscribe to project updates by watching https://github.com/bitnami/containers
postgresql-repmgr 23:09:33.77 INFO ==> Submit issues and feature requests at https://github.com/bitnami/containers/issues
postgresql-repmgr 23:09:33.77 INFO ==>
postgresql-repmgr 23:09:33.78 INFO ==> ** Starting PostgreSQL with Replication Manager setup **
postgresql-repmgr 23:09:33.79 INFO ==> Validating settings in REPMGR_* env vars...
postgresql-repmgr 23:09:33.79 INFO ==> Validating settings in POSTGRESQL_* env vars..
postgresql-repmgr 23:09:33.80 INFO ==> Querying all partner nodes for common upstream node...
postgresql-repmgr 23:09:34.09 INFO ==> Node configured as standby
postgresql-repmgr 23:09:34.09 INFO ==> Preparing PostgreSQL configuration...
postgresql-repmgr 23:09:34.10 INFO ==> postgresql.conf file not detected. Generating it...
postgresql-repmgr 23:09:34.17 INFO ==> Preparing repmgr configuration...
postgresql-repmgr 23:09:34.18 INFO ==> Initializing Repmgr...
postgresql-repmgr 23:09:34.18 INFO ==> Waiting for primary node...
postgresql-repmgr 23:09:54.23 INFO ==> Rejoining node...
postgresql-repmgr 23:09:54.23 INFO ==> Cloning data from primary node...
postgresql-repmgr 23:09:55.32 INFO ==> Initializing PostgreSQL database...
postgresql-repmgr 23:09:55.33 INFO ==> Custom configuration /opt/bitnami/postgresql/conf/postgresql.conf detected
postgresql-repmgr 23:09:55.33 INFO ==> Custom configuration /opt/bitnami/postgresql/conf/pg_hba.conf detected
postgresql-repmgr 23:09:55.34 INFO ==> Deploying PostgreSQL with persisted data...
postgresql-repmgr 23:09:55.35 INFO ==> Configuring replication parameters
postgresql-repmgr 23:09:55.37 INFO ==> Configuring fsync
postgresql-repmgr 23:09:55.38 INFO ==> Setting up streaming replication slave...
postgresql-repmgr 23:09:55.39 INFO ==> Starting PostgreSQL in background...
postgresql-repmgr 23:09:56.20 INFO ==> Unregistering standby node...
postgresql-repmgr 23:09:56.21 INFO ==> Registering Standby node...
postgresql-repmgr 23:09:56.26 INFO ==> Stopping PostgreSQL...
waiting for server to shut down.... done
server stopped
postgresql-repmgr 23:09:56.37 INFO ==> ** PostgreSQL with Replication Manager setup finished! **
postgresql-repmgr 23:09:56.38 INFO ==> Starting PostgreSQL in background...
waiting for server to start....2024-02-18 23:09:56.400 GMT [234] LOG: pgaudit extension initialized
2024-02-18 23:09:56.408 GMT [234] LOG: redirecting log output to logging collector process
2024-02-18 23:09:56.408 GMT [234] HINT: Future log output will appear in directory "/opt/bitnami/postgresql/logs".
2024-02-18 23:09:56.408 GMT [234] LOG: starting PostgreSQL 16.2 on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
2024-02-18 23:09:56.440 GMT [234] LOG: listening on IPv4 address "0.0.0.0", port 5432
2024-02-18 23:09:56.462 GMT [234] LOG: listening on IPv6 address "::", port 5432
2024-02-18 23:09:56.482 GMT [234] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2024-02-18 23:09:56.491 GMT [238] LOG: database system was shut down in recovery at 2024-02-18 23:09:56 GMT
2024-02-18 23:09:56.491 GMT [238] LOG: entering standby mode
2024-02-18 23:09:56.495 GMT [238] LOG: redo starts at 0/5000028
2024-02-18 23:09:56.496 GMT [238] LOG: consistent recovery state reached at 0/6000830
2024-02-18 23:09:56.496 GMT [238] LOG: invalid record length at 0/6000830: expected at least 24, got 0
2024-02-18 23:09:56.496 GMT [234] LOG: database system is ready to accept read-only connections
2024-02-18 23:09:56.508 GMT [239] LOG: started streaming WAL from primary at 0/6000000 on timeline 1
done
server started
postgresql-repmgr 23:09:56.59 INFO ==> ** Starting repmgrd **
[2024-02-18 23:09:56] [NOTICE] repmgrd (repmgrd 5.3.3) starting up
INFO: set_repmgrd_pid(): provided pidfile is /tmp/repmgrd.pid
[2024-02-18 23:09:56] [NOTICE] starting monitoring of node "bitnami-tsdb-redounded-postgresql-ha-postgresql-2" (ID: 1002)
2024-02-18 23:15:11.614 GMT [236] LOG: restartpoint starting: time
2024-02-18 23:15:15.813 GMT [236] LOG: restartpoint complete: wrote 43 buffers (0.3%); 0 WAL file(s) added, 0 removed, 0 recycled; write=4.153 s, sync=0.021 s, total=4.200 s; sync files=14, longest=0.013 s, average=0.002 s; distance=16647 kB, estimate=16647 kB; lsn=0/6041F78, redo lsn=0/6041F40
2024-02-18 23:15:15.813 GMT [236] LOG: recovery restart point at 0/6041F40
2024-02-18 23:15:15.813 GMT [236] DETAIL: Last completed transaction was at log time 2024-02-18 23:10:00.192974+00.
And finally, the logs of the primary node. We can see that only the postgresql-2 slave node has connected to the primary:
kubectl -n test-bitnami logs -f bitnami-tsdb-redounded-postgresql-ha-postgresql-0
postgresql-repmgr 23:09:36.79 INFO ==>
postgresql-repmgr 23:09:36.79 INFO ==> Welcome to the Bitnami postgresql-repmgr container
postgresql-repmgr 23:09:36.79 INFO ==> Subscribe to project updates by watching https://github.com/bitnami/containers
postgresql-repmgr 23:09:36.79 INFO ==> Submit issues and feature requests at https://github.com/bitnami/containers/issues
postgresql-repmgr 23:09:36.79 INFO ==>
postgresql-repmgr 23:09:36.80 INFO ==> ** Starting PostgreSQL with Replication Manager setup **
postgresql-repmgr 23:09:36.82 INFO ==> Validating settings in REPMGR_* env vars...
postgresql-repmgr 23:09:36.82 INFO ==> Validating settings in POSTGRESQL_* env vars..
postgresql-repmgr 23:09:36.82 INFO ==> Querying all partner nodes for common upstream node...
postgresql-repmgr 23:09:36.87 INFO ==> There are no nodes with primary role. Assuming the primary role...
postgresql-repmgr 23:09:36.87 INFO ==> Preparing PostgreSQL configuration...
postgresql-repmgr 23:09:36.87 INFO ==> postgresql.conf file not detected. Generating it...
postgresql-repmgr 23:09:36.93 INFO ==> Preparing repmgr configuration...
postgresql-repmgr 23:09:36.94 INFO ==> Initializing Repmgr...
postgresql-repmgr 23:09:36.95 INFO ==> Initializing PostgreSQL database...
postgresql-repmgr 23:09:36.95 INFO ==> Custom configuration /opt/bitnami/postgresql/conf/postgresql.conf detected
postgresql-repmgr 23:09:36.95 INFO ==> Custom configuration /opt/bitnami/postgresql/conf/pg_hba.conf detected
postgresql-repmgr 23:09:38.22 INFO ==> Starting PostgreSQL in background...
postgresql-repmgr 23:09:38.44 INFO ==> Changing password of postgres
postgresql-repmgr 23:09:38.45 INFO ==> Creating replication user repmgr
postgresql-repmgr 23:09:38.47 INFO ==> Stopping PostgreSQL...
waiting for server to shut down.... done
server stopped
postgresql-repmgr 23:09:38.68 INFO ==> Configuring replication parameters
postgresql-repmgr 23:09:38.70 INFO ==> Configuring fsync
postgresql-repmgr 23:09:38.70 INFO ==> Starting PostgreSQL in background...
postgresql-repmgr 23:09:38.92 INFO ==> Creating repmgr user: repmgr
postgresql-repmgr 23:09:38.96 INFO ==> Creating repmgr database: repmgr
postgresql-repmgr 23:09:39.03 INFO ==> Stopping PostgreSQL...
waiting for server to shut down.... done
server stopped
postgresql-repmgr 23:09:39.43 INFO ==> Starting PostgreSQL in background...
postgresql-repmgr 23:09:39.64 INFO ==> Registering Primary...
postgresql-repmgr 23:09:39.75 INFO ==> Loading custom scripts...
postgresql-repmgr 23:09:39.75 INFO ==> Configuring synchronous_replication
postgresql-repmgr 23:09:39.76 INFO ==> Stopping PostgreSQL...
waiting for server to shut down.... done
server stopped
postgresql-repmgr 23:09:39.96 INFO ==> ** PostgreSQL with Replication Manager setup finished! **
postgresql-repmgr 23:09:39.98 INFO ==> Starting PostgreSQL in background...
waiting for server to start....2024-02-18 23:09:39.999 GMT [290] LOG: pgaudit extension initialized
2024-02-18 23:09:40.010 GMT [290] LOG: redirecting log output to logging collector process
2024-02-18 23:09:40.010 GMT [290] HINT: Future log output will appear in directory "/opt/bitnami/postgresql/logs".
2024-02-18 23:09:40.010 GMT [290] LOG: starting PostgreSQL 16.2 on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
2024-02-18 23:09:40.040 GMT [290] LOG: listening on IPv4 address "0.0.0.0", port 5432
2024-02-18 23:09:40.061 GMT [290] LOG: listening on IPv6 address "::", port 5432
2024-02-18 23:09:40.093 GMT [290] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2024-02-18 23:09:40.107 GMT [294] LOG: database system was shut down at 2024-02-18 23:09:39 GMT
2024-02-18 23:09:40.115 GMT [290] LOG: database system is ready to accept connections
done
server started
postgresql-repmgr 23:09:40.19 INFO ==> ** Starting repmgrd **
[2024-02-18 23:09:40] [NOTICE] repmgrd (repmgrd 5.3.3) starting up
INFO: set_repmgrd_pid(): provided pidfile is /tmp/repmgrd.pid
[2024-02-18 23:09:40] [NOTICE] starting monitoring of node "bitnami-tsdb-redounded-postgresql-ha-postgresql-0" (ID: 1000)
[2024-02-18 23:09:40] [NOTICE] monitoring cluster primary "bitnami-tsdb-redounded-postgresql-ha-postgresql-0" (ID: 1000)
2024-02-18 23:09:54.344 GMT [292] LOG: checkpoint starting: immediate force wait
2024-02-18 23:09:54.412 GMT [292] LOG: checkpoint complete: wrote 13 buffers (0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.009 s, sync=0.012 s, total=0.069 s; sync files=8, longest=0.005 s, average=0.002 s; distance=16384 kB, estimate=16384 kB; lsn=0/5000060, redo lsn=0/5000028
[2024-02-18 23:09:58] [NOTICE] new standby "bitnami-tsdb-redounded-postgresql-ha-postgresql-2" (ID: 1002) has connected
2024-02-18 23:14:54.508 GMT [292] LOG: checkpoint starting: time
2024-02-18 23:14:58.993 GMT [292] LOG: checkpoint complete: wrote 45 buffers (0.3%); 0 WAL file(s) added, 0 removed, 0 recycled; write=4.431 s, sync=0.014 s, total=4.485 s; sync files=13, longest=0.004 s, average=0.001 s; distance=16647 kB, estimate=16647 kB; lsn=0/6041F78, redo lsn=0/6041F40
Are you using any custom parameters or values?
No custom values, only the default ones provided by the chart
What is the expected behavior?
The expected behavior is to have 2 slaves up and running
What do you see instead?
We only see the primary node up and running with one slave; the second slave is in CrashLoopBackOff because it gets stuck waiting for the primary.
Additional information
In case it helps, below is the output of kubectl describe for the 3 pods (the primary and the two slaves).
The primary node:
kubectl -n test-bitnami describe pod bitnami-tsdb-redounded-postgresql-ha-postgresql-0
Name: bitnami-tsdb-redounded-postgresql-ha-postgresql-0
Namespace: test-bitnami
Priority: 0
Service Account: bitnami-tsdb-redounded-postgresql-ha
Node: datahub-local-worker2/172.18.0.3
Start Time: Mon, 19 Feb 2024 00:09:23 +0100
Labels: app.kubernetes.io/component=postgresql
app.kubernetes.io/instance=bitnami-tsdb-redounded
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=postgresql-ha
app.kubernetes.io/version=16.2.0
controller-revision-hash=bitnami-tsdb-redounded-postgresql-ha-postgresql-b78c9db67
helm.sh/chart=postgresql-ha-13.3.3
role=data
statefulset.kubernetes.io/pod-name=bitnami-tsdb-redounded-postgresql-ha-postgresql-0
Annotations: <none>
Status: Running
IP: 10.244.1.75
IPs:
IP: 10.244.1.75
Controlled By: StatefulSet/bitnami-tsdb-redounded-postgresql-ha-postgresql
Containers:
postgresql:
Container ID: containerd://1c2e93b0beef08fc9008144962764a486685511fb1adeefbb42447e08b6cf3c3
Image: registry-1.docker.io/bitnami/postgresql-repmgr:16.2.0-debian-11-r18
Image ID: registry-1.docker.io/bitnami/postgresql-repmgr@sha256:2fbfb8169c474bf00a1f5c56556ad56fc4d7dc6d28350d7fbc94eb48e9cf6128
Port: 5432/TCP
Host Port: 0/TCP
State: Running
Started: Mon, 19 Feb 2024 00:09:36 +0100
Ready: True
Restart Count: 0
Liveness: exec [bash -ec PGPASSWORD=$POSTGRES_PASSWORD psql -w -U "postgres" -d "postgres" -h 127.0.0.1 -p 5432 -c "SELECT 1"] delay=30s timeout=5s period=10s #success=1 #failure=6
Readiness: exec [bash -ec PGPASSWORD=$POSTGRES_PASSWORD psql -w -U "postgres" -d "postgres" -h 127.0.0.1 -p 5432 -c "SELECT 1"] delay=5s timeout=5s period=10s #success=1 #failure=6
Environment:
BITNAMI_DEBUG: false
POSTGRESQL_VOLUME_DIR: /bitnami/postgresql
PGDATA: /bitnami/postgresql/data
POSTGRES_USER: postgres
POSTGRES_PASSWORD: <set to the key 'password' in secret 'bitnami-tsdb-redounded-postgresql-ha-postgresql'> Optional: false
POSTGRES_DB: postgres
POSTGRESQL_LOG_HOSTNAME: true
POSTGRESQL_LOG_CONNECTIONS: false
POSTGRESQL_LOG_DISCONNECTIONS: false
POSTGRESQL_PGAUDIT_LOG_CATALOG: off
POSTGRESQL_CLIENT_MIN_MESSAGES: error
POSTGRESQL_SHARED_PRELOAD_LIBRARIES: pgaudit, repmgr
POSTGRESQL_ENABLE_TLS: no
POSTGRESQL_PORT_NUMBER: 5432
REPMGR_PORT_NUMBER: 5432
REPMGR_PRIMARY_PORT: 5432
MY_POD_NAME: bitnami-tsdb-redounded-postgresql-ha-postgresql-0 (v1:metadata.name)
REPMGR_UPGRADE_EXTENSION: no
REPMGR_PGHBA_TRUST_ALL: no
REPMGR_MOUNTED_CONF_DIR: /bitnami/repmgr/conf
REPMGR_NAMESPACE: test-bitnami (v1:metadata.namespace)
REPMGR_PARTNER_NODES: bitnami-tsdb-redounded-postgresql-ha-postgresql-0.bitnami-tsdb-redounded-postgresql-ha-postgresql-headless.$(REPMGR_NAMESPACE).svc.cluster.local,bitnami-tsdb-redounded-postgresql-ha-postgresql-1.bitnami-tsdb-redounded-postgresql-ha-postgresql-headless.$(REPMGR_NAMESPACE).svc.cluster.local,bitnami-tsdb-redounded-postgresql-ha-postgresql-2.bitnami-tsdb-redounded-postgresql-ha-postgresql-headless.$(REPMGR_NAMESPACE).svc.cluster.local,
REPMGR_PRIMARY_HOST: bitnami-tsdb-redounded-postgresql-ha-postgresql-0.bitnami-tsdb-redounded-postgresql-ha-postgresql-headless.$(REPMGR_NAMESPACE).svc.cluster.local
REPMGR_NODE_NAME: $(MY_POD_NAME)
REPMGR_NODE_NETWORK_NAME: $(MY_POD_NAME).bitnami-tsdb-redounded-postgresql-ha-postgresql-headless.$(REPMGR_NAMESPACE).svc.cluster.local
REPMGR_NODE_TYPE: data
REPMGR_LOG_LEVEL: NOTICE
REPMGR_CONNECT_TIMEOUT: 5
REPMGR_RECONNECT_ATTEMPTS: 2
REPMGR_RECONNECT_INTERVAL: 3
REPMGR_USERNAME: repmgr
REPMGR_PASSWORD: <set to the key 'repmgr-password' in secret 'bitnami-tsdb-redounded-postgresql-ha-postgresql'> Optional: false
REPMGR_DATABASE: repmgr
REPMGR_FENCE_OLD_PRIMARY: no
REPMGR_CHILD_NODES_CHECK_INTERVAL: 5
REPMGR_CHILD_NODES_CONNECTED_MIN_COUNT: 1
REPMGR_CHILD_NODES_DISCONNECT_TIMEOUT: 30
Mounts:
/bitnami/postgresql from data (rw)
/pre-stop.sh from hooks-scripts (rw,path="pre-stop.sh")
/readiness-probe.sh from hooks-scripts (rw,path="readiness-probe.sh")
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: data-bitnami-tsdb-redounded-postgresql-ha-postgresql-0
ReadOnly: false
hooks-scripts:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: bitnami-tsdb-redounded-postgresql-ha-postgresql-hooks-scripts
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events: <none>
The slave that starts successfully:
kubectl -n test-bitnami describe pod bitnami-tsdb-redounded-postgresql-ha-postgresql-2
Name: bitnami-tsdb-redounded-postgresql-ha-postgresql-2
Namespace: test-bitnami
Priority: 0
Service Account: bitnami-tsdb-redounded-postgresql-ha
Node: datahub-local-worker3/172.18.0.2
Start Time: Mon, 19 Feb 2024 00:09:22 +0100
Labels: app.kubernetes.io/component=postgresql
app.kubernetes.io/instance=bitnami-tsdb-redounded
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=postgresql-ha
app.kubernetes.io/version=16.2.0
controller-revision-hash=bitnami-tsdb-redounded-postgresql-ha-postgresql-b78c9db67
helm.sh/chart=postgresql-ha-13.3.3
role=data
statefulset.kubernetes.io/pod-name=bitnami-tsdb-redounded-postgresql-ha-postgresql-2
Annotations: <none>
Status: Running
IP: 10.244.3.59
IPs:
IP: 10.244.3.59
Controlled By: StatefulSet/bitnami-tsdb-redounded-postgresql-ha-postgresql
Containers:
postgresql:
Container ID: containerd://85ed5c264b45ca8d3891971f5a660aae7637aad1d65ad43867333bd1bf279079
Image: registry-1.docker.io/bitnami/postgresql-repmgr:16.2.0-debian-11-r18
Image ID: registry-1.docker.io/bitnami/postgresql-repmgr@sha256:2fbfb8169c474bf00a1f5c56556ad56fc4d7dc6d28350d7fbc94eb48e9cf6128
Port: 5432/TCP
Host Port: 0/TCP
State: Running
Started: Mon, 19 Feb 2024 00:09:33 +0100
Ready: True
Restart Count: 0
Liveness: exec [bash -ec PGPASSWORD=$POSTGRES_PASSWORD psql -w -U "postgres" -d "postgres" -h 127.0.0.1 -p 5432 -c "SELECT 1"] delay=30s timeout=5s period=10s #success=1 #failure=6
Readiness: exec [bash -ec PGPASSWORD=$POSTGRES_PASSWORD psql -w -U "postgres" -d "postgres" -h 127.0.0.1 -p 5432 -c "SELECT 1"] delay=5s timeout=5s period=10s #success=1 #failure=6
Environment:
BITNAMI_DEBUG: false
POSTGRESQL_VOLUME_DIR: /bitnami/postgresql
PGDATA: /bitnami/postgresql/data
POSTGRES_USER: postgres
POSTGRES_PASSWORD: <set to the key 'password' in secret 'bitnami-tsdb-redounded-postgresql-ha-postgresql'> Optional: false
POSTGRES_DB: postgres
POSTGRESQL_LOG_HOSTNAME: true
POSTGRESQL_LOG_CONNECTIONS: false
POSTGRESQL_LOG_DISCONNECTIONS: false
POSTGRESQL_PGAUDIT_LOG_CATALOG: off
POSTGRESQL_CLIENT_MIN_MESSAGES: error
POSTGRESQL_SHARED_PRELOAD_LIBRARIES: pgaudit, repmgr
POSTGRESQL_ENABLE_TLS: no
POSTGRESQL_PORT_NUMBER: 5432
REPMGR_PORT_NUMBER: 5432
REPMGR_PRIMARY_PORT: 5432
MY_POD_NAME: bitnami-tsdb-redounded-postgresql-ha-postgresql-2 (v1:metadata.name)
REPMGR_UPGRADE_EXTENSION: no
REPMGR_PGHBA_TRUST_ALL: no
REPMGR_MOUNTED_CONF_DIR: /bitnami/repmgr/conf
REPMGR_NAMESPACE: test-bitnami (v1:metadata.namespace)
REPMGR_PARTNER_NODES: bitnami-tsdb-redounded-postgresql-ha-postgresql-0.bitnami-tsdb-redounded-postgresql-ha-postgresql-headless.$(REPMGR_NAMESPACE).svc.cluster.local,bitnami-tsdb-redounded-postgresql-ha-postgresql-1.bitnami-tsdb-redounded-postgresql-ha-postgresql-headless.$(REPMGR_NAMESPACE).svc.cluster.local,bitnami-tsdb-redounded-postgresql-ha-postgresql-2.bitnami-tsdb-redounded-postgresql-ha-postgresql-headless.$(REPMGR_NAMESPACE).svc.cluster.local,
REPMGR_PRIMARY_HOST: bitnami-tsdb-redounded-postgresql-ha-postgresql-0.bitnami-tsdb-redounded-postgresql-ha-postgresql-headless.$(REPMGR_NAMESPACE).svc.cluster.local
REPMGR_NODE_NAME: $(MY_POD_NAME)
REPMGR_NODE_NETWORK_NAME: $(MY_POD_NAME).bitnami-tsdb-redounded-postgresql-ha-postgresql-headless.$(REPMGR_NAMESPACE).svc.cluster.local
REPMGR_NODE_TYPE: data
REPMGR_LOG_LEVEL: NOTICE
REPMGR_CONNECT_TIMEOUT: 5
REPMGR_RECONNECT_ATTEMPTS: 2
REPMGR_RECONNECT_INTERVAL: 3
REPMGR_USERNAME: repmgr
REPMGR_PASSWORD: <set to the key 'repmgr-password' in secret 'bitnami-tsdb-redounded-postgresql-ha-postgresql'> Optional: false
REPMGR_DATABASE: repmgr
REPMGR_FENCE_OLD_PRIMARY: no
REPMGR_CHILD_NODES_CHECK_INTERVAL: 5
REPMGR_CHILD_NODES_CONNECTED_MIN_COUNT: 1
REPMGR_CHILD_NODES_DISCONNECT_TIMEOUT: 30
Mounts:
/bitnami/postgresql from data (rw)
/pre-stop.sh from hooks-scripts (rw,path="pre-stop.sh")
/readiness-probe.sh from hooks-scripts (rw,path="readiness-probe.sh")
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: data-bitnami-tsdb-redounded-postgresql-ha-postgresql-2
ReadOnly: false
hooks-scripts:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: bitnami-tsdb-redounded-postgresql-ha-postgresql-hooks-scripts
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events: <none>
The slave that fails to start (stuck waiting for the primary):
kubectl -n test-bitnami describe pod bitnami-tsdb-redounded-postgresql-ha-postgresql-1
Name: bitnami-tsdb-redounded-postgresql-ha-postgresql-1
Namespace: test-bitnami
Priority: 0
Service Account: bitnami-tsdb-redounded-postgresql-ha
Node: datahub-local-worker/172.18.0.5
Start Time: Mon, 19 Feb 2024 00:09:24 +0100
Labels: app.kubernetes.io/component=postgresql
app.kubernetes.io/instance=bitnami-tsdb-redounded
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=postgresql-ha
app.kubernetes.io/version=16.2.0
controller-revision-hash=bitnami-tsdb-redounded-postgresql-ha-postgresql-b78c9db67
helm.sh/chart=postgresql-ha-13.3.3
role=data
statefulset.kubernetes.io/pod-name=bitnami-tsdb-redounded-postgresql-ha-postgresql-1
Annotations: <none>
Status: Running
IP: 10.244.2.48
IPs:
IP: 10.244.2.48
Controlled By: StatefulSet/bitnami-tsdb-redounded-postgresql-ha-postgresql
Containers:
postgresql:
Container ID: containerd://437ff1b1ea9229fc0afe75a6433eed499498b287b4486f862d36e652ab7af097
Image: registry-1.docker.io/bitnami/postgresql-repmgr:16.2.0-debian-11-r18
Image ID: registry-1.docker.io/bitnami/postgresql-repmgr@sha256:2fbfb8169c474bf00a1f5c56556ad56fc4d7dc6d28350d7fbc94eb48e9cf6128
Port: 5432/TCP
Host Port: 0/TCP
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Mon, 19 Feb 2024 02:00:07 +0100
Finished: Mon, 19 Feb 2024 02:01:22 +0100
Ready: False
Restart Count: 31
Liveness: exec [bash -ec PGPASSWORD=$POSTGRES_PASSWORD psql -w -U "postgres" -d "postgres" -h 127.0.0.1 -p 5432 -c "SELECT 1"] delay=30s timeout=5s period=10s #success=1 #failure=6
Readiness: exec [bash -ec PGPASSWORD=$POSTGRES_PASSWORD psql -w -U "postgres" -d "postgres" -h 127.0.0.1 -p 5432 -c "SELECT 1"] delay=5s timeout=5s period=10s #success=1 #failure=6
Environment:
BITNAMI_DEBUG: false
POSTGRESQL_VOLUME_DIR: /bitnami/postgresql
PGDATA: /bitnami/postgresql/data
POSTGRES_USER: postgres
POSTGRES_PASSWORD: <set to the key 'password' in secret 'bitnami-tsdb-redounded-postgresql-ha-postgresql'> Optional: false
POSTGRES_DB: postgres
POSTGRESQL_LOG_HOSTNAME: true
POSTGRESQL_LOG_CONNECTIONS: false
POSTGRESQL_LOG_DISCONNECTIONS: false
POSTGRESQL_PGAUDIT_LOG_CATALOG: off
POSTGRESQL_CLIENT_MIN_MESSAGES: error
POSTGRESQL_SHARED_PRELOAD_LIBRARIES: pgaudit, repmgr
POSTGRESQL_ENABLE_TLS: no
POSTGRESQL_PORT_NUMBER: 5432
REPMGR_PORT_NUMBER: 5432
REPMGR_PRIMARY_PORT: 5432
MY_POD_NAME: bitnami-tsdb-redounded-postgresql-ha-postgresql-1 (v1:metadata.name)
REPMGR_UPGRADE_EXTENSION: no
REPMGR_PGHBA_TRUST_ALL: no
REPMGR_MOUNTED_CONF_DIR: /bitnami/repmgr/conf
REPMGR_NAMESPACE: test-bitnami (v1:metadata.namespace)
REPMGR_PARTNER_NODES: bitnami-tsdb-redounded-postgresql-ha-postgresql-0.bitnami-tsdb-redounded-postgresql-ha-postgresql-headless.$(REPMGR_NAMESPACE).svc.cluster.local,bitnami-tsdb-redounded-postgresql-ha-postgresql-1.bitnami-tsdb-redounded-postgresql-ha-postgresql-headless.$(REPMGR_NAMESPACE).svc.cluster.local,bitnami-tsdb-redounded-postgresql-ha-postgresql-2.bitnami-tsdb-redounded-postgresql-ha-postgresql-headless.$(REPMGR_NAMESPACE).svc.cluster.local,
REPMGR_PRIMARY_HOST: bitnami-tsdb-redounded-postgresql-ha-postgresql-0.bitnami-tsdb-redounded-postgresql-ha-postgresql-headless.$(REPMGR_NAMESPACE).svc.cluster.local
REPMGR_NODE_NAME: $(MY_POD_NAME)
REPMGR_NODE_NETWORK_NAME: $(MY_POD_NAME).bitnami-tsdb-redounded-postgresql-ha-postgresql-headless.$(REPMGR_NAMESPACE).svc.cluster.local
REPMGR_NODE_TYPE: data
REPMGR_LOG_LEVEL: NOTICE
REPMGR_CONNECT_TIMEOUT: 5
REPMGR_RECONNECT_ATTEMPTS: 2
REPMGR_RECONNECT_INTERVAL: 3
REPMGR_USERNAME: repmgr
REPMGR_PASSWORD: <set to the key 'repmgr-password' in secret 'bitnami-tsdb-redounded-postgresql-ha-postgresql'> Optional: false
REPMGR_DATABASE: repmgr
REPMGR_FENCE_OLD_PRIMARY: no
REPMGR_CHILD_NODES_CHECK_INTERVAL: 5
REPMGR_CHILD_NODES_CONNECTED_MIN_COUNT: 1
REPMGR_CHILD_NODES_DISCONNECT_TIMEOUT: 30
Mounts:
/bitnami/postgresql from data (rw)
/pre-stop.sh from hooks-scripts (rw,path="pre-stop.sh")
/readiness-probe.sh from hooks-scripts (rw,path="readiness-probe.sh")
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: data-bitnami-tsdb-redounded-postgresql-ha-postgresql-1
ReadOnly: false
hooks-scripts:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: bitnami-tsdb-redounded-postgresql-ha-postgresql-hooks-scripts
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Unhealthy 5m17s (x250 over 115m) kubelet Readiness probe failed: psql: error: connection to server at "127.0.0.1", port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
Warning BackOff 21s (x357 over 111m) kubelet Back-off restarting failed container postgresql in pod bitnami-tsdb-redounded-postgresql-ha-postgresql-1_test-bitnami(3714cf1e-bff8-4710-b53a-3e2bb2a40026)
Hi @nleeuskadi ,
I was not able to reproduce the issue.
Could you launch the chart with postgresql.image.debug=true? This may provide more insight into the issue.
Also, you could try disabling the liveness/readiness probes for this postgresql-ha cluster as a workaround, although that is not an ideal solution.
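For reference, a minimal sketch of how both suggestions could be passed to Helm. The postgresql.image.debug=true value comes from this comment; the two probe keys (postgresql.livenessProbe.enabled and postgresql.readinessProbe.enabled) are assumptions about the chart's values and should be checked against the values.yaml of the chart version in use. The release name is taken from the install command in the report and may need adjusting, since the pod labels show bitnami-tsdb-redounded. The script only prints the command unless RUN=1 is set.

```shell
#!/bin/sh
set -eu

# Release name, namespace, and chart reference as given in the report.
# Adjust RELEASE if your release is named differently.
RELEASE="bitnami-redounded"
NAMESPACE="test-bitnami"
CHART="oci://registry-1.docker.io/bitnamicharts/postgresql-ha"

# Build the helm argument list:
#   - postgresql.image.debug=true enables verbose container logs (from this thread)
#   - the probe keys are ASSUMED value names; verify them in the chart's values.yaml
set -- upgrade --install "$RELEASE" "$CHART" \
  --namespace "$NAMESPACE" \
  --set postgresql.image.debug=true \
  --set postgresql.livenessProbe.enabled=false \
  --set postgresql.readinessProbe.enabled=false

HELM_CMD="helm $*"

# Print the command for review; execute it only when RUN=1 is set.
echo "$HELM_CMD"
if [ "${RUN:-0}" = "1" ]; then
  helm "$@"
fi
```

Disabling the probes only keeps kubelet from restarting the container while you collect the debug logs; re-enable them once the investigation is done.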
Hello @dgomezleon ,
Thank you for your help :)
I will try what you suggested with postgresql.image.debug=true and get back to you.
Cheers.
Hi Bitnami community,
Since I opened the ticket, I have not been able to reproduce the issue after reinstalling my K8s cluster. I guess the problem was caused by something wrong in my cluster, but I could not find out what exactly.
Thank you for your help.
I am closing the ticket.
Cheers !