postgres-operator
pg: cannot set transaction read write mode during recovery
I have two pods in Kubernetes, postgres-service-0 and postgres-service-1. Initially, postgres-service-1 is the master. After a PostgreSQL master-slave failover, the application's client connections to PostgreSQL fail with the error "pg: cannot set transaction read write mode during recovery", even though the PostgreSQL pods are all running. When I restart the application, the connections are restored.
- Which image of the operator are you using? registry.opensource.zalan.do/acid/postgres-operator:v1.8.2, spilo-14:2.1-p7
- Where do you run it - cloud or metal? Kubernetes or OpenShift? Kubernetes
- Are you running Postgres Operator in production? [yes | no] yes
postgres-service-1 log:
2023-06-13 07:31:47,890 INFO: Lock owner: postgres-service-1; I am postgres-service-1
2023-06-13 07:31:52,898 ERROR: Request to server https://10.96.0.1:443 failed: ReadTimeoutError("HTTPSConnectionPool(host='10.96.0.1', port=443): Read timed out. (read timeout=4.9869120344519615)",)
2023-06-13 07:31:53,858 WARNING: Concurrent update of postgres-service
2023-06-13 07:31:54,172 INFO: starting after demotion in progress
2023-06-13 07:31:54,174 INFO: Lock owner: postgres-service-0; I am postgres-service-1
2023-06-13 07:31:54,174 INFO: establishing a new patroni connection to the postgres cluster
2023-06-13 07:31:54,181 INFO: Local timeline=4 lsn=11/EC60FAD8
2023-06-13 07:31:54,212 INFO: master_timeline=5
2023-06-13 07:31:54,213 INFO: master: history=1  0/570000A0  no recovery target specified
2  4/2248FB28  no recovery target specified
3  A/D3FE9300  no recovery target specified
4  11/EC60FAD8  no recovery target specified
server signaled
2023-06-13 07:31:54,323 INFO: no action. I am (postgres-service-1), a secondary, and following a leader (postgres-service-0)
2023-06-13 07:31:54,325 INFO: Lock owner: postgres-service-0; I am postgres-service-1
2023-06-13 07:31:54,330 INFO: Local timeline=4 lsn=11/EC60FAD8
2023-06-13 07:31:54,361 INFO: master_timeline=5
2023-06-13 07:31:54,362 INFO: master: history=1  0/570000A0  no recovery target specified
2  4/2248FB28  no recovery target specified
3  A/D3FE9300  no recovery target specified
4  11/EC60FAD8  no recovery target specified
2023-06-13 07:31:54,372 INFO: no action. I am (postgres-service-1), a secondary, and following a leader (postgres-service-0)
postgres-service-0 log:
Got response from postgres-service-1 http://10.244.1.77:8008/patroni: {"state": "running", "postmaster_start_time": "2023-06-13 07:31:47.021588+00:00", "role": "replica", "server_version": 140005, "xlog": {"received_location": 76980222680, "replayed_location": 76980222680, "replayed_timestamp": "2023-06-13 07:31:41.062583+00:00", "paused": false}, "timeline": 4, "replication": [{"usename": "standby", "application_name": "postgres-service-0", "client_addr": "10.244.2.24", "state": "streaming", "sync_state": "async", "sync_priority": 0}], "dcs_last_seen": 1686641507, "database_system_identifier": "7215965247549096005", "patroni": {"version": "2.1.4", "scope": "postgres-service"}}
2023-06-13 07:30:43,392 WARNING: Could not activate Linux watchdog device: "Can't open watchdog device: [Errno 2] No such file or directory: '/dev/watchdog'"
2023-06-13 07:30:43,522 INFO: promoted self to leader by acquiring session lock
server promoting
2023-06-13 07:30:43,549 INFO: cleared rewind state after becoming the leader
2023-06-13 07:30:43,524 INFO: Lock owner: postgres-service-0; I am postgres-service-0
2023-06-13 07:30:43,893 INFO: updated leader lock during promote
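The logs above show the failover itself completed (postgres-service-0 promoted, postgres-service-1 demoted and following), so the error most likely means the application's connection pool was still holding connections to the demoted node, which is now a read-only replica in recovery. Restarting the application "fixes" it because the pool reconnects and resolves the new primary. As a workaround on the client side, a write can be retried after discarding the stale connections; here is a minimal, library-agnostic sketch of that idea. The names `get_connection` and `reset` are hypothetical placeholders for whatever your pool provides, not part of postgres-operator or Patroni:

```python
# Hypothetical sketch: retry a write when the pool still holds connections
# to the demoted node after a Patroni failover. `pool.get_connection()` and
# `pool.reset()` are placeholder names for your client library's equivalents.

READONLY_ERRORS = (
    "cannot set transaction read write mode during recovery",
    "cannot execute INSERT in a read-only transaction",
)

def run_write(pool, operation, max_retries=3):
    """Run operation(conn); on a read-only/recovery error, drop the cached
    connections and retry so the client re-resolves the new primary."""
    last_exc = None
    for _ in range(max_retries):
        conn = pool.get_connection()
        try:
            return operation(conn)
        except Exception as exc:
            last_exc = exc
            if any(msg in str(exc) for msg in READONLY_ERRORS):
                pool.reset()  # discard stale connections to the old primary
                continue
            raise
    raise last_exc
```

A cleaner fix is usually to point the application at the master Service the operator maintains (which tracks the leader) and make sure the pool recycles connections on error, rather than retrying indefinitely.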
same problem
Observed the exact same issue more than once on an instance; it resolves if the read replica is restarted.
same problem
same issue
same here
Still reproducible on the latest operator, v1.12.2. Any updates on this issue?
Same problem
Same problem