repmgr

standby active after disconnecting, why?!

Open RekGRpth opened this issue 5 years ago • 4 comments

[2020-04-22 12:35:01] [NOTICE] standby node "repmgr2" (ID: 2) has disconnected
[2020-04-22 12:35:01] [INFO] executing notification command for event "child_node_disconnect"
[2020-04-22 12:35:01] [DETAIL] command is:
  /etc/service/repmgr/event "1" "child_node_disconnect" "1" "2020-04-22 12:35:01.948405+05" "standby node \"repmgr2\" (ID: 2) has disconnected"

but the standby is still marked active in the nodes table:

repmgr=# select * from nodes;
 node_id | upstream_node_id | active | node_name |  type   | location | priority |                         conninfo                         | repluser |   slot_name   |   config_file    
---------+------------------+--------+-----------+---------+----------+----------+----------------------------------------------------------+----------+---------------+------------------
       1 |                  | t      | repmgr1   | primary | default  |      100 | host=repmgr1 user=repmgr dbname=repmgr connect_timeout=2 | repmgr   | repmgr_slot_1 | /etc/repmgr.conf
       2 |                1 | t      | repmgr2   | standby | default  |      100 | host=repmgr2 user=repmgr dbname=repmgr connect_timeout=2 | repmgr   | repmgr_slot_2 | /etc/repmgr.conf
(2 rows)
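
To make the logged command easier to read, here is a minimal Python sketch that splits its five arguments into named fields and pulls the disconnected child's node ID out of the detail message. The field order (node ID, event type, success flag, timestamp, details) is an assumption based on the log line above, since the thread does not show the event_notification_command setting. `RepmgrEvent`, `parse_event` and `disconnected_node_id` are hypothetical names, not part of repmgr.

```python
import re
from typing import List, NamedTuple, Optional


class RepmgrEvent(NamedTuple):
    """Fields assumed from the argument order seen in the log line."""
    node_id: int      # node that raised the event (here the primary, ID 1)
    event: str        # event type, e.g. "child_node_disconnect"
    success: bool     # "1" in the log is read as success
    timestamp: str    # kept as a string, exactly as logged
    details: str      # free-text detail message


def parse_event(args: List[str]) -> RepmgrEvent:
    """Turn the five positional arguments into a RepmgrEvent."""
    if len(args) != 5:
        raise ValueError(f"expected 5 arguments, got {len(args)}")
    node_id, event, success, timestamp, details = args
    return RepmgrEvent(int(node_id), event, success == "1", timestamp, details)


def disconnected_node_id(event: RepmgrEvent) -> Optional[int]:
    """Extract the child node ID from a child_node_disconnect detail message.

    Relies on the "(ID: N)" wording seen in the log; returns None for other
    events or if the wording differs.
    """
    if event.event != "child_node_disconnect":
        return None
    match = re.search(r"\(ID: (\d+)\)", event.details)
    return int(match.group(1)) if match else None
```

With the arguments from the log, `parse_event` reports node 1 (the primary) as the sender, and `disconnected_node_id` yields 2, the node whose `active` column is still `t` in the table above.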

RekGRpth avatar Apr 22 '20 07:04 RekGRpth

Here the standby is presumably still running; it is only marked inactive if it is not running.

ibarwick avatar Apr 22 '20 07:04 ibarwick

"Here the standby is presumably still running"

No! I shut down the host running the standby!

RekGRpth avatar Apr 22 '20 07:04 RekGRpth

Aha, in that case the repmgrd on the standby probably didn't get a chance to update the metadata. This is probably something we can improve on.

Please note that it is very helpful if you provide as much information as possible when asking questions or reporting issues, as it helps us understand the wider context of the situation and saves us the trouble of trying to work out what was going on. See also: https://repmgr.org/docs/current/appendix-support-reporting-issues.html

ibarwick avatar Apr 22 '20 09:04 ibarwick

I have the same problem. I think that when repmgrd dies before the PostgreSQL service, the child_node_disconnect event is triggered (on the primary) and the active field of the node record is not updated to false. Otherwise, the standby_failure event is triggered and the active field of the node record is updated to false.
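
The two outcomes described above can be written down as a small Python model. It only mirrors what is reported in this thread and is not taken from the repmgr source; `NodeRecord`, `OBSERVED_DEACTIVATES` and `apply_observed_event` are hypothetical names used for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class NodeRecord:
    """Illustrative subset of a row in the nodes table."""
    node_id: int
    node_name: str
    active: bool = True


# Event name -> whether `active` was seen to become false afterwards.
# These values restate the observations in this thread, nothing more.
OBSERVED_DEACTIVATES = {
    "child_node_disconnect": False,  # record left untouched
    "standby_failure": True,         # record updated to active = false
}


def apply_observed_event(node: NodeRecord, event: str) -> NodeRecord:
    """Return the node record as it was observed to look after `event`."""
    if event not in OBSERVED_DEACTIVATES:
        raise ValueError(f"no observation recorded for event {event!r}")
    if OBSERVED_DEACTIVATES[event]:
        return replace(node, active=False)
    return node
```

Applied to the pgdb02 record, `child_node_disconnect` leaves `active` true while `standby_failure` sets it to false, which is the discrepancy being reported.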

PostgreSQL version

                                                     version
------------------------------------------------------------------------------------------------------------------
 PostgreSQL 12.4 (Debian 12.4-1.pgdg100+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit

repmgr version

repmgr --version
repmgr 5.0.0

How was repmgr installed? From source? From packages? If so from which repository?

apt install -y postgresql-12-repmgr

repmgr.conf files (suitably anonymized if necessary)

repmgr.conf.txt

Contents of the repmgr.nodes table (suitably anonymized if necessary)

repmgr=# select * from nodes;
 node_id | upstream_node_id | active | node_name |  type   |    location     | priority |                          conninfo                          | repluser |   slot_name   |   config_file
---------+------------------+--------+-----------+---------+-----------------+----------+------------------------------------------------------------+----------+---------------+------------------
       1 |                  | t      | pgdb01    | primary | ***             |      103 | host=pgdb01 user=repmgr connect_timeout=15 sslmode=disable | repmgr   | repmgr_slot_1 | /etc/repmgr.conf
       2 |                1 | t      | pgdb02    | standby | ***             |      102 | host=pgdb02 user=repmgr connect_timeout=15 sslmode=disable | repmgr   | repmgr_slot_2 | /etc/repmgr.conf
       3 |                1 | t      | pgdb03    | witness | ***             |        0 | host=pgdb03 user=repmgr connect_timeout=15 sslmode=disable | repmgr   | repmgr_slot_3 | /etc/repmgr.conf
(3 rows)

PostgreSQL 12 and later: contents of the postgresql.auto.conf

shared_preload_libraries = 'repmgr,pg_stat_statements,auto_explain'
wal_log_hints = 'on'
....

sahapasci avatar Sep 17 '20 16:09 sahapasci