Dropping a database
Hi
In postgresql managed by pg_auto_failover when a database is attempted to drop, the statement never returns and the postgresql log starts getting flooded with messages reporting deadlocks as shown below.
postgres=# create database amod; CREATE DATABASE postgres=# drop database amod;
--> HANGS..
2025-01-24 07:52:24.845 GMT [954540]: [19-1]db=,user=,app=,client= LOG: 00000: started worker for pg_auto_failover health checks in "amod" 2025-01-24 07:52:24.845 GMT [954540]: [20-1]db=,user=,app=,client= LOCATION: HealthCheckWorkerLauncherMain, health_check_worker.c:347
2025-01-24 07:52:25.847 GMT [954591]: [1-1]db=,user=,app=,client= LOG: 00000: process 954591 still waiting for RowExclusiveLock on object 24610 of class 1262 of database 0 after 1000.140 ms 2025-01-24 07:52:25.847 GMT [954591]: [2-1]db=,user=,app=,client= DETAIL: Process holding the lock: 954584. Wait queue: 954591. 2025-01-24 07:52:25.847 GMT [954591]: [3-1]db=,user=,app=,client= LOCATION: ProcSleep, proc.c:1508 2025-01-24 07:52:29.862 GMT [954584]: [1-1]db=postgres,user=postgres,app=psql,client=[local] LOG: 00000: still waiting for backend with PID 954543 to accept ProcSignalBarrier 2025-01-24 07:52:29.862 GMT [954584]: [2-1]db=postgres,user=postgres,app=psql,client=[local] LOCATION: WaitForProcSignalBarrier, procsignal.c:421 2025-01-24 07:52:29.862 GMT [954584]: [3-1]db=postgres,user=postgres,app=psql,client=[local] STATEMENT: drop database amod (force);
Soon the sessions starts accumulating due to deadlock and it causes the database go inresponsive due to too many sessions error. Only way to resolve is to cancel the drop database statement by ctrl+c. This leads to database to be dropped go in invalid state and it is reported continuously in the postgresql log.
2025-01-28 07:47:34.975 GMT [1848700]: [1-1]db=amod,user=pmm,app=[unknown],client=10.8.99.136 FATAL: 55000: cannot connect to invalid database "amod" 2025-01-28 07:47:34.975 GMT [1848700]: [2-1]db=amod,user=pmm,app=[unknown],client=10.8.99.136 HINT: Use DROP DATABASE to drop invalid databases. 2025-01-28 07:47:34.975 GMT [1848700]: [3-1]db=amod,user=pmm,app=[unknown],client=10.8.99.136 LOCATION: InitPostgres, postinit.c:1117
Has anyone seen this behavior? How to circuvent this issue? Any ideas/known best methods sharing would help greatly.
Regards
Amod
Hi @AmodKakade12
I've often observed this behavior when logical replication is in use. Until I drop the subscriptions, creating or dropping a database tends to hang.
It's worth checking if logical replication is enabled or is not in use but still have inactive slots.
Another factor to consider is the checkpoint frequency.
thanks for reply. I dont have any logical replication setup. I could work around the problem by starting the database in single user mode and then drop the database
postgres@~]$ postgres --single postgresbackend> drop database XXX;[postgres@~]$ pg_ctl stop[postgres@***~]$ pg_ctl start
Regards
Amod
+91 98457 13545
On Wednesday, March 5, 2025 at 06:20:27 PM GMT+5:30, Haider Zahid ***@***.***> wrote:
Hi @AmodKakade12
I've often observed this behavior when logical replication is in use. Until I drop the subscriptions, creating or dropping a database tends to hang.
It's worth checking if logical replication is enabled or is not in use but still have inactive slots.
Another factor to consider is the checkpoint frequency.
— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you were mentioned.Message ID: @.***> haiderzahid98 left a comment (hapostgres/pg_auto_failover#1063) Hi @AmodKakade12
I've often observed this behavior when logical replication is in use. Until I drop the subscriptions, creating or dropping a database tends to hang.
It's worth checking if logical replication is enabled or is not in use but still have inactive slots.
Another factor to consider is the checkpoint frequency.
— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you were mentioned.Message ID: @.***>
Hi,
@DimCitus @JelteF @mtuncer
Appreciate help from the experts on this limitation. Reopening for further discussion!
Regards
Amod
Hi,
@DimCitus @JelteF @mtuncer
Hi
Further update. I have found only following workaround to solve this
- Stop the systemd daemon for pg_auto_failover service on all replica & primary
- Start the primary in single user mode as follows $ postgres --single postgres
- drop the database using drop database
; command & exit - Restart the systemd daemon for pg_auto_failover service on primary and then all replica.
- Database in question is dropped successfully.
Although this workaround is avaialble it would be good if there is a single pg_auto_failover command or defined process to carry out this operation.
regards
Amod
There must be something configured weirdly, because usually you can drop databases just fine with pg_auto_failover running.
Can you share your setup here maybe, then someone could help possibly.
Thanks for reply. I feel the PGAF healthcheck user keeps connected to the database which causes above issue. I can see following messages appearing on the DB log
Is there any way to halt the healthcheck user someway to complete such one time maintenance jobs?
DB log
2025-09-03 10:39:10.227 BST [3038132]: [261-1]db=postgres,user=postgres,app=psql,client=[local] STATEMENT: drop database amod;
2025-09-03 10:39:15.232 BST [3038132]: [262-1]db=postgres,user=postgres,app=psql,client=[local] LOG: 00000: still waiting for backend with PID 3037451 to accept ProcSignalBarrier
2025-09-03 10:39:15.232 BST [3038132]: [263-1]db=postgres,user=postgres,app=psql,client=[local] LOCATION: WaitForProcSignalBarrier, procsignal.c:415
[postgres@log]$ ps -ef|grep 3037451
postgres 3037451 2184305 0 10:31 ? 00:00:00 postgres: pg_auto_failover monitor healthcheck worker postgres
postgres 3039827 3034152 0 10:40 pts/4 00:00:00 grep --color=auto 3037451
[postgres@log]$
postgres=# SELECT * FROM pg_stat_activity WHERE pid =3037451;
datid | datname | pid | leader_pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start
| query_start | state_change | wait_event_type | wait_event | state | backend_xid | backend_xmin | query_id | query | backend_type
-------+----------+---------+------------+----------+----------+--------------------------------------+-------------+----------- ------+-------------+-------------------------------+------------
+-------------+--------------+-----------------+------------+-------+-------------+--------------+----------+-------+----------------
5 | postgres | 3037451 | | 10 | postgres | pg_auto_failover health check worker | | | | 2025-09-03 10:31:02.925617+01 |
| | | Client | ClientRead | | | | | | pgautofailover
(1 row)
postgres=# select pid,
usename,
pg_blocking_pids(pid) as blocked_by,
query as blocked_query
from pg_stat_activity where query like '%drop%';
pid | usename | blocked_by | blocked_query
---------+----------+------------+---------------------------------------------------
3038132 | postgres | {} | drop database amod;
3039906 | postgres | {} | select pid, +
2 rows)
postgres=#
Further I have used following commands to create the monitor and database
Monitor
yum -y install pg_auto_failover_17
pg_autoctl create monitor --pgctl /usr/pgsql-17/bin/pg_ctl --pgdata <PGDATA home> --hostname
Database yum -y install pg_auto_failover_17 pg_autoctl create postgres --pgdata <PGDATA home> --name n1 --auth scram-sha-256 --candidate-priority 80 --replication-quorum true --ssl-self-signed require --username postgres --dbname postgres --pgport <PGPORT> --pghost <PrimaryHostName> --hostname <PrimaryHostName> --pgctl /usr/pgsql-17/bin --maximum-backup-rate 100G pg_autoctl run
Does inserting data (and committing) work against a table that you create in the database before?
I had similar behaviours when communication between the worker nodes was not healthy.
Does inserting data (and committing) work against a table that you create in the database before?
I had similar behaviours when communication between the worker nodes was not healthy.
I was able to perforrm all DML against tables in the database as usual. Problem comes only when the drop database command is issued.
@dimitri
Appreciate if users of this great tool can provide their experience on this. This looks like a limitation to the software and requires outage to fix. It would be great if the developers can include the fix/solution to this issue
It seems that dropping database is not an issue anymore as I tested in postgres 18.1 + pg_auto_failover 2.2 on rocky linux 10. But If you drop your database set in .local config and then shutdown your pgautofailover service, and you will find service can't be brought online again. You could solove it through "pg_ctl start" to initiate a local instance, and then create the database.