Explore integration of replication slots for HA with `max_slot_wal_keep_size`
PostgreSQL 13 introduces support for `max_slot_wal_keep_size`. From the documentation:
Specify the maximum size of WAL files that replication slots are allowed to retain in the pg_wal directory at checkpoint time. If max_slot_wal_keep_size is -1 (the default), replication slots may retain an unlimited amount of WAL files. Otherwise, if restart_lsn of a replication slot falls behind the current LSN by more than the given size, the standby using the slot may no longer be able to continue replication due to removal of required WAL files. You can see the WAL availability of replication slots in pg_replication_slots. If this value is specified without units, it is taken as megabytes. This parameter can only be set in the postgresql.conf file or on the server command line.
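To make the rule in that excerpt concrete, here is a minimal sketch of the comparison it describes: how far a slot's `restart_lsn` lags behind the current LSN, measured against the configured limit in megabytes. The function names are hypothetical, and the check is a simplification: the server evaluates this at checkpoint time and removes WAL in whole segments, so the real cut-off will not match this byte-exact comparison.

```python
def parse_lsn(lsn: str) -> int:
    """Convert a textual LSN such as '16/B374D848' into a byte position.

    An LSN is printed as two hexadecimal numbers separated by a slash:
    the high 32 bits and the low 32 bits of a 64-bit position.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def slot_exceeds_limit(current_lsn: str, restart_lsn: str, limit_mb: int) -> bool:
    """Return True if the slot lags by more than limit_mb megabytes.

    Hypothetical helper, for illustration only. A negative limit mirrors
    max_slot_wal_keep_size = -1, meaning slots may retain unlimited WAL.
    """
    if limit_mb < 0:
        return False
    lag_bytes = parse_lsn(current_lsn) - parse_lsn(restart_lsn)
    return lag_bytes > limit_mb * 1024 * 1024
```

In practice the server already reports this state through the `pg_replication_slots` view mentioned above, so a helper like this would only be useful for reasoning about the threshold, not for replacing that view.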
As part of the auto-pilot settings, when WAL files live on a separate volume and continuous archiving is in place, we should be able to set this value automatically to prevent disk exhaustion. Should we provide an option for auto-tuning replication slots, or simply document the setting?
Ideally, we would provide a threshold based on the volume size (either `storage`, or `walStorage` if specified) and automatically configure the corresponding value in PostgreSQL. Then, when retained WAL goes over that limit, Postgres invalidates those slots; we disable the feature, wait for the slots to be deleted, and reactivate it. Any thoughts?
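As a starting point for discussion, the threshold idea could look roughly like the sketch below: take the size of the volume holding the WAL and a percentage, and derive a value for `max_slot_wal_keep_size`. Everything here is an assumption for illustration: the function names, the 80% default, and the choice to express the result in megabytes are not part of any existing implementation.

```python
MIB = 1024 * 1024


def suggested_keep_size_mb(volume_bytes: int, threshold_percent: int = 80) -> int:
    """Derive a max_slot_wal_keep_size value (in MB) from a volume size.

    Hypothetical helper: volume_bytes is the capacity of the volume that
    holds pg_wal, threshold_percent is the share of it that replication
    slots may retain. The 80% default is an arbitrary placeholder.
    """
    if not 1 <= threshold_percent <= 100:
        raise ValueError("threshold_percent must be between 1 and 100")
    # Integer arithmetic avoids float rounding; the result rounds down.
    return (volume_bytes * threshold_percent) // (100 * MIB)


def render_setting(volume_bytes: int, threshold_percent: int = 80) -> str:
    """Render the value as a postgresql.conf line with an explicit unit."""
    size_mb = suggested_keep_size_mb(volume_bytes, threshold_percent)
    return f"max_slot_wal_keep_size = '{size_mb}MB'"
```

For a 10 GiB volume and the placeholder 80% threshold, `render_setting(10 * 1024**3)` yields `max_slot_wal_keep_size = '8192MB'`. A real implementation would also need to leave headroom for WAL that is retained for other reasons, such as segments still waiting to be archived.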
See also this article from our dear Alvaro Herrera: https://www.2ndquadrant.com/en/blog/pg13-slot-size-limit/