barman backup does not complete when I use rsync and reuse_link
We set up Barman backups using rsync and reuse_link to back up incrementally. A full backup completed 5 days ago, and 2 days later an incremental backup with reuse_link also completed. However, we have not been able to take a backup in the last 2 days, despite several manual attempts.
barman list-backup mydb
mydb 20230524T014506 - STARTED
mydb 20230521T044506 - Sun May 21 09:54:48 2023 - Size: 5.1 TiB - Wal Size: 402.1 GiB (tablespaces: repo)
mydb 20230519T014506 - Fri May 19 10:34:48 2023 - Size: 5.1 TiB - Wal Size: 100.2 GiB (tablespaces: repo)
We see that end_offset and end_wal are null, but no error is reported. The barman diagnose output is:
barman diagnose
.... .... ....
        },
        "20230524T014506": {
            "backup_id": "20230524T014506",
            "backup_label": null,
            "begin_offset": 4012,
            "begin_time": "2023-05-24T01:45:06.949716+03:00",
            "begin_wal": "000001BC0000845C000000CB",
            "begin_xlog": "845C/C8001028",
            "config_file": "/data/postgresql.conf",
            "copy_stats": null,
            "deduplicated_size": null,
            "end_offset": null,
            "end_time": null,
            "end_wal": null,
            "end_xlog": null,
            "error": null,
            "hba_file": "/data/pg_hba.conf",
            "ident_file": "/data/pg_ident.conf",
            "included_files": [
                "/data/postgresql.auto.conf"
            ],
            "mode": "rsync-concurrent",
            "pgdata": "/data",
            "server_name": "mydb",
            "size": null,
            "status": "STARTED",
            "systemid": "66841250581234512355",
            "tablespaces": [
                [
                    "repo",
                    6453213,
                    "/pg_tbl/repo"
                ]
            ],
            "timeline": 222,
            "version": 120008,
            "xlog_segment_size": 16777216
        }
    },
    "config": {
        "active": true,
        "archiver": false,
        "archiver_batch_size": 0,
        "backup_directory": "/barman/mydb",
        "backup_method": "rsync",
        "backup_options": "concurrent_backup",
        "bandwidth_limit": null,
        ... .... .... ..
    "status": {
        "archive_command": "cp %p /WAL_archive/%f",
        "archive_mode": "on",
        "archive_timeout": 0,
        "checkpoint_timeout": 900,
        "config_file": "/data/postgresql.conf",
        "connection_error": null,
        "current_archived_wals_per_second": 0.152131251231,
        "current_lsn": "8461/C14817C0",
        "current_size": 210798164118.0,
        "current_xlog": "000001BC00008461000000C1",
        "data_checksums": "on",
        "data_directory": "/data",
        "failed_count": 0,
        "has_backup_privileges": true,
        "hba_file": "/data/pg_hba.conf",
        "hot_standby": "on",
        "ident_file": "/data/pg_ident.conf",
        "included_files": [
            "/data/postgresql.auto.conf"
        ],
        "is_archiving": true,
        "is_in_recovery": false,
        "is_superuser": true,
        .... .... ....
There is no error in barman.log. Sometimes it copies 17 GB (50 GB in another attempt), and after a while there are no rsync processes left on the servers.
Hi @ahmetmelihbasbug - since end_offset and end_wal are null and the status is STARTED we know the backup process terminated before the completion of the backup (if the main backup process had run to completion then the end state would be FAILED).
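To spot backups left in this state, the list-backup output can be filtered for entries whose status is STARTED. This is only a minimal sketch: the function name started_backups is hypothetical, and it assumes the line layout shown earlier in this thread (server name, backup ID, a dash, then STARTED). A backup that is still running also shows as STARTED, so an ID printed here only indicates an interrupted backup if no barman backup process is currently active.

```shell
#!/bin/bash
# Print the IDs of backups that `barman list-backup` reports as STARTED.
# Reads list-backup output on stdin and assumes lines of the form:
#   <server> <backup_id> - STARTED
# (started_backups is a hypothetical helper name, not part of Barman.)
started_backups() {
    awk 'NF == 4 && $3 == "-" && $4 == "STARTED" { print $2 }'
}
```

Possible usage: barman list-backup mydb | started_backups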
Some possible reasons which are worth investigating further are:
- The Barman process could be being killed by the OOM killer. Check the output of dmesg on your Barman host to see if there is any evidence of OOM killer activity.
- The backup process might be running with a number of parallel jobs higher than the value of sshd's MaxStartups (usually this defaults to 10) meaning that some Rsync connections are terminated during the backup.
- The Barman process might be running under nohup - there are some scenarios, e.g. the SSH connection used to create the nohup process timing out - which can cause Barman's worker processes to receive a SIGHUP and terminate.
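The first two checks above could be scripted roughly as follows. This is a sketch under assumptions: the helper names oom_lines and jobs_exceed_maxstartups are hypothetical, and the grep patterns are common kernel OOM messages whose exact wording varies between kernel versions.

```shell
#!/bin/bash
# Filter kernel log text on stdin down to lines that look like OOM killer
# activity. The patterns are typical kernel messages, not an exhaustive list.
oom_lines() {
    grep -i -E 'out of memory|oom-killer|killed process' || true
}

# Succeed (exit 0) if the number of parallel jobs is greater than the
# "start" value of sshd's MaxStartups. The value may be a plain number
# or the start:rate:full form, so only the part before the first colon is used.
jobs_exceed_maxstartups() {
    local njobs=$1 start=${2%%:*}
    [ "$njobs" -gt "$start" ]
}
```

Possible usage: dmesg -T | oom_lines on the Barman host, and sshd -T | grep -i maxstartups (run as root) on the host that accepts the SSH connections to obtain the effective MaxStartups value to compare with the -j argument.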
A couple of related questions:
- How are you running the barman backup process? Is it run under cron, or via some other means?
- What arguments are you using in your barman backup command?
@mikewallace1979 Hello - we have a .sh script that we run from crontab or from the command line:
#!/bin/bash
/usr/bin/barman cron
/home/barman/deletedfailed.sh
/usr/bin/barman backup mydb -j 10
There are also hourly pg_dump jobs in the crontab on this instance.
- MaxStartups is set to 100.
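Since barman.log shows no error, one way to learn how the backup command ends is to record its exit status from the script above. The following is a minimal sketch, not part of Barman: run_logged and the LOG_FILE default path are hypothetical names. An exit status above 128 usually means the process was terminated by a signal (for example 137 for SIGKILL, which the OOM killer sends, or 129 for SIGHUP). If the wrapper script itself is killed, nothing is logged.

```shell
#!/bin/bash
# Hypothetical log location; change it to suit your host.
: "${LOG_FILE:=/tmp/barman_backup_status.log}"

# Run the given command, append its exit status to LOG_FILE,
# and return the same status to the caller.
run_logged() {
    local status
    "$@"
    status=$?
    if [ "$status" -gt 128 ]; then
        # Shells report death-by-signal as 128 + signal number.
        echo "$(date '+%F %T') '$*' exited with $status (likely signal $((status - 128)))" >> "$LOG_FILE"
    else
        echo "$(date '+%F %T') '$*' exited with $status" >> "$LOG_FILE"
    fi
    return "$status"
}
```

In the cron script, the backup line could then become: run_logged /usr/bin/barman backup mydb -j 10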
Hi @ahmetmelihbasbug, when you say you cannot take a backup, do you mean that you run barman backup mydb and nothing happens, or that the backup hangs no matter what you do?