[BUG] MySQL PITR for incremental backup doesn't work.
Database name
MySQL
WAL-G Version
master
Describe your problem
Im unable to run PITR with wal-g mysql when the incremental backup was created by using xtrabackup-push and when the binlog-replay is provided with both backup name and timestamp in rfc3339 format.
ERROR 1062 (23000) at line 45: Duplicate entry '1105' for key 'okay.PRIMARY'
ERROR: 2025/08/09 23:53:42.088718 failed to replay mysql-bin.000148: exit status 1
INFO: 2025/08/09 23:53:42.089524 downloading mysql-bin.000151 into /var/lib/mysql/dump/binlog/mysql-bin.000151
ERROR: 2025/08/09 23:53:42.100352 Failed to fetch binlogs: exit status 1
Steps to reproduce
- Setup the tables for testing.
CREATE TABLE IF NOT EXISTS okay ( id BIGINT AUTO_INCREMENT PRIMARY KEY, created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL ); - I run the following command everyone minute to fill up the table.
INSERT INTO okay (created_at) VALUES (CURRENT_TIMESTAMP); - Take incremental backups from the source node.
wal-g xtrabackup-push --count-journals --full=false --config walg.env - On the recovery node, run backup fetch and binlog replay.
wal-g backup-fetch --since stream_20250809T230002Z --config walg.env wal-g binlog-replay --since stream_20250809T230002Z --until 2025-08-09T23:30:30Z --config walg.env - The
backup-fetchworks since I can see the data up to when the incremental backup was created. But when thebinlog-replayruns, it gives error causing pitr recovery failed.
Config and WAL-G logs
Here is the walg config:
OSS_ACCESS_KEY_ID=--HIDDEN--
OSS_ACCESS_KEY_SECRET=--HIDDEN--
OSS_ENDPOINT=https://oss-ap-southeast-5-internal.aliyuncs.com
OSS_REGION=ap-southeast-5
OSS_ROLE_ARN=[redacted]
OSS_ROLE_SESSION_NAME=test-hz-ha-my-1-ot-01
TOTAL_BG_UPLOADED_LIMIT=32
WALG_BACKUP_DOWNLOAD_MAX_RETRY=1
WALG_COMPRESSION_METHOD=lz4
WALG_DELTA_MAX_STEPS=2
WALG_DIRECT_IO=false
WALG_DOWNLOAD_CONCURRENCY=32
WALG_DOWNLOAD_FILE_RETRIES=15
WALG_ENVELOPE_CACHE_EXPIRATION=0
WALG_FAILOVER_STORAGES_CACHE_LIFETIME=15m
WALG_FAILOVER_STORAGES_CHECK_TIMEOUT=30s
WALG_INTEGRITY_MAX_DELAYED_WALS=0
WALG_LIBSODIUM_KEY_TRANSFORM=none
WALG_LOG_LEVEL=DEVEL
WALG_MYSQL_BACKUP_PREPARE_COMMAND=xtrabackup --prepare --target-dir=/var/lib/mysql
WALG_MYSQL_BINLOG_DST=/var/lib/mysql/dump/binlog
WALG_MYSQL_BINLOG_REPLAY_COMMAND=mysqlbinlog --stop-datetime="$WALG_MYSQL_BINLOG_END_TS" "$WALG_MYSQL_CURRENT_BINLOG" | mysql -u "***" -p"***"
WALG_MYSQL_DATASOURCE_NAME=***:****@tcp(127.0.0.1:3306)/
WALG_MYSQL_INCREMENTAL_BACKUP_DST=/tmp
WALG_OSS_PREFIX=oss://test-bucket/test
WALG_PREVENT_WAL_OVERWRITE=false
WALG_SERIALIZER_TYPE=json_default
WALG_SKIP_REDUNDANT_TARS=false
WALG_STORE_ALL_CORRUPT_BLOCKS=false
WALG_STREAM_CREATE_COMMAND=xtrabackup --backup --stream=xbstream --datadir=/var/lib/mysql --user=*** --password=***
WALG_STREAM_RESTORE_COMMAND=xbstream -x -C /var/lib/mysql
WALG_STREAM_SPLITTER_BLOCK_SIZE=1048576
WALG_TAR_DISABLE_FSYNC=false
WALG_TAR_SIZE_THRESHOLD=1073741823
WALG_UPLOAD_CONCURRENCY=32
WALG_UPLOAD_DISK_CONCURRENCY=10
WALG_UPLOAD_QUEUE=2
WALG_UPLOAD_WAL_METADATA=NOMETADATA
WALG_USE_COPY_COMPOSER=false
WALG_USE_DATABASE_COMPOSER=false
WALG_USE_RATING_COMPOSER=false
WALG_USE_REVERSE_UNPACK=false
WALG_USE_WAL_DELTA=false
WALG_VERIFY_PAGE_CHECKSUMS=false
WALG_WITHOUT_FILES_METADATA=false
Here is the log from xtrabackup-push log for the backup I used on the command above:
INFO: 2025/08/09 23:00:02.581812 LATEST backup is: 'stream_20250809T220002Z'
INFO: 2025/08/09 23:00:02.607523 Delta backup from stream_20250809T220002Z with LSN 21286423.
INFO: 2025/08/09 23:00:02.607596 Command to execute: /bin/bash -c xtrabackup --backup --stream=xbstream --datadir=/var/lib/mysql --user=**** --password=******* --extra-lsndir=/tmp/wal-g2372090394 --incremental-lsn=21286423
INFO: 2025/08/09 23:00:05.424289 FILE PATH: stream_20250809T230002Z/stream.lz4
INFO: 2025/08/09 23:00:05.433113 Backup sentinel: {"Tool":"WALG_XTRABACKUP_TOOL","BinLogStart":"mysql-bin.000148","BinLogEnd":"mysql-bin.000149","StartLocalTime":"2025-08-09T23:00:02.572551Z","StopLocalTime":"2025-08-09T23:00:05.433088Z","UncompressedSize":6846673,"CompressedSize":211454,"Hostname":"al-sy-id-s-scp-test-walg-test-my-02","ServerUUID":"0ee368ba-74c9-11f0-be10-00163e05996e","ServerVersion":"8.0.42-0ubuntu0.22.04.2","ServerArch":"amd64","ServerOS":"linux","IsPermanent":false,"IsIncremental":true,"LSN":21363555,"DeltaLSN":21286423,"DeltaFrom":"stream_20250809T220002Z","DeltaFullName":"stream_20250809T220002Z","DeltaCount":1}
INFO: 2025/08/09 23:00:05.510115 uploaded journal info for stream_20250809T230002Z
Here is the binlog-replay log from the command I used above.
Replaying binglog starting from stream_20250809T230002Z until 2025-08-09T23:30:30Z
WARNING: 2025/08/09 23:53:41.789553 WALG_FAILOVER_STORAGES_CHECK_TIMEOUT is unknown
WARNING: 2025/08/09 23:53:41.789580 WALG_FAILOVER_STORAGES_CACHE_LIFETIME is unknown
WARNING: 2025/08/09 23:53:41.789598 We found that some variables in your config file detected as 'Unknown'.
If this is not right, please create issue https://github.com/wal-g/wal-g/issues/new
INFO: 2025/08/09 23:53:42.001991 Backup sentinel: {"Tool":"WALG_XTRABACKUP_TOOL","BinLogStart":"mysql-bin.000148","BinLogEnd":"mysql-bin.000149","StartLocalTime":"2025-08-09T23:00:02.572551Z","StopLocalTime":"2025-08-09T23:00:05.433088Z","UncompressedSize":6846673,"CompressedSize":211454,"Hostname":"al-sy-id-s-scp-test-walg-test-my-02","ServerUUID":"0ee368ba-74c9-11f0-be10-00163e05996e","ServerVersion":"8.0.42-0ubuntu0.22.04.2","ServerArch":"amd64","ServerOS":"linux","IsPermanent":false,"IsIncremental":true,"LSN":21363555,"DeltaLSN":21286423,"DeltaFrom":"stream_20250809T220002Z","DeltaFullName":"stream_20250809T220002Z","DeltaCount":1}
INFO: 2025/08/09 23:53:42.015320 Backup sentinel file: stream_20250809T230002Z_backup_stop_sentinel.json (2025-08-09 23:00:05 +0000 UTC)
INFO: 2025/08/09 23:53:42.021294 Backup start binlog: mysql-bin.000148.lz4 (2025-08-09 22:50:01 +0000 UTC)
INFO: 2025/08/09 23:53:42.021318 Fetching binlogs since 2025-08-09 22:50:01 +0000 UTC until 2025-08-09 23:30:30 +0000 UTC
INFO: 2025/08/09 23:53:42.026602 downloading mysql-bin.000148 into /var/lib/mysql/dump/binlog/mysql-bin.000148
INFO: 2025/08/09 23:53:42.058668 downloading mysql-bin.000149 into /var/lib/mysql/dump/binlog/mysql-bin.000149
INFO: 2025/08/09 23:53:42.058749 replaying mysql-bin.000148 ...
mysql: [Warning] Using a password on the command line interface can be insecure.
INFO: 2025/08/09 23:53:42.083465 downloading mysql-bin.000150 into /var/lib/mysql/dump/binlog/mysql-bin.000150
ERROR 1062 (23000) at line 45: Duplicate entry '1105' for key 'okay.PRIMARY'
ERROR: 2025/08/09 23:53:42.088718 failed to replay mysql-bin.000148: exit status 1
INFO: 2025/08/09 23:53:42.089524 downloading mysql-bin.000151 into /var/lib/mysql/dump/binlog/mysql-bin.000151
ERROR: 2025/08/09 23:53:42.100352 Failed to fetch binlogs: exit status 1
Found the issue: looks like I can only use full backup as a start for PITR. Unfortunately in mysql it the backup name for incremental backup has no additional _D_ suffix to mark the delta backup like postgresql. So it is quite hard to check which one is full backup unless we check into the generated sentinel.
Im closing this issue. Thanks!
Im reopening this issue since Im seeing the same issue again. Looks like my assumption earlier about not using delta backup was wrong since i still get the same error even when im using full backup.
@ostinru any pointer on this?
I am quite busy these days. Will try to reproduce issue tomorrow.
As I can see - you asking wal-g to restore following combo:
- stream_20250809T220002Z (base backup)
- stream_20250809T230002Z (incremental backup) [OPTIONAL]
- replay binlogs
And all performed steps sounds correct.
In order to reduce scope of scenarios to reproduce, @imrenagi, could you provide following information:
- do you use GTIDs? (
select @@gtid_mode;)
No @ostinru . it is OFF.
Ok. Re-read source code... currently wal-g requires GTIDs to be enabled - that is how MySQL will skip events it already seen.
Assumption that we have GTID = ON was made ages ago, and greatly simplifies all backup/restore logic by moving complex logic to MySQL itself.
Alternative approach (without GTIDs) that wal-g is not supporting - is to keep track of binlog-name and binlog-position. And run mysqlbinlog --start-position=$WALG_MYSQL_CURRENT_BINLOG_POSITION "$WALG_MYSQL_CURRENT_BINLOG". However I think it may be fragile for multi-host clusters where multiple switchover/failovers occur.
Thanks for confirming @ostinru . Currently we found workaround by checking the content of xtrabackup_binlog_info file in /var/lib/mysql to get information whether we need to use --start-position argument or not.
The logic is:
- if the
WALG_MYSQL_CURRENT_BINLOGsimilar with binlog mentioned on thextrabackup_binlog_info, then we use--start-positionargument. - If
WALG_MYSQL_CURRENT_BINLOGis later than binlog mentioned on thextrabackup_binlog_infothen we dont use--start-positionargument. - Otherwise, do nothing so that that binlog is skipped and not replayed.
just wondering whether this logic can be implemented in wal-g itself or not since we currently put the logic above on a script which will be used by WALG_MYSQL_BINLOG_REPLAY_COMMAND. If not, what would be the consideration/concerns that we need to be aware of? I was thinking if this is something that can be implemented on wal-g, but after looking at the code, as you said, only GTID = ON assumption is being check.
Another thing that Im not sure is whether it is okay to read the xtrabackup_binlog_info on walg code because AFAIK wal-g only uses data/files stored on the remote storage only (e.g. for reading backup sentinel data, etc) to decide what to do next.
Another thing that Im not sure is whether it is okay to read the
xtrabackup_binlog_infoon walg code because AFAIK wal-g only uses data/files stored on the remote storage only (e.g. for reading backup sentinel data, etc) to decide what to do next.
Yes. You shouldn't rely on files that you will not be able to access during disaster recovery. It seems to be good idea to add binlog_name and binlog_position to backup metadata and then pass as $WALG_MYSQL_CURRENT_BINLOG_POSITION env variable.
Research: How to extract binlog info...
It seems that we can get binlog position from --extra-lsndir's xtrabackup_info file. It looks like it should contain line:
binlog_pos = filename '%filename%', position '%position%'[, GTID of the last change '%gtid_executed%']
TODO: check actual file content for supported MySQL versions.
just wondering whether this logic can be implemented in wal-g itself or not
It sounds that it is possible to add this logic to wal-g. There will no regressions here for GTID-users.
If not, what would be the consideration/concerns that we need to be aware of?
The hardest part with PiTR is to to guarantee that it will work in presence of switchover/failovers. When new master promoted, wal-g should upload:
- relay logs (to cover cases when old master failed to upload part of its binlogs)
- binary logs
And here it became complicated. After failover we may have original binlog + relaylog in S3, both containing same events. And... I don't know how to point binlog_name + binlog_position in a way we don't replay single event twice. In theory, each binlog ends with ROTATE event that tells which file replay next. Obviously, old master will point 'next' to its binlog (that will not be uploaded) and in fact we should find relaylog as 'next'.
BTW, @imrenagi , do you use any tools to orchestrate HA MySQL?
Thanks for the detailed explanation @ostinru .
BTW, do you use any tools to orchestrate HA MySQL?
No, we are not using any yet. Currently it is just standard master replica setup and we run the xtrabackup-push and binlog-push command from the replica.
The hardest part with PiTR is to to guarantee that it will work in presence of switchover/failovers.
True. In our case, I was thinking before that in the case when we need to promote the replica to a new master, it would be easier for us to just setup new replica from newly promoted master and then "recreate" new walg setup from the replica itself so that we dont need to worry about which binlog the old setup pointing to.
It seems to be good idea to add binlog_name and binlog_position to backup metadata
Sure, let me try to locate and understand the code for this first.
It seems to be good idea to add binlog_name and binlog_position to backup metadata
Isn't binlog_name already stored in backup metadata? The name is BinLogStart. We only need to add the binlog position and can ignore the value if we enabled GTID.
Research: How to extract binlog info...
can we use SHOW MASTER STATUS?
show master status\G;
*************************** 1. row ***************************
File: mysql-bin.001529
Position: 454
it doesn't depend on xtrabackup, and also works without GTID.
Assumption that we have GTID = ON was made ages ago
Looks like this bug is related to GTID disabled. @imrenagi mentioned that his GTID is off. I managed to recreate the issue by simply disabling the GTID.
I tried to query the current binlog using SHOW MASTER STATUS while triggering the wal-g xtrabackup-push command at the same time. Both shows different value (wal-g value is older).
- mysql (
mysql-bin.001529,454)
show master status\G;
*************************** 1. row ***************************
File: mysql-bin.001529
Position: 454
- wal-g (
mysql-bin.001527,no position stored in wal-g)
INFO: 2025/08/18 03:48:36.420510 Backup sentinel: {"Tool":"WALG_XTRABACKUP_TOOL","BinLogStart":"mysql-bin.001527","BinLogEnd":"mysql-bin.001527","StartLocalTime":"2025-08-18T03:48:33.57217Z","StopLocalTime":"2025-08-18T03:48:36.420481Z","UncompressedSize":5765329,"CompressedSize":196386,"Hostname":"******","ServerUUID":"*******","ServerVersion":"8.0.42-0ubuntu0.22.04.2","ServerArch":"amd64","ServerOS":"linux","IsPermanent":false,"IsIncremental":true,"LSN":36715682,"DeltaLSN":36655179,"DeltaFrom":"stream_20250818T030002Z","DeltaFullName":"stream_20250818T020002Z","DeltaCount":2}
After checking the code, it might be related to empty struct on gtidStart value (because gtid is disabled). Then it is compared with another empty struct on prevGtid, which always returns true. It makes getLastUploadedBinlogBeforeGTID always returns the first logFiles it sees.
Isn't
binlog_namealready stored in backup metadata? The name isBinLogStart.
No. You should trust xtrabackup in this case. wal-g can select positions before/after backup. However only xtrabackup knows at which point of database history it created snapshot.
Assumption that we have GTID = ON was made ages ago
Looks like this bug is related to GTID disabled.
I would agree that this is a documentation bug. We should clearly describe how wal-g works, what prerequisites it expects, etc. Working without GITDs will also require some prerequisites e.g. 'only xtrabackup' supported, some limitations on binlog/relaylog file names(?), etc
Isn't binlog_name already stored in backup metadata? The name is BinLogStart.
No. You should trust xtrabackup in this case.
yes. I was talking about the SentinelDto created by wal-g. It already has BinLogStart
I would agree that this is a documentation bug. We should clearly describe how wal-g works, what prerequisites it expects, etc.
Yes. would work. We could start another thread if we want to support gtid disabled, though.
Yes. would work. We could start another thread if we want to support gtid disabled, though.
Is there any reason not to enable GTIDs?
Is there any reason not to enable GTIDs?
TBH Im not sure. The existing setup we have already have GTID disabled. We haven't really look into what needs to be done and what would be the impact if we enable the GTID on the running mysql.
@JonathanGun you probably want to contribute to add the case for GTID disabled to the mysql documentation?