tiflow icon indicating copy to clipboard operation
tiflow copied to clipboard

DM: MariaDB 11.8 DPANIC due to GTID set update failure

Open dveeden opened this issue 1 month ago • 11 comments

What did you do?

Source: MariaDB 11.8.5

podman run --rm --name mariadb -e MARIADB_ALLOW_EMPTY_ROOT_PASSWORD=1 -p 3306:3306 mariadb:11.8 --log-bin=mariadb-log --binlog-format=ROW --binlog-legacy-event-pos

Source

source-id: "mariadb-01"
from:
  host: "127.0.0.1"
  port: 3306
  user: "root"

Task

name: mariadb-to-tidb
task-mode: all
mysql-instances:
  - source-id: "mariadb-01"
    block-allow-list: "test"
target-database:
  host: "127.0.0.1"
  port: 4000
  user: "root"
block-allow-list:
  test:
    do-dbs: ["*"]

Statements on source:

create schema test;
use test
create table t (id int primary key);
insert into t values (1);
alter table t add column c1 varchar(255);
update t set c1='test 123';

What did you expect to see?

Regular working replication

What did you see instead?

[2025/11/27 08:57:14.191 +01:00] [DPANIC] [binlog_locations.go:160] ["failed to update GTID set"] [GTID=0-1-5] [error="invalid GTID format, must UUID:interval[:interval]"] [errorVerbose="invalid GTID format, must UUID:interval[:interval]\ngithub.com/go-mysql-org/go-mysql/mysql.ParseUUIDSet\n\tgithub.com/go-mysql-org/[email protected]/mysql/mysql_gtid.go:214\ngithub.com/go-mysql-org/go-mysql/mysql.ParseMysqlGTIDSet\n\tgithub.com/go-mysql-org/[email protected]/mysql/mysql_gtid.go:411\ngithub.com/go-mysql-org/go-mysql/mysql.(*MysqlGTIDSet).Update\n\tgithub.com/go-mysql-org/[email protected]/mysql/mysql_gtid.go:473\ngithub.com/pingcap/tiflow/dm/syncer/binlogstream.(*locationRecorder).setCurEndGTID\n\tgithub.com/pingcap/tiflow/dm/syncer/binlogstream/binlog_locations.go:158\ngithub.com/pingcap/tiflow/dm/syncer/binlogstream.(*locationRecorder).update.func2\n\tgithub.com/pingcap/tiflow/dm/syncer/binlogstream/binlog_locations.go:228\ngithub.com/pingcap/tiflow/dm/syncer/binlogstream.(*locationRecorder).update\n\tgithub.com/pingcap/tiflow/dm/syncer/binlogstream/binlog_locations.go:180\ngithub.com/pingcap/tiflow/dm/syncer/binlogstream.locationStream.GetEvent\n\tgithub.com/pingcap/tiflow/dm/syncer/binlogstream/streamer_controller.go:123\ngithub.com/pingcap/tiflow/dm/syncer/binlogstream.(*StreamerController).getEvent\n\tgithub.com/pingcap/tiflow/dm/syncer/binlogstream/streamer_controller.go:533\ngithub.com/pingcap/tiflow/dm/syncer/binlogstream.(*StreamerController).GetEvent\n\tgithub.com/pingcap/tiflow/dm/syncer/binlogstream/streamer_controller.go:337\ngithub.com/pingcap/tiflow/dm/syncer.(*Syncer).Run\n\tgithub.com/pingcap/tiflow/dm/syncer/syncer.go:2153\ngithub.com/pingcap/tiflow/dm/syncer.(*Syncer).Process\n\tgithub.com/pingcap/tiflow/dm/syncer/syncer.go:757\nruntime.goexit\n\truntime/asm_amd64.s:1693"] [stack="github.com/pingcap/tiflow/dm/syncer/binlogstream.(*locationRecorder).setCurEndGTID\n\tgithub.com/pingcap/tiflow/dm/syncer/binlogstream/binlog_locations.go:160\ngithub.com/pingcap/tiflow/dm/syncer/binlogstream.(*locationRecorder).update.func2\n\tgithub.com/pingcap/tiflow/dm/syncer/binlogstream/binlog_locations.go:228\ngithub.com/pingcap/tiflow/dm/syncer/binlogstream.(*locationRecorder).update\n\tgithub.com/pingcap/tiflow/dm/syncer/binlogstream/binlog_locations.go:180\ngithub.com/pingcap/tiflow/dm/syncer/binlogstream.locationStream.GetEvent\n\tgithub.com/pingcap/tiflow/dm/syncer/binlogstream/streamer_controller.go:123\ngithub.com/pingcap/tiflow/dm/syncer/binlogstream.(*StreamerController).getEvent\n\tgithub.com/pingcap/tiflow/dm/syncer/binlogstream/streamer_controller.go:533\ngithub.com/pingcap/tiflow/dm/syncer/binlogstream.(*StreamerController).GetEvent\n\tgithub.com/pingcap/tiflow/dm/syncer/binlogstream/streamer_controller.go:337\ngithub.com/pingcap/tiflow/dm/syncer.(*Syncer).Run\n\tgithub.com/pingcap/tiflow/dm/syncer/syncer.go:2153\ngithub.com/pingcap/tiflow/dm/syncer.(*Syncer).Process\n\tgithub.com/pingcap/tiflow/dm/syncer/syncer.go:757"]
invalid GTID format, must UUID:interval[:interval]
github.com/go-mysql-org/go-mysql/mysql.ParseUUIDSet
	github.com/go-mysql-org/[email protected]/mysql/mysql_gtid.go:214
github.com/go-mysql-org/go-mysql/mysql.ParseMysqlGTIDSet
	github.com/go-mysql-org/[email protected]/mysql/mysql_gtid.go:411
github.com/go-mysql-org/go-mysql/mysql.(*MysqlGTIDSet).Update
	github.com/go-mysql-org/[email protected]/mysql/mysql_gtid.go:473
github.com/pingcap/tiflow/dm/syncer/binlogstream.(*locationRecorder).setCurEndGTID
	github.com/pingcap/tiflow/dm/syncer/binlogstream/binlog_locations.go:158
github.com/pingcap/tiflow/dm/syncer/binlogstream.(*locationRecorder).update.func2
	github.com/pingcap/tiflow/dm/syncer/binlogstream/binlog_locations.go:228
github.com/pingcap/tiflow/dm/syncer/binlogstream.(*locationRecorder).update
	github.com/pingcap/tiflow/dm/syncer/binlogstream/binlog_locations.go:180
github.com/pingcap/tiflow/dm/syncer/binlogstream.locationStream.GetEvent
	github.com/pingcap/tiflow/dm/syncer/binlogstream/streamer_controller.go:123
github.com/pingcap/tiflow/dm/syncer/binlogstream.(*StreamerController).getEvent
	github.com/pingcap/tiflow/dm/syncer/binlogstream/streamer_controller.go:533
github.com/pingcap/tiflow/dm/syncer/binlogstream.(*StreamerController).GetEvent
	github.com/pingcap/tiflow/dm/syncer/binlogstream/streamer_controller.go:337
github.com/pingcap/tiflow/dm/syncer.(*Syncer).Run
	github.com/pingcap/tiflow/dm/syncer/syncer.go:2153
github.com/pingcap/tiflow/dm/syncer.(*Syncer).Process
	github.com/pingcap/tiflow/dm/syncer/syncer.go:757
runtime.goexit
	runtime/asm_amd64.s:1693

Versions of the cluster

DM version (run dmctl -V or dm-worker -V or dm-master -V):

Release Version: v9.0.0-beta.2.pre-43-gd105a2b4b
Git Commit Hash: d105a2b4b780a1d4b6dd9bbf6f43bd7af081efdf
Git Branch: master
UTC Build Time: 2025-11-27 07:50:16
Go Version: go1.25.4 X:nodwarf5
Failpoint Build: false

Upstream MySQL/MariaDB server version:

11.8.5

Downstream TiDB cluster version (execute SELECT tidb_version(); in a MySQL client):

v8.5.3

How did you deploy DM: tiup or manually?

(leave TiUP or manually here)

Other interesting information (system version, hardware config, etc):

>
>

current status of DM cluster (execute query-status <task-name> in dmctl)

» query-status mariadb-to-tidb
{
    "result": true,
    "msg": "",
    "sources": [
        {
            "result": true,
            "msg": "",
            "sourceStatus": {
                "source": "mariadb-01",
                "worker": "127.0.0.1:8262",
                "result": null,
                "relayStatus": null
            },
            "subTaskStatus": [
                {
                    "name": "mariadb-to-tidb",
                    "stage": "Running",
                    "unit": "Sync",
                    "result": null,
                    "unresolvedDDLLockID": "",
                    "sync": {
                        "totalEvents": "8",
                        "totalTps": "0",
                        "recentTps": "0",
                        "masterBinlog": "(mariadb-log.000002, 610)",
                        "masterBinlogGtid": "0-1-2",
                        "syncerBinlog": "(mariadb-log.000002, 610)",
                        "syncerBinlogGtid": "",
                        "blockingDDLs": [
                        ],
                        "unresolvedGroups": [
                        ],
                        "synced": true,
                        "binlogType": "remote",
                        "secondsBehindMaster": "0",
                        "blockDDLOwner": "",
                        "conflictMsg": "",
                        "totalRows": "8",
                        "totalRps": "0",
                        "recentRps": "0",
                        "ioTotalBytes": "1375503",
                        "dumpIOTotalBytes": "2621"
                    },
                    "validation": null
                }
            ]
        }
    ]
}

dveeden avatar Nov 27 '25 08:11 dveeden

/label mariadb

dveeden avatar Nov 27 '25 08:11 dveeden

mysql-11.8.5-MariaDB-ubu2404-log [test]> show binlog events in 'mariadb-log.000002';
+--------------------+------+-------------------+-----------+-------------+-------------------------------------------------------+
| Log_name           | Pos  | Event_type        | Server_id | End_log_pos | Info                                                  |
+--------------------+------+-------------------+-----------+-------------+-------------------------------------------------------+
| mariadb-log.000002 |    4 | Format_desc       |         1 |         256 | Server ver: 11.8.5-MariaDB-ubu2404-log, Binlog ver: 4 |
| mariadb-log.000002 |  256 | Gtid_list         |         1 |         285 | []                                                    |
| mariadb-log.000002 |  285 | Binlog_checkpoint |         1 |         330 | mariadb-log.000002                                    |
| mariadb-log.000002 |  330 | Gtid              |         1 |         372 | GTID 0-1-1                                            |
| mariadb-log.000002 |  372 | Query             |         1 |         457 | create schema test                                    |
| mariadb-log.000002 |  457 | Gtid              |         1 |         499 | GTID 0-1-2                                            |
| mariadb-log.000002 |  499 | Query             |         1 |         610 | use `test`; create table t (id int primary key)       |
| mariadb-log.000002 |  610 | Gtid              |         1 |         652 | BEGIN GTID 0-1-3                                      |
| mariadb-log.000002 |  652 | Annotate_rows     |         1 |         699 | insert into t values (1)                              |
| mariadb-log.000002 |  699 | Table_map         |         1 |         743 | table_id: 232 (test.t)                                |
| mariadb-log.000002 |  743 | Write_rows_v1     |         1 |         781 | table_id: 232 flags: STMT_END_F                       |
| mariadb-log.000002 |  781 | Xid               |         1 |         812 | COMMIT /* xid=1892 */                                 |
| mariadb-log.000002 |  812 | Gtid              |         1 |         854 | GTID 0-1-4                                            |
| mariadb-log.000002 |  854 | Query             |         1 |         970 | use `test`; alter table t add column c1 varchar(255)  |
| mariadb-log.000002 |  970 | Gtid              |         1 |        1012 | BEGIN GTID 0-1-5                                      |
| mariadb-log.000002 | 1012 | Annotate_rows     |         1 |        1061 | update t set c1='test 123'                            |
| mariadb-log.000002 | 1061 | Table_map         |         1 |        1108 | table_id: 233 (test.t)                                |
| mariadb-log.000002 | 1108 | Update_rows_v1    |         1 |        1162 | table_id: 233 flags: STMT_END_F                       |
| mariadb-log.000002 | 1162 | Xid               |         1 |        1193 | COMMIT /* xid=1894 */                                 |
+--------------------+------+-------------------+-----------+-------------+-------------------------------------------------------+
19 rows in set (0.001 sec)

Note that 0-1-5 is the GTID that was mentioned in the DPANIC

dveeden avatar Nov 27 '25 08:11 dveeden

> [Breakpoint 1] github.com/pingcap/tiflow/dm/pkg/gtid.ParserGTID() ./dm/pkg/gtid/gtid.go:35 (hits goroutine(2139):1 total:8) (PC: 0x888ef76)
    30:		return gSet == nil || gSet.Equal(emptyMySQLGTIDSet) || gSet.Equal(emptyMariaDBGTIDSet)
    31:	}
    32:	
    33:	// ParserGTID parses GTID from string. If the flavor is not specified, it will
    34:	// try mysql GTID first and then MariaDB GTID.
=>  35:	func ParserGTID(flavor, gtidStr string) (mysql.GTIDSet, error) {
    36:		var (
    37:			err  error
    38:			gtid mysql.GTIDSet
    39:		)
    40:	
(dlv) p flavor
""
(dlv) p gtidStr
""
(dlv) bt
0  0x000000000888ef76 in github.com/pingcap/tiflow/dm/pkg/gtid.ParserGTID
   at ./dm/pkg/gtid/gtid.go:35
1  0x0000000008b545cc in github.com/pingcap/tiflow/dm/pkg/dumpling.ParseMetaDataByReader
   at ./dm/pkg/dumpling/utils.go:159
2  0x0000000008b53fd6 in github.com/pingcap/tiflow/dm/pkg/dumpling.ParseMetaData
   at ./dm/pkg/dumpling/utils.go:58
3  0x0000000008c34e2b in github.com/pingcap/tiflow/dm/loader.getMydumpMetadata
   at ./dm/loader/util.go:55
4  0x0000000008c323fd in github.com/pingcap/tiflow/dm/loader.(*LightningLoader).Process
   at ./dm/loader/lightning.go:555
5  0x0000000009050162 in github.com/pingcap/tiflow/dm/worker.(*SubTask).run.gowrap2
   at ./dm/worker/subtask.go:244
6  0x000000000280e541 in runtime.goexit
   at /usr/lib/golang/src/runtime/asm_amd64.s:1693

So looks like ParserGTID() sometimes gets called with an empty flavor and gtidStr, which results in a MySQL GTID instead of the MariaDB GTID that we need.

https://github.com/pingcap/tiflow/blob/d105a2b4b780a1d4b6dd9bbf6f43bd7af081efdf/dm/pkg/gtid/gtid.go#L41-L44

dveeden avatar Nov 27 '25 08:11 dveeden

1  0x0000000008b545cc in github.com/pingcap/tiflow/dm/pkg/dumpling.ParseMetaDataByReader
   at ./dm/pkg/dumpling/utils.go:159

Looks like this parses the following file (dumped_data.mariadb-to-tidb/metadata):

Started dump at: 2025-11-27 09:27:28
SHOW BINARY LOG STATUS:
	Log: mariadb-log.000002
	Pos: 330
	GTID:

Finished dump at: 2025-11-27 09:27:28

Note that this does not include a GTID.

mysql-11.8.5-MariaDB-ubu2404-log [test]> SHOW MASTER STATUS;
+--------------------+----------+--------------+------------------+
| File               | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+--------------------+----------+--------------+------------------+
| mariadb-log.000002 |     1567 |              |                  |
+--------------------+----------+--------------+------------------+
1 row in set (0.000 sec)

mysql-11.8.5-MariaDB-ubu2404-log [test]> SHOW GLOBAL VARIABLES LIKE 'gtid\_%\_pos';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| gtid_binlog_pos  | 0-1-7 |
| gtid_current_pos | 0-1-7 |
| gtid_slave_pos   |       |
+------------------+-------+
3 rows in set (0.002 sec)

The SHOW MASTER STATUS for MariaDB does not include the GTID like it does for MySQL. There are however variables that provide this info.

dveeden avatar Nov 27 '25 08:11 dveeden

Some possible solutions

  • Update the dumpling metadata to include the MariaDB GTID
  • Update the dumpling metadata to include the database flavor (e.g. include the version string (11.8.5-MariaDB-ubu2404-log))
  • Get the right flavor elsewhere (e.g. target instance Version)

Disabling gtid in the source config might be a workaround, but I haven't tried this yet.

dveeden avatar Nov 27 '25 09:11 dveeden

Looks like dumpling is supposed to run SELECT @@global.gtid_binlog_pos...

dveeden avatar Nov 27 '25 09:11 dveeden

Running dumpling directly does give me the GTID:

dvaneeden@dve-carbon:~/dev/pingcap/tidb$ ./bin/dumpling -o /tmp/d1 -h 127.0.0.1 -P 3306 -u root
Release version: v9.0.0-beta.2.pre-817-g594203d587
Git commit hash: 594203d587bf7069d6d551c55e270391cca78083
Git branch:      master
Build timestamp: 2025-11-27 09:05:54Z
Go version:      go version go1.25.4 X:nodwarf5 linux/amd64

[2025/11/27 10:13:10.307 +01:00] [INFO] [versions.go:54] ["Welcome to dumpling"] ["Release Version"=v9.0.0-beta.2.pre-817-g594203d587] ["Git Commit Hash"=594203d587bf7069d6d551c55e270391cca78083] ["Git Branch"=master] ["Build timestamp"="2025-11-27 09:05:54"] ["Go Version"="go version go1.25.4 X:nodwarf5 linux/amd64"]
[2025/11/27 10:13:10.312 +01:00] [WARN] [version.go:334] ["select tidb_version() failed, will fallback to 'select version();'"] [error="Error 1305 (42000): FUNCTION tidb_version does not exist"]
[2025/11/27 10:13:10.312 +01:00] [INFO] [version.go:441] ["detect server version"] [type=MariaDB] [version=11.8.5-MariaDB-ubu2404-log]
[2025/11/27 10:13:10.325 +01:00] [INFO] [dump.go:153] ["begin to run Dump"] [conf="{\"s3\":{\"endpoint\":\"\",\"region\":\"\",\"storage-class\":\"\",\"sse\":\"\",\"sse-kms-key-id\":\"\",\"acl\":\"\",\"access-key\":\"\",\"secret-access-key\":\"\",\"session-token\":\"\",\"provider\":\"\",\"force-path-style\":true,\"use-accelerate-endpoint\":false,\"role-arn\":\"\",\"external-id\":\"\",\"profile\":\"\",\"object-lock-enabled\":false},\"gcs\":{\"endpoint\":\"\",\"storage-class\":\"\",\"predefined-acl\":\"\",\"credentials-file\":\"\"},\"azblob\":{\"endpoint\":\"\",\"account-name\":\"\",\"account-key\":\"\",\"access-tier\":\"\",\"sas-token\":\"\",\"encryption-scope\":\"\",\"encryption-key\":\"\"},\"SpecifiedTables\":false,\"AllowCleartextPasswords\":false,\"SortByPk\":true,\"NoViews\":true,\"NoSequences\":true,\"NoHeader\":false,\"NoSchemas\":false,\"NoData\":false,\"CompleteInsert\":false,\"TransactionalConsistency\":true,\"EscapeBackslash\":true,\"DumpEmptyDatabase\":true,\"PosAfterConnect\":false,\"CompressType\":0,\"Host\":\"127.0.0.1\",\"Port\":3306,\"Threads\":4,\"User\":\"root\",\"Security\":{\"CAPath\":\"\",\"CertPath\":\"\",\"KeyPath\":\"\"},\"LogLevel\":\"info\",\"LogFile\":\"\",\"LogFormat\":\"text\",\"OutputDirPath\":\"/tmp/d1\",\"StatusAddr\":\":8281\",\"Snapshot\":\"\",\"Consistency\":\"flush\",\"CsvNullValue\":\"\\\\N\",\"SQL\":\"\",\"CsvSeparator\":\",\",\"CsvDelimiter\":\"\\\"\",\"CsvLineTerminator\":\"\\r\\n\",\"Databases\":[],\"Where\":\"\",\"FileType\":\"sql\",\"ServerInfo\":{\"ServerType\":2,\"ServerVersion\":\"11.8.5-MariaDB-ubu2404-log\",\"HasTiKV\":false},\"Rows\":0,\"ReadTimeout\":900000000000,\"TiDBMemQuotaQuery\":0,\"FileSize\":0,\"StatementSize\":1000000,\"SessionParams\":{},\"Tables\":{},\"CollationCompatible\":\"loose\",\"CsvOutputDialect\":0,\"IOTotalBytes\":null,\"Net\":\"\"}"]
[2025/11/27 10:13:10.385 +01:00] [INFO] [dump.go:277] ["All the dumping transactions have started. Start to unlock tables"]
[2025/11/27 10:13:10.388 +01:00] [INFO] [collector.go:264] ["backup success summary"] [total-ranges=3] [ranges-succeed=3] [ranges-failed=0] [total-take=2.848976ms] [total-kv-size=115B] [average-speed=40.37kB/s] [total-rows=1]
[2025/11/27 10:13:10.388 +01:00] [INFO] [main.go:82] ["dump data successfully, dumpling will exit now"]
dvaneeden@dve-carbon:~/dev/pingcap/tidb$ cat /tmp/d1/metadata
Started dump at: 2025-11-27 10:13:10
SHOW BINARY LOG STATUS:
	Log: mariadb-log.000002
	Pos: 1567
	GTID:0-1-7

Finished dump at: 2025-11-27 10:13:10

dveeden avatar Nov 27 '25 09:11 dveeden

A fresh MariaDB instance actually has this:

mysql-11.8.5-MariaDB-ubu2404-log [(none)]> SELECT @@global.gtid_binlog_pos;
+--------------------------+
| @@global.gtid_binlog_pos |
+--------------------------+
|                          |
+--------------------------+
1 row in set (0.001 sec)

So if the dumpling dump is done before any other session has written anything then it won't have a GTID. Then this causes the GTID type in DM to be incorrectly set to a MySQL GTID.

I guess the easy workaround is to first run FLUSH PRIVILEGES or anything else that creates a GTID before running DM.

dveeden avatar Nov 27 '25 09:11 dveeden

Yes, first running FLUSH PRIVILEGES before starting the DM task works, then there is no DPANIC.

dveeden avatar Nov 27 '25 09:11 dveeden

As in https://github.com/pingcap/tidb/pull/64718#discussion_r2569278826, I think it's better to let DM check and fill the flavor of source database before run dump unit. @GMHDBJD PTAL

lance6716 avatar Nov 27 '25 15:11 lance6716

/severity moderate

dveeden avatar Nov 28 '25 10:11 dveeden