dtle
dtle copied to clipboard
mysql-mysql job full repl stage: big data cause the job to be stuck
Description
mysql-mysql job full repl stage: big data cause the job to be stuck
Steps to reproduce the issue
- src MySQL insert 2GB data
drop table if exists big_tx;
create table if not exists big_tx (id int primary key auto_increment, val longtext);
set @a = repeat('a', 1*1024*1024);
begin;
insert into big_tx values (0, @a);
insert into big_tx values (0, @a);
insert into big_tx values (0, @a);
insert into big_tx values (0, @a);
-- 2048 sqls, 2GB
commit;
- create dtle job, set chunk_size=10000
- after long time, check on dest MySQL
mysql> select count(*) from big_tx;
+----------+
| count(*) |
+----------+
| 0 |
+----------+
1 row in set (0.00 sec)
- check src dtle log
022-03-24T17:07:35.938+0800 [DEBUG] client.driver_mgr.dtle: ShowCreateSchema: driver=dtle query="SHOW CREATE SCHEMA IF NOT EXISTS `action_db_1`" timestamp=2022-03-24T17:07:35.938+0800
2022-03-24T17:07:35.939+0800 [DEBUG] client.driver_mgr.dtle: ShowTables: driver=dtle query="SHOW TABLES IN `action_db_1`" timestamp=2022-03-24T17:07:35.938+0800
2022-03-24T17:07:35.939+0800 [INFO] client.driver_mgr.dtle: ValidateOriginalTable: driver=dtle @module=inspector job=big_tx-migration where=true timestamp=2022-03-24T17:07:35.939+0800
2022-03-24T17:07:35.944+0800 [DEBUG] client.driver_mgr.dtle: Potential unique keys.: driver=dtle @module=inspector job=big_tx-migration schema=action_db_1 table=big_tx uniqueKeys=["map[Columns:map[Columns:[map[Charset: ColumnType: Default:<nil> EscapedName:`id` IsUnsigned:false Key: Nullable:false Precision:0 RawName:id Scale:0 TimezoneConversion:<nil> Type:0]] Ordinals:map[id:0] UniqueKeys:<nil>] HasNullable:false IsAutoIncrement:true LastMaxVals:[] Name:PRIMARY]"] timestamp=2022-03-24T17:07:35.943+0800
2022-03-24T17:07:35.944+0800 [DEBUG] client.driver_mgr.dtle: table has unique keys: driver=dtle schema=action_db_1 table=big_tx @module=inspector job=big_tx-migration n_unique_keys=1 timestamp=2022-03-24T17:07:35.944+0800
2022-03-24T17:07:35.944+0800 [DEBUG] client.driver_mgr.dtle: a unique key: driver=dtle @module=inspector job=big_tx-migration uk="PRIMARY (auto_increment): [id]; has nullable: false" timestamp=2022-03-24T17:07:35.944+0800
2022-03-24T17:07:35.945+0800 [INFO] client.driver_mgr.dtle: chosen unique key: driver=dtle @module=inspector job=big_tx-migration schema=action_db_1 table=big_tx uk="PRIMARY (auto_increment): [id]; has nullable: false" timestamp=2022-03-24T17:07:35.945+0800
2022-03-24T17:07:35.945+0800 [INFO] client.driver_mgr.dtle: Examining table structure on extractor: driver=dtle job=big_tx-migration @module=dtle.extractor timestamp=2022-03-24T17:07:35.945+0800
2022-03-24T17:07:35.945+0800 [DEBUG] client.driver_mgr.dtle: getSchemaTablesAndMeta. after.: driver=dtle @module=dtle.extractor job=big_tx-migration timestamp=2022-03-24T17:07:35.945+0800
2022-03-24T17:07:35.946+0800 [INFO] client.driver_mgr.dtle: generating DROP and CREATE statements to reflect current database schemas: driver=dtle @module=dtle.extractor @warn="logging contained values that don't serialize to json" timestamp=2022-03-24T17:07:35.945+0800
2022-03-24T17:07:35.946+0800 [DEBUG] client.driver_mgr.dtle: publish. msg: driver=dtle gno=0 job=big_tx-migration msgLen=124 nSeg=1 spanLen=124 @module=dtle.extractor subject=big_tx-migration_full timestamp=2022-03-24T17:07:35.946+0800
2022-03-24T17:07:35.946+0800 [DEBUG] client.driver_mgr.dtle: publish: driver=dtle gno=0 iSeg=0 job=big_tx-migration partLen=124 subject=big_tx-migration_full @module=dtle.extractor timestamp=2022-03-24T17:07:35.946+0800
2022-03-24T17:07:35.947+0800 [DEBUG] client.driver_mgr.dtle: Estimated number of rows: driver=dtle @module=dtle.extractor job=big_tx-migration n=2042 schema=action_db_1 table=big_tx timestamp=2022-03-24T17:07:35.947+0800
2022-03-24T17:07:35.947+0800 [DEBUG] client.driver_mgr.dtle: publish. msg: driver=dtle gno=0 job=big_tx-migration msgLen=206 spanLen=206 @module=dtle.extractor nSeg=1 subject=big_tx-migration_full timestamp=2022-03-24T17:07:35.947+0800
2022-03-24T17:07:35.948+0800 [DEBUG] client.driver_mgr.dtle: publish: driver=dtle iSeg=0 partLen=206 subject=big_tx-migration_full gno=0 job=big_tx-migration @module=dtle.extractor timestamp=2022-03-24T17:07:35.947+0800
2022-03-24T17:07:35.948+0800 [INFO] client.driver_mgr.dtle: Step: scanning contents of x tables: driver=dtle @module=dtle.extractor job=big_tx-migration n=3 x=1 timestamp=2022-03-24T17:07:35.948+0800
2022-03-24T17:07:35.948+0800 [INFO] client.driver_mgr.dtle: Step n: - scanning table (i of N tables): driver=dtle @module=dtle.extractor N=1 job=big_tx-migration schema=action_db_1 i=1 n=3 table=big_tx timestamp=2022-03-24T17:07:35.948+0800
2022-03-24T17:07:35.948+0800 [DEBUG] client.driver_mgr.dtle: getChunkData.: driver=dtle @module=dumper job=big_tx-migration query="SELECT * FROM `action_db_1`.`big_tx` where (true) and (true) order by `id` asc LIMIT 100000" timestamp=2022-03-24T17:07:35.948+0800
2022-03-24T17:07:36.107+0800 [DEBUG] client: updated allocations: index=21 total=1 pulled=0 filtered=1
2022-03-24T17:07:36.108+0800 [DEBUG] client: allocation updates: added=0 removed=0 updated=0 ignored=1
2022-03-24T17:07:36.108+0800 [DEBUG] client: allocation updates applied: added=0 removed=0 updated=0 ignored=1 errors=0
2022-03-24T17:07:39.972+0800 [DEBUG] client.driver_mgr.dtle: getChunkData.: driver=dtle @module=dumper job=big_tx-migration n_row=2049 timestamp=2022-03-24T17:07:39.972+0800
2022-03-24T17:07:39.979+0800 [DEBUG] client.driver_mgr.dtle: GetLastMaxVal: driver=dtle @module=dumper job=big_tx-migration val=['2049'] timestamp=2022-03-24T17:07:39.978+0800
2022-03-24T17:07:39.979+0800 [DEBUG] client.driver_mgr.dtle: resultsChannel: driver=dtle @module=dumper job=big_tx-migration n=0 timestamp=2022-03-24T17:07:39.979+0800
2022-03-24T17:07:39.979+0800 [INFO] client.driver_mgr.dtle: nRows < d.chunkSize.: driver=dtle nRows=2049 @module=dumper chunkSize=100000 job=big_tx-migration timestamp=2022-03-24T17:07:39.979+0800
2022-03-24T17:07:39.979+0800 [DEBUG] client.driver_mgr.dtle: getChunkData.: driver=dtle @module=dumper job=big_tx-migration query="SELECT * FROM `action_db_1`.`big_tx` where (((`id` > '2049'))) and (true) order by `id` asc LIMIT 100000" timestamp=2022-03-24T17:07:39.979+0800
2022-03-24T17:07:39.980+0800 [DEBUG] client.driver_mgr.dtle: getChunkData.: driver=dtle @module=dumper job=big_tx-migration n_row=0 timestamp=2022-03-24T17:07:39.980+0800
2022-03-24T17:07:39.980+0800 [INFO] client.driver_mgr.dtle: nRows < d.chunkSize.: driver=dtle @module=dumper chunkSize=100000 job=big_tx-migration nRows=0 timestamp=2022-03-24T17:07:39.980+0800
2022-03-24T17:07:39.980+0800 [INFO] client.driver_mgr.dtle: nRows == 0. dump finished.: driver=dtle @module=dumper chunkSize=100000 job=big_tx-migration nRows=0 timestamp=2022-03-24T17:07:39.980+0800
2022-03-24T17:07:45.961+0800 [DEBUG] client: updated allocations: index=23 total=1 pulled=0 filtered=1
2022-03-24T17:07:45.962+0800 [DEBUG] client: allocation updates: added=0 removed=0 updated=0 ignored=1
2022-03-24T17:07:45.962+0800 [DEBUG] client: allocation updates applied: added=0 removed=0 updated=0 ignored=1 errors=0
2022-03-24T17:07:50.906+0800 [DEBUG] client.driver_mgr.dtle: TimestampContext.GetDelay: driver=dtle @module=dtle.extractor delay=0 job=big_tx-migration timestamp=2022-03-24T17:07:50.906+0800
2022-03-24T17:08:08.633+0800 [DEBUG] client.driver_mgr.dtle: TimestampContext.GetDelay: driver=dtle @module=dtle.extractor delay=0 job=big_tx-migration timestamp=2022-03-24T17:08:08.632+0800
2022-03-24T17:08:24.675+0800 [DEBUG] client.driver_mgr.dtle: TimestampContext.GetDelay: driver=dtle @module=dtle.extractor delay=0 job=big_tx-migration timestamp=2022-03-24T17:08:24.674+0800
- check dest dtle log
2022-03-24T17:07:36.032+0800 [DEBUG] client.driver_mgr.dtle: ApplyEventQueries. exec: driver=dtle job=big_tx-migration query="CREATE DATABASE /*!32312 IF NOT EXISTS*/ `action_db_1` /*!40100 DEFAULT CHARACTER SET latin1 */" @module=dtle.applier timestamp=2022-03-24T17:07:36.032+0800
2022-03-24T17:07:36.033+0800 [DEBUG] client.driver_mgr.dtle: ApplyEventQueries. after: driver=dtle nDumpEntry=1 @module=dtle.applier job=big_tx-migration timestamp=2022-03-24T17:07:36.033+0800
2022-03-24T17:07:36.033+0800 [DEBUG] client.driver_mgr.dtle: ApplyEventQueries: driver=dtle job=big_tx-migration rows=0 schema= table= @module=dtle.applier timestamp=2022-03-24T17:07:36.033+0800
2022-03-24T17:07:36.033+0800 [DEBUG] client.driver_mgr.dtle: ApplyEventQueries. exec: driver=dtle @module=dtle.applier job=big_tx-migration query="DROP TABLE IF EXISTS `action_db_1`.`big_tx`" timestamp=2022-03-24T17:07:36.033+0800
2022-03-24T17:07:36.034+0800 [DEBUG] client.driver_mgr.dtle: ApplyEventQueries. exec: driver=dtle @module=dtle.applier job=big_tx-migration query="CREATE TABLE `action_db_1`.`big_tx` (`id` INT(20) NOT NULL AUTO_INCREMENT,`val` LONGTEXT,PRIMARY KEY(`id`)) ENGINE = InnoDB AUTO_INCREMENT = 2050 DEFAULT CHARACTER SET = UTF8MB4" timestamp=2022-03-24T17:07:36.034+0800
2022-03-24T17:07:36.037+0800 [DEBUG] client.driver_mgr.dtle: ApplyEventQueries. after: driver=dtle @module=dtle.applier job=big_tx-migration nDumpEntry=0 timestamp=2022-03-24T17:07:36.037+0800
2022-03-24T17:07:36.196+0800 [DEBUG] client: updated allocations: index=22 total=1 pulled=0 filtered=1
2022-03-24T17:07:36.196+0800 [DEBUG] client: allocation updates: added=0 removed=0 updated=0 ignored=1
2022-03-24T17:07:36.196+0800 [DEBUG] client: allocation updates applied: added=0 removed=0 updated=0 ignored=1 errors=0
2022-03-24T17:07:46.031+0800 [DEBUG] client.driver_mgr.dtle: no binlogEntry for 10s: driver=dtle @module=dtle.applier job=big_tx-migration timestamp=2022-03-24T17:07:46.031+0800
2022-03-24T17:07:46.049+0800 [DEBUG] client: updated allocations: index=24 total=1 pulled=0 filtered=1
2022-03-24T17:07:46.049+0800 [DEBUG] client: allocation updates: added=0 removed=0 updated=0 ignored=1
2022-03-24T17:07:46.049+0800 [DEBUG] client: allocation updates applied: added=0 removed=0 updated=0 ignored=1 errors=0
2022-03-24T17:07:50.904+0800 [DEBUG] client.driver_mgr.dtle: Stats: driver=dtle @module=dtle.applier job=big_tx-migration timestamp=2022-03-24T17:07:50.903+0800
2022-03-24T17:07:50.904+0800 [DEBUG] client.driver_mgr.dtle: TimestampContext.GetDelay: driver=dtle delay=0 job=big_tx-migration @module=dtle.applier timestamp=2022-03-24T17:07:50.903+0800
2022-03-24T17:07:50.953+0800 [DEBUG] client.driver_mgr.dtle: SaveGtidForJob: driver=dtle @module=dtle.applier gtid= job=big_tx-migration timestamp=2022-03-24T17:07:50.953+0800
2022-03-24T17:07:55.953+0800 [DEBUG] client.driver_mgr.dtle: no copyRows for 10s.: driver=dtle @module=dtle.applier job=big_tx-migration timestamp=2022-03-24T17:07:55.952+0800
2022-03-24T17:07:56.031+0800 [DEBUG] client.driver_mgr.dtle: no binlogEntry for 10s: driver=dtle @module=dtle.applier job=big_tx-migration timestamp=2022-03-24T17:07:56.031+0800
2
Output of ./dtle version
:**
9.9.9.9-master-7ca85ee
/debug/pprof/goroutine?debug=1
goroutine profile: total 64
8 @ 0x43d0c5 0x470a18 0x4709ee 0x48c839 0x179a5d6 0x474801
# 0x4709ed sync.runtime_notifyListWait+0xcd /opt/go/src/runtime/sema.go:513
# 0x48c838 sync.(*Cond).Wait+0x98 /opt/go/src/sync/cond.go:56
# 0x179a5d5 github.com/nats-io/nats%2ego.(*Conn).waitForMsgs+0x635 /universe/src/github.com/actiontech/dtle/vendor/github.com/nats-io/nats.go/nats.go:2626
4 @ 0x43d0c5 0x40854f 0x4081cb 0x179bacf 0x474801
# 0x179bace github.com/nats-io/nats%2ego.(*Conn).flusher+0xee /universe/src/github.com/actiontech/dtle/vendor/github.com/nats-io/nats.go/nats.go:2978
4 @ 0x43d0c5 0x435331 0x46e775 0x4e7645 0x4e8735 0x4e8717 0x651eaf 0x666c31 0x150a0cc 0x170fa65 0x474801
# 0x46e774 internal/poll.runtime_pollWait+0x54 /opt/go/src/runtime/netpoll.go:227
# 0x4e7644 internal/poll.(*pollDesc).wait+0x44 /opt/go/src/internal/poll/fd_poll_runtime.go:87
# 0x4e8734 internal/poll.(*pollDesc).waitRead+0x1d4 /opt/go/src/internal/poll/fd_poll_runtime.go:92
# 0x4e8716 internal/poll.(*FD).Read+0x1b6 /opt/go/src/internal/poll/fd_unix.go:166
# 0x651eae net.(*netFD).Read+0x4e /opt/go/src/net/fd_posix.go:55
# 0x666c30 net.(*conn).Read+0x90 /opt/go/src/net/net.go:183
# 0x150a0cb github.com/nats-io/nats-server/v2/server.(*client).readLoop+0x110b /universe/src/github.com/actiontech/dtle/vendor/github.com/nats-io/nats-server/v2/server/client.go:1187
# 0x170fa64 github.com/nats-io/nats-server/v2/server.(*Server).createClient.func1+0x44 /universe/src/github.com/actiontech/dtle/vendor/github.com/nats-io/nats-server/v2/server/server.go:2568
4 @ 0x43d0c5 0x435331 0x46e775 0x4e7645 0x4e8735 0x4e8717 0x651eaf 0x666c31 0x1795398 0x1799e48 0x474801
# 0x46e774 internal/poll.runtime_pollWait+0x54 /opt/go/src/runtime/netpoll.go:227
# 0x4e7644 internal/poll.(*pollDesc).wait+0x44 /opt/go/src/internal/poll/fd_poll_runtime.go:87
# 0x4e8734 internal/poll.(*pollDesc).waitRead+0x1d4 /opt/go/src/internal/poll/fd_poll_runtime.go:92
# 0x4e8716 internal/poll.(*FD).Read+0x1b6 /opt/go/src/internal/poll/fd_unix.go:166
# 0x651eae net.(*netFD).Read+0x4e /opt/go/src/net/fd_posix.go:55
# 0x666c30 net.(*conn).Read+0x90 /opt/go/src/net/net.go:183
# 0x1795397 github.com/nats-io/nats%2ego.(*natsReader).Read+0xb7 /universe/src/github.com/actiontech/dtle/vendor/github.com/nats-io/nats.go/nats.go:1649
# 0x1799e47 github.com/nats-io/nats%2ego.(*Conn).readLoop+0xc7 /universe/src/github.com/actiontech/dtle/vendor/github.com/nats-io/nats.go/nats.go:2584
4 @ 0x43d0c5 0x470a18 0x4709ee 0x48c839 0x1508afe 0x170faaa 0x474801
# 0x4709ed sync.runtime_notifyListWait+0xcd /opt/go/src/runtime/sema.go:513
# 0x48c838 sync.(*Cond).Wait+0x98 /opt/go/src/sync/cond.go:56
# 0x1508afd github.com/nats-io/nats-server/v2/server.(*client).writeLoop+0x23d /universe/src/github.com/actiontech/dtle/vendor/github.com/nats-io/nats-server/v2/server/client.go:1061
# 0x170faa9 github.com/nats-io/nats-server/v2/server.(*Server).createClient.func2+0x29 /universe/src/github.com/actiontech/dtle/vendor/github.com/nats-io/nats-server/v2/server/server.go:2571
4 @ 0x43d0c5 0x470a18 0x4709ee 0x48c839 0x1799ab5 0x474801
# 0x4709ed sync.runtime_notifyListWait+0xcd /opt/go/src/runtime/sema.go:513
# 0x48c838 sync.(*Cond).Wait+0x98 /opt/go/src/sync/cond.go:56
# 0x1799ab4 github.com/nats-io/nats%2ego.(*asyncCallbacksHandler).asyncCBDispatcher+0x34 /universe/src/github.com/actiontech/dtle/vendor/github.com/nats-io/nats.go/nats.go:2509
3 @ 0x43d0c5 0x435331 0x46e775 0x4e7645 0x4e8735 0x4e8717 0x651eaf 0x666c31 0x773819 0x544ce8 0x545a5d 0x545cb4 0x705c96 0x76da4a 0x76da4b 0x774cbd 0x779945 0x474801
# 0x46e774 internal/poll.runtime_pollWait+0x54 /opt/go/src/runtime/netpoll.go:227
# 0x4e7644 internal/poll.(*pollDesc).wait+0x44 /opt/go/src/internal/poll/fd_poll_runtime.go:87
# 0x4e8734 internal/poll.(*pollDesc).waitRead+0x1d4 /opt/go/src/internal/poll/fd_poll_runtime.go:92
# 0x4e8716 internal/poll.(*FD).Read+0x1b6 /opt/go/src/internal/poll/fd_unix.go:166
# 0x651eae net.(*netFD).Read+0x4e /opt/go/src/net/fd_posix.go:55
# 0x666c30 net.(*conn).Read+0x90 /opt/go/src/net/net.go:183
# 0x773818 net/http.(*connReader).Read+0x1b8 /opt/go/src/net/http/server.go:780
# 0x544ce7 bufio.(*Reader).fill+0x107 /opt/go/src/bufio/bufio.go:101
# 0x545a5c bufio.(*Reader).ReadSlice+0x3c /opt/go/src/bufio/bufio.go:360
# 0x545cb3 bufio.(*Reader).ReadLine+0x33 /opt/go/src/bufio/bufio.go:389
# 0x705c95 net/textproto.(*Reader).readLineSlice+0xd5 /opt/go/src/net/textproto/reader.go:57
# 0x76da49 net/textproto.(*Reader).ReadLine+0xa9 /opt/go/src/net/textproto/reader.go:38
# 0x76da4a net/http.readRequest+0xaa /opt/go/src/net/http/request.go:1027
# 0x774cbc net/http.(*conn).readRequest+0x19c /opt/go/src/net/http/server.go:966
# 0x779944 net/http.(*conn).serve+0x704 /opt/go/src/net/http/server.go:1858
2 @ 0x43d0c5 0x435331 0x46e775 0x4e7645 0x4e8735 0x4e8717 0x4f1cb7 0x4f1c87 0x545382 0x1372f65 0x474801
# 0x46e774 internal/poll.runtime_pollWait+0x54 /opt/go/src/runtime/netpoll.go:227
# 0x4e7644 internal/poll.(*pollDesc).wait+0x44 /opt/go/src/internal/poll/fd_poll_runtime.go:87
# 0x4e8734 internal/poll.(*pollDesc).waitRead+0x1d4 /opt/go/src/internal/poll/fd_poll_runtime.go:92
# 0x4e8716 internal/poll.(*FD).Read+0x1b6 /opt/go/src/internal/poll/fd_unix.go:166
# 0x4f1cb6 os.(*File).read+0x76 /opt/go/src/os/file_posix.go:31
# 0x4f1c86 os.(*File).Read+0x46 /opt/go/src/os/file.go:117
# 0x545381 bufio.(*Reader).Read+0x221 /opt/go/src/bufio/bufio.go:227
# 0x1372f64 github.com/hashicorp/go-plugin.copyChan+0xc4 /universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/go-plugin/grpc_stdio.go:181
1 @ 0x40e414 0x470f85 0x7c73e5 0x474801
# 0x470f84 os/signal.signal_recv+0xa4 /opt/go/src/runtime/sigqueue.go:168
# 0x7c73e4 os/signal.loop+0x24 /opt/go/src/os/signal/signal_unix.go:23
1 @ 0x43d0c5 0x40854f 0x40818b 0x1378d29 0x474801
# 0x1378d28 github.com/hashicorp/go-plugin.Serve.func3+0x88 /universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/go-plugin/server.go:433
1 @ 0x43d0c5 0x40854f 0x40818b 0x1478ed8 0x13f6696 0xbd1658 0xbd2d28 0xbe0deb 0x474801
# 0x1478ed7 github.com/hashicorp/nomad/plugins/drivers.(*driverPluginServer).TaskEvents+0x97 /universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/nomad/plugins/drivers/server.go:358
# 0x13f6695 github.com/hashicorp/nomad/plugins/drivers/proto._Driver_TaskEvents_Handler+0x115 /universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/nomad/plugins/drivers/proto/driver.pb.go:4530
# 0xbd1657 google.golang.org/grpc.(*Server).processStreamingRPC+0xcd7 /universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/server.go:1329
# 0xbd2d27 google.golang.org/grpc.(*Server).handleStream+0xc67 /universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/server.go:1409
# 0xbe0dea google.golang.org/grpc.(*Server).serveStreams.func1.1+0xaa /universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/server.go:746
1 @ 0x43d0c5 0x40854f 0x40818b 0x855dc9 0x474801
# 0x855dc8 github.com/actiontech/dtle/g.DumpLoop+0xa8 /universe/src/github.com/actiontech/dtle/g/dump.go:21
1 @ 0x43d0c5 0x40854f 0x40818b 0x856c2f 0x474801
# 0x856c2e github.com/actiontech/dtle/g.FreeMemoryWorker+0x4e /universe/src/github.com/actiontech/dtle/g/g.go:93
1 @ 0x43d0c5 0x40854f 0x40818b 0x85706b 0x474801
# 0x85706a github.com/actiontech/dtle/g.MemoryMonitor+0x6a /universe/src/github.com/actiontech/dtle/g/g.go:142
1 @ 0x43d0c5 0x435331 0x46e775 0x4e7645 0x4e8735 0x4e8717 0x651eaf 0x666c31 0x545382 0x4bb1a7 0xb51d69 0xb51d02 0xb525e5 0xb8bc3b 0xbcd282 0xbe0cff 0x474801
# 0x46e774 internal/poll.runtime_pollWait+0x54 /opt/go/src/runtime/netpoll.go:227
# 0x4e7644 internal/poll.(*pollDesc).wait+0x44 /opt/go/src/internal/poll/fd_poll_runtime.go:87
# 0x4e8734 internal/poll.(*pollDesc).waitRead+0x1d4 /opt/go/src/internal/poll/fd_poll_runtime.go:92
# 0x4e8716 internal/poll.(*FD).Read+0x1b6 /opt/go/src/internal/poll/fd_unix.go:166
# 0x651eae net.(*netFD).Read+0x4e /opt/go/src/net/fd_posix.go:55
# 0x666c30 net.(*conn).Read+0x90 /opt/go/src/net/net.go:183
# 0x545381 bufio.(*Reader).Read+0x221 /opt/go/src/bufio/bufio.go:227
# 0x4bb1a6 io.ReadAtLeast+0x86 /opt/go/src/io/io.go:328
# 0xb51d68 io.ReadFull+0x88 /opt/go/src/io/io.go:347
# 0xb51d01 golang.org/x/net/http2.readFrameHeader+0x21 /universe/src/github.com/actiontech/dtle/vendor/golang.org/x/net/http2/frame.go:237
# 0xb525e4 golang.org/x/net/http2.(*Framer).ReadFrame+0xa4 /universe/src/github.com/actiontech/dtle/vendor/golang.org/x/net/http2/frame.go:498
# 0xb8bc3a google.golang.org/grpc/internal/transport.(*http2Server).HandleStreams+0x9a /universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/internal/transport/http2_server.go:453
# 0xbcd281 google.golang.org/grpc.(*Server).serveStreams+0xe1 /universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/server.go:742
# 0xbe0cfe google.golang.org/grpc.(*Server).handleRawConn.func1+0x3e /universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/server.go:703
1 @ 0x43d0c5 0x435331 0x46e775 0x4e7645 0x4e8735 0x4e8717 0x651eaf 0x666c31 0x773298 0x474801
# 0x46e774 internal/poll.runtime_pollWait+0x54 /opt/go/src/runtime/netpoll.go:227
# 0x4e7644 internal/poll.(*pollDesc).wait+0x44 /opt/go/src/internal/poll/fd_poll_runtime.go:87
# 0x4e8734 internal/poll.(*pollDesc).waitRead+0x1d4 /opt/go/src/internal/poll/fd_poll_runtime.go:92
# 0x4e8716 internal/poll.(*FD).Read+0x1b6 /opt/go/src/internal/poll/fd_unix.go:166
# 0x651eae net.(*netFD).Read+0x4e /opt/go/src/net/fd_posix.go:55
# 0x666c30 net.(*conn).Read+0x90 /opt/go/src/net/net.go:183
# 0x773297 net/http.(*connReader).backgroundRead+0x57 /opt/go/src/net/http/server.go:672
1 @ 0x43d0c5 0x435331 0x46e775 0x4e7645 0x4ea4b2 0x4ea494 0x6534e5 0x673172 0x671dc5 0x25d6c8f 0x77ea85 0x25d4853 0x2916c95 0x474801
# 0x46e774 internal/poll.runtime_pollWait+0x54 /opt/go/src/runtime/netpoll.go:227
# 0x4e7644 internal/poll.(*pollDesc).wait+0x44 /opt/go/src/internal/poll/fd_poll_runtime.go:87
# 0x4ea4b1 internal/poll.(*pollDesc).waitRead+0x211 /opt/go/src/internal/poll/fd_poll_runtime.go:92
# 0x4ea493 internal/poll.(*FD).Accept+0x1f3 /opt/go/src/internal/poll/fd_unix.go:401
# 0x6534e4 net.(*netFD).accept+0x44 /opt/go/src/net/fd_unix.go:172
# 0x673171 net.(*TCPListener).accept+0x31 /opt/go/src/net/tcpsock_posix.go:139
# 0x671dc4 net.(*TCPListener).AcceptTCP+0x64 /opt/go/src/net/tcpsock.go:248
# 0x25d6c8e github.com/labstack/echo/v4.tcpKeepAliveListener.Accept+0x2e /universe/src/github.com/actiontech/dtle/vendor/github.com/labstack/echo/v4/echo.go:963
# 0x77ea84 net/http.(*Server).Serve+0x284 /opt/go/src/net/http/server.go:2961
# 0x25d4852 github.com/labstack/echo/v4.(*Echo).Start+0xf2 /universe/src/github.com/actiontech/dtle/vendor/github.com/labstack/echo/v4/echo.go:671
# 0x2916c94 github.com/actiontech/dtle/drivers/api.SetupApiServer.func3+0x74 /universe/src/github.com/actiontech/dtle/drivers/api/route.go:179
1 @ 0x43d0c5 0x435331 0x46e775 0x4e7645 0x4ea4b2 0x4ea494 0x6534e5 0x673172 0x671f65 0x16bc1a2 0x474801
# 0x46e774 internal/poll.runtime_pollWait+0x54 /opt/go/src/runtime/netpoll.go:227
# 0x4e7644 internal/poll.(*pollDesc).wait+0x44 /opt/go/src/internal/poll/fd_poll_runtime.go:87
# 0x4ea4b1 internal/poll.(*pollDesc).waitRead+0x211 /opt/go/src/internal/poll/fd_poll_runtime.go:92
# 0x4ea493 internal/poll.(*FD).Accept+0x1f3 /opt/go/src/internal/poll/fd_unix.go:401
# 0x6534e4 net.(*netFD).accept+0x44 /opt/go/src/net/fd_unix.go:172
# 0x673171 net.(*TCPListener).accept+0x31 /opt/go/src/net/tcpsock_posix.go:139
# 0x671f64 net.(*TCPListener).Accept+0x64 /opt/go/src/net/tcpsock.go:261
# 0x16bc1a1 github.com/nats-io/nats-server/v2/server.(*Server).acceptConnections+0x41 /universe/src/github.com/actiontech/dtle/vendor/github.com/nats-io/nats-server/v2/server/server.go:2088
1 @ 0x43d0c5 0x435331 0x46e775 0x4e7645 0x4ea4b2 0x4ea494 0x6534e5 0x6797f2 0x677a45 0xbcc07f 0x137298e 0x474801
# 0x46e774 internal/poll.runtime_pollWait+0x54 /opt/go/src/runtime/netpoll.go:227
# 0x4e7644 internal/poll.(*pollDesc).wait+0x44 /opt/go/src/internal/poll/fd_poll_runtime.go:87
# 0x4ea4b1 internal/poll.(*pollDesc).waitRead+0x211 /opt/go/src/internal/poll/fd_poll_runtime.go:92
# 0x4ea493 internal/poll.(*FD).Accept+0x1f3 /opt/go/src/internal/poll/fd_unix.go:401
# 0x6534e4 net.(*netFD).accept+0x44 /opt/go/src/net/fd_unix.go:172
# 0x6797f1 net.(*UnixListener).accept+0x31 /opt/go/src/net/unixsock_posix.go:162
# 0x677a44 net.(*UnixListener).Accept+0x64 /opt/go/src/net/unixsock.go:260
# 0xbcc07e google.golang.org/grpc.(*Server).Serve+0x27e /universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/server.go:621
# 0x137298d github.com/hashicorp/go-plugin.(*GRPCServer).Serve+0x6d /universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/go-plugin/grpc_server.go:138
1 @ 0x43d0c5 0x44e2d7 0x1370686 0x137141a 0x474801
# 0x1370685 github.com/hashicorp/go-plugin.(*gRPCBrokerServer).Recv+0x85 /universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/go-plugin/grpc_broker.go:121
# 0x1371419 github.com/hashicorp/go-plugin.(*GRPCBroker).Run+0x79 /universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/go-plugin/grpc_broker.go:411
1 @ 0x43d0c5 0x44e2d7 0x1372d65 0x135def3 0xbd1658 0xbd2d28 0xbe0deb 0x474801
# 0x1372d64 github.com/hashicorp/go-plugin.(*grpcStdioServer).StreamStdio+0x144 /universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/go-plugin/grpc_stdio.go:58
# 0x135def2 github.com/hashicorp/go-plugin/internal/plugin._GRPCStdio_StreamStdio_Handler+0x112 /universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/go-plugin/internal/plugin/grpc_stdio.pb.go:185
# 0xbd1657 google.golang.org/grpc.(*Server).processStreamingRPC+0xcd7 /universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/server.go:1329
# 0xbd2d27 google.golang.org/grpc.(*Server).handleStream+0xc67 /universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/server.go:1409
# 0xbe0dea google.golang.org/grpc.(*Server).serveStreams.func1.1+0xaa /universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/server.go:746
1 @ 0x43d0c5 0x44e2d7 0x13769d4 0x147631f 0x2917f48 0x2917dde 0x2918237 0x43cc96 0x474801
# 0x13769d3 github.com/hashicorp/go-plugin.Serve+0x953 /universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/go-plugin/server.go:468
# 0x147631e github.com/hashicorp/nomad/plugins/drivers.Serve+0x23e /universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/nomad/plugins/drivers/plugin.go:53
# 0x2917f47 github.com/hashicorp/nomad/plugins.serve+0x147 /universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/nomad/plugins/serve.go:47
# 0x2917ddd github.com/hashicorp/nomad/plugins.Serve+0xbd /universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/nomad/plugins/serve.go:26
# 0x2918236 main.main+0x2d6 /universe/src/github.com/actiontech/dtle/cmd/nomad-plugin/main.go:30
# 0x43cc95 runtime.main+0x255 /opt/go/src/runtime/proc.go:225
1 @ 0x43d0c5 0x44e2d7 0x13787cf 0x474801
# 0x13787ce github.com/hashicorp/go-plugin.(*gRPCBrokerServer).StartStream.func1+0x10e /universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/go-plugin/grpc_broker.go:70
1 @ 0x43d0c5 0x44e2d7 0x1476b65 0x13f52d3 0xbd1658 0xbd2d28 0xbe0deb 0x474801
# 0x1476b64 github.com/hashicorp/nomad/plugins/drivers.(*driverPluginServer).Fingerprint+0x124 /universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/nomad/plugins/drivers/server.go:75
# 0x13f52d2 github.com/hashicorp/nomad/plugins/drivers/proto._Driver_Fingerprint_Handler+0x112 /universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/nomad/plugins/drivers/proto/driver.pb.go:4380
# 0xbd1657 google.golang.org/grpc.(*Server).processStreamingRPC+0xcd7 /universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/server.go:1329
# 0xbd2d27 google.golang.org/grpc.(*Server).handleStream+0xc67 /universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/server.go:1409
# 0xbe0dea google.golang.org/grpc.(*Server).serveStreams.func1.1+0xaa /universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/server.go:746
1 @ 0x43d0c5 0x44e2d7 0x148094d 0x474801
# 0x148094c github.com/hashicorp/nomad/drivers/shared/eventer.(*Eventer).eventLoop+0xec /universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/nomad/drivers/shared/eventer/eventer.go:66
1 @ 0x43d0c5 0x44e2d7 0x153f6d7 0x474801
# 0x153f6d6 github.com/nats-io/nats-server/v2/server.(*Server).internalSendLoop+0x116 /universe/src/github.com/actiontech/dtle/vendor/github.com/nats-io/nats-server/v2/server/events.go:319
1 @ 0x43d0c5 0x44e2d7 0x16afec5 0x474801
# 0x16afec4 github.com/nats-io/nats-server/v2/server.(*sendq).internalLoop+0x464 /universe/src/github.com/actiontech/dtle/vendor/github.com/nats-io/nats-server/v2/server/sendq.go:54
1 @ 0x43d0c5 0x44e2d7 0x16c833c 0x474801
# 0x16c833b github.com/nats-io/nats-server/v2/server.(*srvIPQueueLogger).run+0x11b /universe/src/github.com/actiontech/dtle/vendor/github.com/nats-io/nats-server/v2/server/server.go:3645
1 @ 0x43d0c5 0x44e2d7 0x17008c5 0x474801
# 0x17008c4 github.com/nats-io/nats-server/v2/server.(*Server).startGWReplyMapExpiration.func1+0x124 /universe/src/github.com/actiontech/dtle/vendor/github.com/nats-io/nats-server/v2/server/gateway.go:3143
1 @ 0x43d0c5 0x44e2d7 0x1710389 0x474801
# 0x1710388 github.com/nats-io/nats-server/v2/server.(*Server).handleSignals.func1+0xa8 /universe/src/github.com/actiontech/dtle/vendor/github.com/nats-io/nats-server/v2/server/signal.go:48
1 @ 0x43d0c5 0x44e2d7 0x18459ef 0x474801
# 0x18459ee github.com/nats-io/nats-streaming-server/server.(*StanServer).ioLoop+0x74e /universe/src/github.com/actiontech/dtle/vendor/github.com/nats-io/nats-streaming-server/server/server.go:4379
1 @ 0x43d0c5 0x44e2d7 0x184bf85 0x474801
# 0x184bf84 github.com/nats-io/nats-streaming-server/server.(*StanServer).processSubscriptionsStart+0xe4 /universe/src/github.com/actiontech/dtle/vendor/github.com/nats-io/nats-streaming-server/server/server.go:5393
1 @ 0x43d0c5 0x44e2d7 0x242718d 0x474801
# 0x242718c go.opencensus.io/stats/view.(*worker).start+0xcc /universe/src/github.com/actiontech/dtle/vendor/go.opencensus.io/stats/view/worker.go:276
1 @ 0x43d0c5 0x44e2d7 0x27e325d 0x474801
# 0x27e325c github.com/actiontech/dtle/drivers/mysql.(*Driver).handleFingerprint+0x13c /universe/src/github.com/actiontech/dtle/drivers/mysql/driver.go:404
1 @ 0x43d0c5 0x44e2d7 0xb76eff 0xb7785d 0xb9a017 0x474801
# 0xb76efe google.golang.org/grpc/internal/transport.(*controlBuffer).get+0xfe /universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/internal/transport/controlbuf.go:395
# 0xb7785c google.golang.org/grpc/internal/transport.(*loopyWriter).run+0x1dc /universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/internal/transport/controlbuf.go:513
# 0xb9a016 google.golang.org/grpc/internal/transport.newHTTP2Server.func2+0xd6 /universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/internal/transport/http2_server.go:291
1 @ 0x43d0c5 0x44e2d7 0xb90145 0x474801
# 0xb90144 google.golang.org/grpc/internal/transport.(*http2Server).keepalive+0x264 /universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/internal/transport/http2_server.go:966
1 @ 0x43d0c5 0x44e2d7 0xb960c5 0xb95f91 0xb96ff5 0x4bb1a7 0xb96f32 0xb96eef 0xbc8523 0xbc914d 0xbc9a5b 0xbdd856 0x135ce82 0x13701ea 0x135cd4d 0xbd1658 0xbd2d28 0xbe0deb 0x474801
# 0xb960c4 google.golang.org/grpc/internal/transport.(*recvBufferReader).read+0xa4 /universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/internal/transport/transport.go:174
# 0xb95f90 google.golang.org/grpc/internal/transport.(*recvBufferReader).Read+0x210 /universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/internal/transport/transport.go:168
# 0xb96ff4 google.golang.org/grpc/internal/transport.(*transportReader).Read+0x54 /universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/internal/transport/transport.go:479
# 0x4bb1a6 io.ReadAtLeast+0x86 /opt/go/src/io/io.go:328
# 0xb96f31 io.ReadFull+0xd1 /opt/go/src/io/io.go:347
# 0xb96eee google.golang.org/grpc/internal/transport.(*Stream).Read+0x8e /universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/internal/transport/transport.go:463
# 0xbc8522 google.golang.org/grpc.(*parser).recvMsg+0x62 /universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/rpc_util.go:510
# 0xbc914c google.golang.org/grpc.recvAndDecompress+0x4c /universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/rpc_util.go:641
# 0xbc9a5a google.golang.org/grpc.recv+0x9a /universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/rpc_util.go:709
# 0xbdd855 google.golang.org/grpc.(*serverStream).RecvMsg+0x155 /universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/stream.go:1473
# 0x135ce81 github.com/hashicorp/go-plugin/internal/plugin.(*gRPCBrokerStartStreamServer).Recv+0x61 /universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/go-plugin/internal/plugin/grpc_broker.pb.go:167
# 0x13701e9 github.com/hashicorp/go-plugin.(*gRPCBrokerServer).StartStream+0x109 /universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/go-plugin/grpc_broker.go:84
# 0x135cd4c github.com/hashicorp/go-plugin/internal/plugin._GRPCBroker_StartStream_Handler+0xac /universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/go-plugin/internal/plugin/grpc_broker.pb.go:148
# 0xbd1657 google.golang.org/grpc.(*Server).processStreamingRPC+0xcd7 /universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/server.go:1329
# 0xbd2d27 google.golang.org/grpc.(*Server).handleStream+0xc67 /universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/server.go:1409
# 0xbe0dea google.golang.org/grpc.(*Server).serveStreams.func1.1+0xaa /universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/server.go:746
1 @ 0x43d0c5 0x471832 0x101eb8f 0x474801
# 0x471831 time.Sleep+0xd1 /opt/go/src/runtime/time.go:193
# 0x101eb8e github.com/armon/go-metrics.(*Metrics).collectStats+0x2e /universe/src/github.com/actiontech/dtle/vendor/github.com/armon/go-metrics/metrics.go:230
1 @ 0x46e31d 0x7b644e 0x7b6225 0x7b2db2 0x7c47c5 0x7c6017 0x77b024 0x77cead 0x25dc37f 0x25dc102 0x25d44bc 0x77e6a3 0x779b0d 0x474801
# 0x46e31c runtime/pprof.runtime_goroutineProfileWithLabels+0x5c /opt/go/src/runtime/mprof.go:716
# 0x7b644d runtime/pprof.writeRuntimeProfile+0xcd /opt/go/src/runtime/pprof/pprof.go:724
# 0x7b6224 runtime/pprof.writeGoroutine+0xa4 /opt/go/src/runtime/pprof/pprof.go:684
# 0x7b2db1 runtime/pprof.(*Profile).WriteTo+0x3f1 /opt/go/src/runtime/pprof/pprof.go:331
# 0x7c47c4 net/http/pprof.handler.ServeHTTP+0x384 /opt/go/src/net/http/pprof/pprof.go:253
# 0x7c6016 net/http/pprof.Index+0x8d6 /opt/go/src/net/http/pprof/pprof.go:371
# 0x77b023 net/http.HandlerFunc.ServeHTTP+0x43 /opt/go/src/net/http/server.go:2049
# 0x77ceac net/http.(*ServeMux).ServeHTTP+0x1ac /opt/go/src/net/http/server.go:2428
# 0x25dc37e github.com/labstack/echo/v4.WrapHandler.func1+0x9e /universe/src/github.com/actiontech/dtle/vendor/github.com/labstack/echo/v4/echo.go:901
# 0x25dc101 github.com/labstack/echo/v4.(*Echo).add.func1+0x61 /universe/src/github.com/actiontech/dtle/vendor/github.com/labstack/echo/v4/echo.go:544
# 0x25d44bb github.com/labstack/echo/v4.(*Echo).ServeHTTP+0x2bb /universe/src/github.com/actiontech/dtle/vendor/github.com/labstack/echo/v4/echo.go:654
# 0x77e6a2 net/http.serverHandler.ServeHTTP+0xa2 /opt/go/src/net/http/server.go:2867
# 0x779b0c net/http.(*conn).serve+0x8cc /opt/go/src/net/http/server.go:1932
这个issue复现失败了
- 根据issue的步骤,插入的sql是4M的能够直接同步
- 我自己手动搞了个8G的事务,也同步成功了
drop table if exists big_tx;
create table if not exists big_tx (id int primary key auto_increment, val longtext);
set @a = repeat('a', 1*1024*1024);
DELIMITER //
CREATE PROCEDURE repeatCourse()
BEGIN
DECLARE num INT;
SET num = 1;
set @a = repeat('a', 1*1024*1024);
WHILE num < 2048 DO
insert into big_tx values (0, @a);
insert into big_tx values (0, @a);
SET num = num + 1;
END WHILE;
END //
CALL repeatCourse();
从日志看来似乎也没问题,看看有没有重现的步骤
已知
- 每行1MB
- 2048行, 共2GB
- 一次select完&发送&执行
分析日志得
# src: select 完毕
2022-08-30T12:39:06.073+0800 [INFO] client.driver_mgr.dtle: nRows == 0. dump finished.: driver=dtle @module=dumper chunkSize=100000 job=big_tx_1_mb-migration nRows=0 timestamp=2022-08-30T12:39:06.073+0800
# src: 1.5小时后发送消息, 消息大小2.6MB (考虑压缩)
2022-08-30T14:16:12.419+0800 [DEBUG] client.driver_mgr.dtle: publish. msg: driver=dtle gno=0 msgLen=2.674695e+06 spanLen=2.674695e+06 nSeg=1 subject=big_tx_1_mb-migration_full @module=dtle.extractor job=big_tx_1_mb-migration timestamp=2022-08-30T14:16:12.419+0800
# dst: 收到消息
2022-08-30T14:16:43.314+0800 [DEBUG] client.driver_mgr.dtle: NatsMsgMerger.Handle found ordinary msg: driver=dtle nmm=full @module=dtle.applier job=big_tx_1_mb-migration lenData=2.674695e+06 timestamp=2022-08-30T14:16:43.313+0800
# src: 发送_full_complete信号
2022-08-30T14:16:43.320+0800 [DEBUG] client.driver_mgr.dtle: publish: driver=dtle subject=big_tx_1_mb-migration_full_complete gno=0 job=big_tx_1_mb-migration @module=dtle.extractor iSeg=0 partLen=133 timestamp=2022-08-30T14:16:43.320+0800
# dst: 收到_full_complete
2022-08-30T14:16:43.321+0800 [DEBUG] client.driver_mgr.dtle: recv _full_complete.: driver=dtle @module=dtle.applier job=big_tx_1_mb-migration timestamp=2022-08-30T14:16:43.320+0800
# dst: nDumpEntry is not zero应该是1秒1次, 但不知道为什么, 后来没了
2022-08-30T14:16:43.321+0800 [DEBUG] client.driver_mgr.dtle: nDumpEntry is not zero, waiting: driver=dtle job=big_tx_1_mb-migration nDumpEntry=1 @module=dtle.applier timestamp=2022-08-30T14:16:43.320+0800
...
2022-08-30T14:18:57.198+0800 [DEBUG] client.driver_mgr.dtle: nDumpEntry is not zero, waiting: driver=dtle nDumpEntry=1 @module=dtle.applier job=big_tx_1_mb-migration timestamp=2022-08-30T14:18:57.198+0800
# src: err: nats: connection closed, 应该是_full_complete subject上的报错
2022-08-30T14:25:17.773+0800 [ERROR] client.driver_mgr.dtle: unexpected error on publish: driver=dtle @module=dtle.extractor err="nats: connection closed" job=big_tx_1_mb-migration timestamp=2022-08-30T14:25:17.772+0800
# src: 重启
2022-08-30T14:25:35.480+0800 [INFO] client.driver_mgr.dtle: StartTask: driver=dtle jobName=big_tx_1_mb-migration taskName=src @module=dtle ID=f9fb762f-229d-f48e-9a7d-faf554778315/src/c2d55da7 allocID=f9fb762f-229d-f48e-9a7d-faf554778315 timestamp=2022-08-30T14:25:35.480+0800
# dst: 跟随重启
2022-08-30T14:39:57.346+0800 [ERROR] client.driver_mgr.dtle: onError: driver=dtle job=big_tx_1_mb-migration @module=dtle.applier err="DstPutNats: NatsAddr changed to wait. will restart dst" timestamp=2022-08-30T14:39:57.346+0800
问题
- select完毕到发送消息, 时间间隔1.5h, 是否正常?
-
nDumpEntry is not zero, waiting
为何不打印了?- waiting结束的路径有两种, shutdown或
Rows copy complete
, 没有这些迹象.
- waiting结束的路径有两种, shutdown或
-
no binlogEntry for 10s
的打印也中断了
潜在优化点
全量select数据量大时, 分块发送.
测试机内存情况
$ free -h
total used free shared buff/cache available
Mem: 251G 219G 4.4G 15G 28G 15G
Swap: 8.0G 6.9G 1.1G
使用一个每次分配、填充128MB, 共分配16个的go程序, 出现了分配困难. 此时该程序CPU占用100%.
2022/09/02 17:56:16 alloc. before
2022/09/02 17:56:16 make. 0
2022/09/02 17:56:17 fill. 0
2022/09/02 17:56:17 make. 1
2022/09/02 17:56:17 fill. 1
2022/09/02 17:56:17 make. 2
2022/09/02 17:56:17 fill. 2
2022/09/02 17:56:17 make. 3
2022/09/02 17:59:41 fill. 3
2022/09/02 17:59:41 make. 4
将每次分配数量下调至4MB、2MB, 则一定程度上能继续分配.
411d21d2: 增量(大事务)也使用该参数进行分割