dtle icon indicating copy to clipboard operation
dtle copied to clipboard

mysql-mysql job full repl stage: big data cause the job to be stuck

Open asiroliu opened this issue 2 years ago • 2 comments

Description

mysql-mysql job full repl stage: big data cause the job to be stuck

Steps to reproduce the issue

  1. src MySQL insert 2GB data
drop table if exists big_tx;
create table if not exists big_tx (id int primary key auto_increment, val longtext);
set @a = repeat('a', 1*1024*1024);

begin;
insert into big_tx values (0, @a);
insert into big_tx values (0, @a);
insert into big_tx values (0, @a);
insert into big_tx values (0, @a);
--  2048 sqls, 2GB
commit;
  1. create dtle job, set chunk_size=10000
  2. after long time, check on dest MySQL
mysql> select count(*) from big_tx;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.00 sec)
  1. check src dtle log
022-03-24T17:07:35.938+0800 [DEBUG] client.driver_mgr.dtle: ShowCreateSchema: driver=dtle query="SHOW CREATE SCHEMA IF NOT EXISTS `action_db_1`" timestamp=2022-03-24T17:07:35.938+0800
2022-03-24T17:07:35.939+0800 [DEBUG] client.driver_mgr.dtle: ShowTables: driver=dtle query="SHOW TABLES IN `action_db_1`" timestamp=2022-03-24T17:07:35.938+0800
2022-03-24T17:07:35.939+0800 [INFO]  client.driver_mgr.dtle: ValidateOriginalTable: driver=dtle @module=inspector job=big_tx-migration where=true timestamp=2022-03-24T17:07:35.939+0800
2022-03-24T17:07:35.944+0800 [DEBUG] client.driver_mgr.dtle: Potential unique keys.: driver=dtle @module=inspector job=big_tx-migration schema=action_db_1 table=big_tx uniqueKeys=["map[Columns:map[Columns:[map[Charset: ColumnType: Default:<nil> EscapedName:`id` IsUnsigned:false Key: Nullable:false Precision:0 RawName:id Scale:0 TimezoneConversion:<nil> Type:0]] Ordinals:map[id:0] UniqueKeys:<nil>] HasNullable:false IsAutoIncrement:true LastMaxVals:[] Name:PRIMARY]"] timestamp=2022-03-24T17:07:35.943+0800
2022-03-24T17:07:35.944+0800 [DEBUG] client.driver_mgr.dtle: table has unique keys: driver=dtle schema=action_db_1 table=big_tx @module=inspector job=big_tx-migration n_unique_keys=1 timestamp=2022-03-24T17:07:35.944+0800
2022-03-24T17:07:35.944+0800 [DEBUG] client.driver_mgr.dtle: a unique key: driver=dtle @module=inspector job=big_tx-migration uk="PRIMARY (auto_increment): [id]; has nullable: false" timestamp=2022-03-24T17:07:35.944+0800
2022-03-24T17:07:35.945+0800 [INFO]  client.driver_mgr.dtle: chosen unique key: driver=dtle @module=inspector job=big_tx-migration schema=action_db_1 table=big_tx uk="PRIMARY (auto_increment): [id]; has nullable: false" timestamp=2022-03-24T17:07:35.945+0800
2022-03-24T17:07:35.945+0800 [INFO]  client.driver_mgr.dtle: Examining table structure on extractor: driver=dtle job=big_tx-migration @module=dtle.extractor timestamp=2022-03-24T17:07:35.945+0800
2022-03-24T17:07:35.945+0800 [DEBUG] client.driver_mgr.dtle: getSchemaTablesAndMeta. after.: driver=dtle @module=dtle.extractor job=big_tx-migration timestamp=2022-03-24T17:07:35.945+0800
2022-03-24T17:07:35.946+0800 [INFO]  client.driver_mgr.dtle: generating DROP and CREATE statements to reflect current database schemas: driver=dtle @module=dtle.extractor @warn="logging contained values that don't serialize to json" timestamp=2022-03-24T17:07:35.945+0800
2022-03-24T17:07:35.946+0800 [DEBUG] client.driver_mgr.dtle: publish. msg: driver=dtle gno=0 job=big_tx-migration msgLen=124 nSeg=1 spanLen=124 @module=dtle.extractor subject=big_tx-migration_full timestamp=2022-03-24T17:07:35.946+0800
2022-03-24T17:07:35.946+0800 [DEBUG] client.driver_mgr.dtle: publish: driver=dtle gno=0 iSeg=0 job=big_tx-migration partLen=124 subject=big_tx-migration_full @module=dtle.extractor timestamp=2022-03-24T17:07:35.946+0800
2022-03-24T17:07:35.947+0800 [DEBUG] client.driver_mgr.dtle: Estimated number of rows: driver=dtle @module=dtle.extractor job=big_tx-migration n=2042 schema=action_db_1 table=big_tx timestamp=2022-03-24T17:07:35.947+0800
2022-03-24T17:07:35.947+0800 [DEBUG] client.driver_mgr.dtle: publish. msg: driver=dtle gno=0 job=big_tx-migration msgLen=206 spanLen=206 @module=dtle.extractor nSeg=1 subject=big_tx-migration_full timestamp=2022-03-24T17:07:35.947+0800
2022-03-24T17:07:35.948+0800 [DEBUG] client.driver_mgr.dtle: publish: driver=dtle iSeg=0 partLen=206 subject=big_tx-migration_full gno=0 job=big_tx-migration @module=dtle.extractor timestamp=2022-03-24T17:07:35.947+0800
2022-03-24T17:07:35.948+0800 [INFO]  client.driver_mgr.dtle: Step: scanning contents of x tables: driver=dtle @module=dtle.extractor job=big_tx-migration n=3 x=1 timestamp=2022-03-24T17:07:35.948+0800
2022-03-24T17:07:35.948+0800 [INFO]  client.driver_mgr.dtle: Step n: - scanning table (i of N tables): driver=dtle @module=dtle.extractor N=1 job=big_tx-migration schema=action_db_1 i=1 n=3 table=big_tx timestamp=2022-03-24T17:07:35.948+0800
2022-03-24T17:07:35.948+0800 [DEBUG] client.driver_mgr.dtle: getChunkData.: driver=dtle @module=dumper job=big_tx-migration query="SELECT * FROM `action_db_1`.`big_tx` where (true) and (true) order by `id` asc LIMIT 100000" timestamp=2022-03-24T17:07:35.948+0800
2022-03-24T17:07:36.107+0800 [DEBUG] client: updated allocations: index=21 total=1 pulled=0 filtered=1
2022-03-24T17:07:36.108+0800 [DEBUG] client: allocation updates: added=0 removed=0 updated=0 ignored=1
2022-03-24T17:07:36.108+0800 [DEBUG] client: allocation updates applied: added=0 removed=0 updated=0 ignored=1 errors=0
2022-03-24T17:07:39.972+0800 [DEBUG] client.driver_mgr.dtle: getChunkData.: driver=dtle @module=dumper job=big_tx-migration n_row=2049 timestamp=2022-03-24T17:07:39.972+0800
2022-03-24T17:07:39.979+0800 [DEBUG] client.driver_mgr.dtle: GetLastMaxVal: driver=dtle @module=dumper job=big_tx-migration val=['2049'] timestamp=2022-03-24T17:07:39.978+0800
2022-03-24T17:07:39.979+0800 [DEBUG] client.driver_mgr.dtle: resultsChannel: driver=dtle @module=dumper job=big_tx-migration n=0 timestamp=2022-03-24T17:07:39.979+0800
2022-03-24T17:07:39.979+0800 [INFO]  client.driver_mgr.dtle: nRows < d.chunkSize.: driver=dtle nRows=2049 @module=dumper chunkSize=100000 job=big_tx-migration timestamp=2022-03-24T17:07:39.979+0800
2022-03-24T17:07:39.979+0800 [DEBUG] client.driver_mgr.dtle: getChunkData.: driver=dtle @module=dumper job=big_tx-migration query="SELECT * FROM `action_db_1`.`big_tx` where (((`id` > '2049'))) and (true) order by `id` asc LIMIT 100000" timestamp=2022-03-24T17:07:39.979+0800
2022-03-24T17:07:39.980+0800 [DEBUG] client.driver_mgr.dtle: getChunkData.: driver=dtle @module=dumper job=big_tx-migration n_row=0 timestamp=2022-03-24T17:07:39.980+0800
2022-03-24T17:07:39.980+0800 [INFO]  client.driver_mgr.dtle: nRows < d.chunkSize.: driver=dtle @module=dumper chunkSize=100000 job=big_tx-migration nRows=0 timestamp=2022-03-24T17:07:39.980+0800
2022-03-24T17:07:39.980+0800 [INFO]  client.driver_mgr.dtle: nRows == 0. dump finished.: driver=dtle @module=dumper chunkSize=100000 job=big_tx-migration nRows=0 timestamp=2022-03-24T17:07:39.980+0800
2022-03-24T17:07:45.961+0800 [DEBUG] client: updated allocations: index=23 total=1 pulled=0 filtered=1
2022-03-24T17:07:45.962+0800 [DEBUG] client: allocation updates: added=0 removed=0 updated=0 ignored=1
2022-03-24T17:07:45.962+0800 [DEBUG] client: allocation updates applied: added=0 removed=0 updated=0 ignored=1 errors=0
2022-03-24T17:07:50.906+0800 [DEBUG] client.driver_mgr.dtle: TimestampContext.GetDelay: driver=dtle @module=dtle.extractor delay=0 job=big_tx-migration timestamp=2022-03-24T17:07:50.906+0800
2022-03-24T17:08:08.633+0800 [DEBUG] client.driver_mgr.dtle: TimestampContext.GetDelay: driver=dtle @module=dtle.extractor delay=0 job=big_tx-migration timestamp=2022-03-24T17:08:08.632+0800
2022-03-24T17:08:24.675+0800 [DEBUG] client.driver_mgr.dtle: TimestampContext.GetDelay: driver=dtle @module=dtle.extractor delay=0 job=big_tx-migration timestamp=2022-03-24T17:08:24.674+0800
  1. check dest dtle log
2022-03-24T17:07:36.032+0800 [DEBUG] client.driver_mgr.dtle: ApplyEventQueries. exec: driver=dtle job=big_tx-migration query="CREATE DATABASE /*!32312 IF NOT EXISTS*/ `action_db_1` /*!40100 DEFAULT CHARACTER SET latin1 */" @module=dtle.applier timestamp=2022-03-24T17:07:36.032+0800
2022-03-24T17:07:36.033+0800 [DEBUG] client.driver_mgr.dtle: ApplyEventQueries. after: driver=dtle nDumpEntry=1 @module=dtle.applier job=big_tx-migration timestamp=2022-03-24T17:07:36.033+0800
2022-03-24T17:07:36.033+0800 [DEBUG] client.driver_mgr.dtle: ApplyEventQueries: driver=dtle job=big_tx-migration rows=0 schema= table= @module=dtle.applier timestamp=2022-03-24T17:07:36.033+0800
2022-03-24T17:07:36.033+0800 [DEBUG] client.driver_mgr.dtle: ApplyEventQueries. exec: driver=dtle @module=dtle.applier job=big_tx-migration query="DROP TABLE IF EXISTS `action_db_1`.`big_tx`" timestamp=2022-03-24T17:07:36.033+0800
2022-03-24T17:07:36.034+0800 [DEBUG] client.driver_mgr.dtle: ApplyEventQueries. exec: driver=dtle @module=dtle.applier job=big_tx-migration query="CREATE TABLE `action_db_1`.`big_tx` (`id` INT(20) NOT NULL AUTO_INCREMENT,`val` LONGTEXT,PRIMARY KEY(`id`)) ENGINE = InnoDB AUTO_INCREMENT = 2050 DEFAULT CHARACTER SET = UTF8MB4" timestamp=2022-03-24T17:07:36.034+0800
2022-03-24T17:07:36.037+0800 [DEBUG] client.driver_mgr.dtle: ApplyEventQueries. after: driver=dtle @module=dtle.applier job=big_tx-migration nDumpEntry=0 timestamp=2022-03-24T17:07:36.037+0800
2022-03-24T17:07:36.196+0800 [DEBUG] client: updated allocations: index=22 total=1 pulled=0 filtered=1
2022-03-24T17:07:36.196+0800 [DEBUG] client: allocation updates: added=0 removed=0 updated=0 ignored=1
2022-03-24T17:07:36.196+0800 [DEBUG] client: allocation updates applied: added=0 removed=0 updated=0 ignored=1 errors=0
2022-03-24T17:07:46.031+0800 [DEBUG] client.driver_mgr.dtle: no binlogEntry for 10s: driver=dtle @module=dtle.applier job=big_tx-migration timestamp=2022-03-24T17:07:46.031+0800
2022-03-24T17:07:46.049+0800 [DEBUG] client: updated allocations: index=24 total=1 pulled=0 filtered=1
2022-03-24T17:07:46.049+0800 [DEBUG] client: allocation updates: added=0 removed=0 updated=0 ignored=1
2022-03-24T17:07:46.049+0800 [DEBUG] client: allocation updates applied: added=0 removed=0 updated=0 ignored=1 errors=0
2022-03-24T17:07:50.904+0800 [DEBUG] client.driver_mgr.dtle: Stats: driver=dtle @module=dtle.applier job=big_tx-migration timestamp=2022-03-24T17:07:50.903+0800
2022-03-24T17:07:50.904+0800 [DEBUG] client.driver_mgr.dtle: TimestampContext.GetDelay: driver=dtle delay=0 job=big_tx-migration @module=dtle.applier timestamp=2022-03-24T17:07:50.903+0800
2022-03-24T17:07:50.953+0800 [DEBUG] client.driver_mgr.dtle: SaveGtidForJob: driver=dtle @module=dtle.applier gtid= job=big_tx-migration timestamp=2022-03-24T17:07:50.953+0800
2022-03-24T17:07:55.953+0800 [DEBUG] client.driver_mgr.dtle: no copyRows for 10s.: driver=dtle @module=dtle.applier job=big_tx-migration timestamp=2022-03-24T17:07:55.952+0800
2022-03-24T17:07:56.031+0800 [DEBUG] client.driver_mgr.dtle: no binlogEntry for 10s: driver=dtle @module=dtle.applier job=big_tx-migration timestamp=2022-03-24T17:07:56.031+0800
2

Output of ./dtle version:**

9.9.9.9-master-7ca85ee

asiroliu avatar Mar 24 '22 09:03 asiroliu

/debug/pprof/goroutine?debug=1

goroutine profile: total 64
8 @ 0x43d0c5 0x470a18 0x4709ee 0x48c839 0x179a5d6 0x474801
#	0x4709ed	sync.runtime_notifyListWait+0xcd			/opt/go/src/runtime/sema.go:513
#	0x48c838	sync.(*Cond).Wait+0x98					/opt/go/src/sync/cond.go:56
#	0x179a5d5	github.com/nats-io/nats%2ego.(*Conn).waitForMsgs+0x635	/universe/src/github.com/actiontech/dtle/vendor/github.com/nats-io/nats.go/nats.go:2626

4 @ 0x43d0c5 0x40854f 0x4081cb 0x179bacf 0x474801
#	0x179bace	github.com/nats-io/nats%2ego.(*Conn).flusher+0xee	/universe/src/github.com/actiontech/dtle/vendor/github.com/nats-io/nats.go/nats.go:2978

4 @ 0x43d0c5 0x435331 0x46e775 0x4e7645 0x4e8735 0x4e8717 0x651eaf 0x666c31 0x150a0cc 0x170fa65 0x474801
#	0x46e774	internal/poll.runtime_pollWait+0x54						/opt/go/src/runtime/netpoll.go:227
#	0x4e7644	internal/poll.(*pollDesc).wait+0x44						/opt/go/src/internal/poll/fd_poll_runtime.go:87
#	0x4e8734	internal/poll.(*pollDesc).waitRead+0x1d4					/opt/go/src/internal/poll/fd_poll_runtime.go:92
#	0x4e8716	internal/poll.(*FD).Read+0x1b6							/opt/go/src/internal/poll/fd_unix.go:166
#	0x651eae	net.(*netFD).Read+0x4e								/opt/go/src/net/fd_posix.go:55
#	0x666c30	net.(*conn).Read+0x90								/opt/go/src/net/net.go:183
#	0x150a0cb	github.com/nats-io/nats-server/v2/server.(*client).readLoop+0x110b		/universe/src/github.com/actiontech/dtle/vendor/github.com/nats-io/nats-server/v2/server/client.go:1187
#	0x170fa64	github.com/nats-io/nats-server/v2/server.(*Server).createClient.func1+0x44	/universe/src/github.com/actiontech/dtle/vendor/github.com/nats-io/nats-server/v2/server/server.go:2568

4 @ 0x43d0c5 0x435331 0x46e775 0x4e7645 0x4e8735 0x4e8717 0x651eaf 0x666c31 0x1795398 0x1799e48 0x474801
#	0x46e774	internal/poll.runtime_pollWait+0x54			/opt/go/src/runtime/netpoll.go:227
#	0x4e7644	internal/poll.(*pollDesc).wait+0x44			/opt/go/src/internal/poll/fd_poll_runtime.go:87
#	0x4e8734	internal/poll.(*pollDesc).waitRead+0x1d4		/opt/go/src/internal/poll/fd_poll_runtime.go:92
#	0x4e8716	internal/poll.(*FD).Read+0x1b6				/opt/go/src/internal/poll/fd_unix.go:166
#	0x651eae	net.(*netFD).Read+0x4e					/opt/go/src/net/fd_posix.go:55
#	0x666c30	net.(*conn).Read+0x90					/opt/go/src/net/net.go:183
#	0x1795397	github.com/nats-io/nats%2ego.(*natsReader).Read+0xb7	/universe/src/github.com/actiontech/dtle/vendor/github.com/nats-io/nats.go/nats.go:1649
#	0x1799e47	github.com/nats-io/nats%2ego.(*Conn).readLoop+0xc7	/universe/src/github.com/actiontech/dtle/vendor/github.com/nats-io/nats.go/nats.go:2584

4 @ 0x43d0c5 0x470a18 0x4709ee 0x48c839 0x1508afe 0x170faaa 0x474801
#	0x4709ed	sync.runtime_notifyListWait+0xcd						/opt/go/src/runtime/sema.go:513
#	0x48c838	sync.(*Cond).Wait+0x98								/opt/go/src/sync/cond.go:56
#	0x1508afd	github.com/nats-io/nats-server/v2/server.(*client).writeLoop+0x23d		/universe/src/github.com/actiontech/dtle/vendor/github.com/nats-io/nats-server/v2/server/client.go:1061
#	0x170faa9	github.com/nats-io/nats-server/v2/server.(*Server).createClient.func2+0x29	/universe/src/github.com/actiontech/dtle/vendor/github.com/nats-io/nats-server/v2/server/server.go:2571

4 @ 0x43d0c5 0x470a18 0x4709ee 0x48c839 0x1799ab5 0x474801
#	0x4709ed	sync.runtime_notifyListWait+0xcd						/opt/go/src/runtime/sema.go:513
#	0x48c838	sync.(*Cond).Wait+0x98								/opt/go/src/sync/cond.go:56
#	0x1799ab4	github.com/nats-io/nats%2ego.(*asyncCallbacksHandler).asyncCBDispatcher+0x34	/universe/src/github.com/actiontech/dtle/vendor/github.com/nats-io/nats.go/nats.go:2509

3 @ 0x43d0c5 0x435331 0x46e775 0x4e7645 0x4e8735 0x4e8717 0x651eaf 0x666c31 0x773819 0x544ce8 0x545a5d 0x545cb4 0x705c96 0x76da4a 0x76da4b 0x774cbd 0x779945 0x474801
#	0x46e774	internal/poll.runtime_pollWait+0x54		/opt/go/src/runtime/netpoll.go:227
#	0x4e7644	internal/poll.(*pollDesc).wait+0x44		/opt/go/src/internal/poll/fd_poll_runtime.go:87
#	0x4e8734	internal/poll.(*pollDesc).waitRead+0x1d4	/opt/go/src/internal/poll/fd_poll_runtime.go:92
#	0x4e8716	internal/poll.(*FD).Read+0x1b6			/opt/go/src/internal/poll/fd_unix.go:166
#	0x651eae	net.(*netFD).Read+0x4e				/opt/go/src/net/fd_posix.go:55
#	0x666c30	net.(*conn).Read+0x90				/opt/go/src/net/net.go:183
#	0x773818	net/http.(*connReader).Read+0x1b8		/opt/go/src/net/http/server.go:780
#	0x544ce7	bufio.(*Reader).fill+0x107			/opt/go/src/bufio/bufio.go:101
#	0x545a5c	bufio.(*Reader).ReadSlice+0x3c			/opt/go/src/bufio/bufio.go:360
#	0x545cb3	bufio.(*Reader).ReadLine+0x33			/opt/go/src/bufio/bufio.go:389
#	0x705c95	net/textproto.(*Reader).readLineSlice+0xd5	/opt/go/src/net/textproto/reader.go:57
#	0x76da49	net/textproto.(*Reader).ReadLine+0xa9		/opt/go/src/net/textproto/reader.go:38
#	0x76da4a	net/http.readRequest+0xaa			/opt/go/src/net/http/request.go:1027
#	0x774cbc	net/http.(*conn).readRequest+0x19c		/opt/go/src/net/http/server.go:966
#	0x779944	net/http.(*conn).serve+0x704			/opt/go/src/net/http/server.go:1858

2 @ 0x43d0c5 0x435331 0x46e775 0x4e7645 0x4e8735 0x4e8717 0x4f1cb7 0x4f1c87 0x545382 0x1372f65 0x474801
#	0x46e774	internal/poll.runtime_pollWait+0x54		/opt/go/src/runtime/netpoll.go:227
#	0x4e7644	internal/poll.(*pollDesc).wait+0x44		/opt/go/src/internal/poll/fd_poll_runtime.go:87
#	0x4e8734	internal/poll.(*pollDesc).waitRead+0x1d4	/opt/go/src/internal/poll/fd_poll_runtime.go:92
#	0x4e8716	internal/poll.(*FD).Read+0x1b6			/opt/go/src/internal/poll/fd_unix.go:166
#	0x4f1cb6	os.(*File).read+0x76				/opt/go/src/os/file_posix.go:31
#	0x4f1c86	os.(*File).Read+0x46				/opt/go/src/os/file.go:117
#	0x545381	bufio.(*Reader).Read+0x221			/opt/go/src/bufio/bufio.go:227
#	0x1372f64	github.com/hashicorp/go-plugin.copyChan+0xc4	/universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/go-plugin/grpc_stdio.go:181

1 @ 0x40e414 0x470f85 0x7c73e5 0x474801
#	0x470f84	os/signal.signal_recv+0xa4	/opt/go/src/runtime/sigqueue.go:168
#	0x7c73e4	os/signal.loop+0x24		/opt/go/src/os/signal/signal_unix.go:23

1 @ 0x43d0c5 0x40854f 0x40818b 0x1378d29 0x474801
#	0x1378d28	github.com/hashicorp/go-plugin.Serve.func3+0x88	/universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/go-plugin/server.go:433

1 @ 0x43d0c5 0x40854f 0x40818b 0x1478ed8 0x13f6696 0xbd1658 0xbd2d28 0xbe0deb 0x474801
#	0x1478ed7	github.com/hashicorp/nomad/plugins/drivers.(*driverPluginServer).TaskEvents+0x97	/universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/nomad/plugins/drivers/server.go:358
#	0x13f6695	github.com/hashicorp/nomad/plugins/drivers/proto._Driver_TaskEvents_Handler+0x115	/universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/nomad/plugins/drivers/proto/driver.pb.go:4530
#	0xbd1657	google.golang.org/grpc.(*Server).processStreamingRPC+0xcd7				/universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/server.go:1329
#	0xbd2d27	google.golang.org/grpc.(*Server).handleStream+0xc67					/universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/server.go:1409
#	0xbe0dea	google.golang.org/grpc.(*Server).serveStreams.func1.1+0xaa				/universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/server.go:746

1 @ 0x43d0c5 0x40854f 0x40818b 0x855dc9 0x474801
#	0x855dc8	github.com/actiontech/dtle/g.DumpLoop+0xa8	/universe/src/github.com/actiontech/dtle/g/dump.go:21

1 @ 0x43d0c5 0x40854f 0x40818b 0x856c2f 0x474801
#	0x856c2e	github.com/actiontech/dtle/g.FreeMemoryWorker+0x4e	/universe/src/github.com/actiontech/dtle/g/g.go:93

1 @ 0x43d0c5 0x40854f 0x40818b 0x85706b 0x474801
#	0x85706a	github.com/actiontech/dtle/g.MemoryMonitor+0x6a	/universe/src/github.com/actiontech/dtle/g/g.go:142

1 @ 0x43d0c5 0x435331 0x46e775 0x4e7645 0x4e8735 0x4e8717 0x651eaf 0x666c31 0x545382 0x4bb1a7 0xb51d69 0xb51d02 0xb525e5 0xb8bc3b 0xbcd282 0xbe0cff 0x474801
#	0x46e774	internal/poll.runtime_pollWait+0x54						/opt/go/src/runtime/netpoll.go:227
#	0x4e7644	internal/poll.(*pollDesc).wait+0x44						/opt/go/src/internal/poll/fd_poll_runtime.go:87
#	0x4e8734	internal/poll.(*pollDesc).waitRead+0x1d4					/opt/go/src/internal/poll/fd_poll_runtime.go:92
#	0x4e8716	internal/poll.(*FD).Read+0x1b6							/opt/go/src/internal/poll/fd_unix.go:166
#	0x651eae	net.(*netFD).Read+0x4e								/opt/go/src/net/fd_posix.go:55
#	0x666c30	net.(*conn).Read+0x90								/opt/go/src/net/net.go:183
#	0x545381	bufio.(*Reader).Read+0x221							/opt/go/src/bufio/bufio.go:227
#	0x4bb1a6	io.ReadAtLeast+0x86								/opt/go/src/io/io.go:328
#	0xb51d68	io.ReadFull+0x88								/opt/go/src/io/io.go:347
#	0xb51d01	golang.org/x/net/http2.readFrameHeader+0x21					/universe/src/github.com/actiontech/dtle/vendor/golang.org/x/net/http2/frame.go:237
#	0xb525e4	golang.org/x/net/http2.(*Framer).ReadFrame+0xa4					/universe/src/github.com/actiontech/dtle/vendor/golang.org/x/net/http2/frame.go:498
#	0xb8bc3a	google.golang.org/grpc/internal/transport.(*http2Server).HandleStreams+0x9a	/universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/internal/transport/http2_server.go:453
#	0xbcd281	google.golang.org/grpc.(*Server).serveStreams+0xe1				/universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/server.go:742
#	0xbe0cfe	google.golang.org/grpc.(*Server).handleRawConn.func1+0x3e			/universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/server.go:703

1 @ 0x43d0c5 0x435331 0x46e775 0x4e7645 0x4e8735 0x4e8717 0x651eaf 0x666c31 0x773298 0x474801
#	0x46e774	internal/poll.runtime_pollWait+0x54		/opt/go/src/runtime/netpoll.go:227
#	0x4e7644	internal/poll.(*pollDesc).wait+0x44		/opt/go/src/internal/poll/fd_poll_runtime.go:87
#	0x4e8734	internal/poll.(*pollDesc).waitRead+0x1d4	/opt/go/src/internal/poll/fd_poll_runtime.go:92
#	0x4e8716	internal/poll.(*FD).Read+0x1b6			/opt/go/src/internal/poll/fd_unix.go:166
#	0x651eae	net.(*netFD).Read+0x4e				/opt/go/src/net/fd_posix.go:55
#	0x666c30	net.(*conn).Read+0x90				/opt/go/src/net/net.go:183
#	0x773297	net/http.(*connReader).backgroundRead+0x57	/opt/go/src/net/http/server.go:672

1 @ 0x43d0c5 0x435331 0x46e775 0x4e7645 0x4ea4b2 0x4ea494 0x6534e5 0x673172 0x671dc5 0x25d6c8f 0x77ea85 0x25d4853 0x2916c95 0x474801
#	0x46e774	internal/poll.runtime_pollWait+0x54					/opt/go/src/runtime/netpoll.go:227
#	0x4e7644	internal/poll.(*pollDesc).wait+0x44					/opt/go/src/internal/poll/fd_poll_runtime.go:87
#	0x4ea4b1	internal/poll.(*pollDesc).waitRead+0x211				/opt/go/src/internal/poll/fd_poll_runtime.go:92
#	0x4ea493	internal/poll.(*FD).Accept+0x1f3					/opt/go/src/internal/poll/fd_unix.go:401
#	0x6534e4	net.(*netFD).accept+0x44						/opt/go/src/net/fd_unix.go:172
#	0x673171	net.(*TCPListener).accept+0x31						/opt/go/src/net/tcpsock_posix.go:139
#	0x671dc4	net.(*TCPListener).AcceptTCP+0x64					/opt/go/src/net/tcpsock.go:248
#	0x25d6c8e	github.com/labstack/echo/v4.tcpKeepAliveListener.Accept+0x2e		/universe/src/github.com/actiontech/dtle/vendor/github.com/labstack/echo/v4/echo.go:963
#	0x77ea84	net/http.(*Server).Serve+0x284						/opt/go/src/net/http/server.go:2961
#	0x25d4852	github.com/labstack/echo/v4.(*Echo).Start+0xf2				/universe/src/github.com/actiontech/dtle/vendor/github.com/labstack/echo/v4/echo.go:671
#	0x2916c94	github.com/actiontech/dtle/drivers/api.SetupApiServer.func3+0x74	/universe/src/github.com/actiontech/dtle/drivers/api/route.go:179

1 @ 0x43d0c5 0x435331 0x46e775 0x4e7645 0x4ea4b2 0x4ea494 0x6534e5 0x673172 0x671f65 0x16bc1a2 0x474801
#	0x46e774	internal/poll.runtime_pollWait+0x54						/opt/go/src/runtime/netpoll.go:227
#	0x4e7644	internal/poll.(*pollDesc).wait+0x44						/opt/go/src/internal/poll/fd_poll_runtime.go:87
#	0x4ea4b1	internal/poll.(*pollDesc).waitRead+0x211					/opt/go/src/internal/poll/fd_poll_runtime.go:92
#	0x4ea493	internal/poll.(*FD).Accept+0x1f3						/opt/go/src/internal/poll/fd_unix.go:401
#	0x6534e4	net.(*netFD).accept+0x44							/opt/go/src/net/fd_unix.go:172
#	0x673171	net.(*TCPListener).accept+0x31							/opt/go/src/net/tcpsock_posix.go:139
#	0x671f64	net.(*TCPListener).Accept+0x64							/opt/go/src/net/tcpsock.go:261
#	0x16bc1a1	github.com/nats-io/nats-server/v2/server.(*Server).acceptConnections+0x41	/universe/src/github.com/actiontech/dtle/vendor/github.com/nats-io/nats-server/v2/server/server.go:2088

1 @ 0x43d0c5 0x435331 0x46e775 0x4e7645 0x4ea4b2 0x4ea494 0x6534e5 0x6797f2 0x677a45 0xbcc07f 0x137298e 0x474801
#	0x46e774	internal/poll.runtime_pollWait+0x54			/opt/go/src/runtime/netpoll.go:227
#	0x4e7644	internal/poll.(*pollDesc).wait+0x44			/opt/go/src/internal/poll/fd_poll_runtime.go:87
#	0x4ea4b1	internal/poll.(*pollDesc).waitRead+0x211		/opt/go/src/internal/poll/fd_poll_runtime.go:92
#	0x4ea493	internal/poll.(*FD).Accept+0x1f3			/opt/go/src/internal/poll/fd_unix.go:401
#	0x6534e4	net.(*netFD).accept+0x44				/opt/go/src/net/fd_unix.go:172
#	0x6797f1	net.(*UnixListener).accept+0x31				/opt/go/src/net/unixsock_posix.go:162
#	0x677a44	net.(*UnixListener).Accept+0x64				/opt/go/src/net/unixsock.go:260
#	0xbcc07e	google.golang.org/grpc.(*Server).Serve+0x27e		/universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/server.go:621
#	0x137298d	github.com/hashicorp/go-plugin.(*GRPCServer).Serve+0x6d	/universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/go-plugin/grpc_server.go:138

1 @ 0x43d0c5 0x44e2d7 0x1370686 0x137141a 0x474801
#	0x1370685	github.com/hashicorp/go-plugin.(*gRPCBrokerServer).Recv+0x85	/universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/go-plugin/grpc_broker.go:121
#	0x1371419	github.com/hashicorp/go-plugin.(*GRPCBroker).Run+0x79		/universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/go-plugin/grpc_broker.go:411

1 @ 0x43d0c5 0x44e2d7 0x1372d65 0x135def3 0xbd1658 0xbd2d28 0xbe0deb 0x474801
#	0x1372d64	github.com/hashicorp/go-plugin.(*grpcStdioServer).StreamStdio+0x144			/universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/go-plugin/grpc_stdio.go:58
#	0x135def2	github.com/hashicorp/go-plugin/internal/plugin._GRPCStdio_StreamStdio_Handler+0x112	/universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/go-plugin/internal/plugin/grpc_stdio.pb.go:185
#	0xbd1657	google.golang.org/grpc.(*Server).processStreamingRPC+0xcd7				/universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/server.go:1329
#	0xbd2d27	google.golang.org/grpc.(*Server).handleStream+0xc67					/universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/server.go:1409
#	0xbe0dea	google.golang.org/grpc.(*Server).serveStreams.func1.1+0xaa				/universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/server.go:746

1 @ 0x43d0c5 0x44e2d7 0x13769d4 0x147631f 0x2917f48 0x2917dde 0x2918237 0x43cc96 0x474801
#	0x13769d3	github.com/hashicorp/go-plugin.Serve+0x953		/universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/go-plugin/server.go:468
#	0x147631e	github.com/hashicorp/nomad/plugins/drivers.Serve+0x23e	/universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/nomad/plugins/drivers/plugin.go:53
#	0x2917f47	github.com/hashicorp/nomad/plugins.serve+0x147		/universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/nomad/plugins/serve.go:47
#	0x2917ddd	github.com/hashicorp/nomad/plugins.Serve+0xbd		/universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/nomad/plugins/serve.go:26
#	0x2918236	main.main+0x2d6						/universe/src/github.com/actiontech/dtle/cmd/nomad-plugin/main.go:30
#	0x43cc95	runtime.main+0x255					/opt/go/src/runtime/proc.go:225

1 @ 0x43d0c5 0x44e2d7 0x13787cf 0x474801
#	0x13787ce	github.com/hashicorp/go-plugin.(*gRPCBrokerServer).StartStream.func1+0x10e	/universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/go-plugin/grpc_broker.go:70

1 @ 0x43d0c5 0x44e2d7 0x1476b65 0x13f52d3 0xbd1658 0xbd2d28 0xbe0deb 0x474801
#	0x1476b64	github.com/hashicorp/nomad/plugins/drivers.(*driverPluginServer).Fingerprint+0x124	/universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/nomad/plugins/drivers/server.go:75
#	0x13f52d2	github.com/hashicorp/nomad/plugins/drivers/proto._Driver_Fingerprint_Handler+0x112	/universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/nomad/plugins/drivers/proto/driver.pb.go:4380
#	0xbd1657	google.golang.org/grpc.(*Server).processStreamingRPC+0xcd7				/universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/server.go:1329
#	0xbd2d27	google.golang.org/grpc.(*Server).handleStream+0xc67					/universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/server.go:1409
#	0xbe0dea	google.golang.org/grpc.(*Server).serveStreams.func1.1+0xaa				/universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/server.go:746

1 @ 0x43d0c5 0x44e2d7 0x148094d 0x474801
#	0x148094c	github.com/hashicorp/nomad/drivers/shared/eventer.(*Eventer).eventLoop+0xec	/universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/nomad/drivers/shared/eventer/eventer.go:66

1 @ 0x43d0c5 0x44e2d7 0x153f6d7 0x474801
#	0x153f6d6	github.com/nats-io/nats-server/v2/server.(*Server).internalSendLoop+0x116	/universe/src/github.com/actiontech/dtle/vendor/github.com/nats-io/nats-server/v2/server/events.go:319

1 @ 0x43d0c5 0x44e2d7 0x16afec5 0x474801
#	0x16afec4	github.com/nats-io/nats-server/v2/server.(*sendq).internalLoop+0x464	/universe/src/github.com/actiontech/dtle/vendor/github.com/nats-io/nats-server/v2/server/sendq.go:54

1 @ 0x43d0c5 0x44e2d7 0x16c833c 0x474801
#	0x16c833b	github.com/nats-io/nats-server/v2/server.(*srvIPQueueLogger).run+0x11b	/universe/src/github.com/actiontech/dtle/vendor/github.com/nats-io/nats-server/v2/server/server.go:3645

1 @ 0x43d0c5 0x44e2d7 0x17008c5 0x474801
#	0x17008c4	github.com/nats-io/nats-server/v2/server.(*Server).startGWReplyMapExpiration.func1+0x124	/universe/src/github.com/actiontech/dtle/vendor/github.com/nats-io/nats-server/v2/server/gateway.go:3143

1 @ 0x43d0c5 0x44e2d7 0x1710389 0x474801
#	0x1710388	github.com/nats-io/nats-server/v2/server.(*Server).handleSignals.func1+0xa8	/universe/src/github.com/actiontech/dtle/vendor/github.com/nats-io/nats-server/v2/server/signal.go:48

1 @ 0x43d0c5 0x44e2d7 0x18459ef 0x474801
#	0x18459ee	github.com/nats-io/nats-streaming-server/server.(*StanServer).ioLoop+0x74e	/universe/src/github.com/actiontech/dtle/vendor/github.com/nats-io/nats-streaming-server/server/server.go:4379

1 @ 0x43d0c5 0x44e2d7 0x184bf85 0x474801
#	0x184bf84	github.com/nats-io/nats-streaming-server/server.(*StanServer).processSubscriptionsStart+0xe4	/universe/src/github.com/actiontech/dtle/vendor/github.com/nats-io/nats-streaming-server/server/server.go:5393

1 @ 0x43d0c5 0x44e2d7 0x242718d 0x474801
#	0x242718c	go.opencensus.io/stats/view.(*worker).start+0xcc	/universe/src/github.com/actiontech/dtle/vendor/go.opencensus.io/stats/view/worker.go:276

1 @ 0x43d0c5 0x44e2d7 0x27e325d 0x474801
#	0x27e325c	github.com/actiontech/dtle/drivers/mysql.(*Driver).handleFingerprint+0x13c	/universe/src/github.com/actiontech/dtle/drivers/mysql/driver.go:404

1 @ 0x43d0c5 0x44e2d7 0xb76eff 0xb7785d 0xb9a017 0x474801
#	0xb76efe	google.golang.org/grpc/internal/transport.(*controlBuffer).get+0xfe	/universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/internal/transport/controlbuf.go:395
#	0xb7785c	google.golang.org/grpc/internal/transport.(*loopyWriter).run+0x1dc	/universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/internal/transport/controlbuf.go:513
#	0xb9a016	google.golang.org/grpc/internal/transport.newHTTP2Server.func2+0xd6	/universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/internal/transport/http2_server.go:291

1 @ 0x43d0c5 0x44e2d7 0xb90145 0x474801
#	0xb90144	google.golang.org/grpc/internal/transport.(*http2Server).keepalive+0x264	/universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/internal/transport/http2_server.go:966

1 @ 0x43d0c5 0x44e2d7 0xb960c5 0xb95f91 0xb96ff5 0x4bb1a7 0xb96f32 0xb96eef 0xbc8523 0xbc914d 0xbc9a5b 0xbdd856 0x135ce82 0x13701ea 0x135cd4d 0xbd1658 0xbd2d28 0xbe0deb 0x474801
#	0xb960c4	google.golang.org/grpc/internal/transport.(*recvBufferReader).read+0xa4			/universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/internal/transport/transport.go:174
#	0xb95f90	google.golang.org/grpc/internal/transport.(*recvBufferReader).Read+0x210		/universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/internal/transport/transport.go:168
#	0xb96ff4	google.golang.org/grpc/internal/transport.(*transportReader).Read+0x54			/universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/internal/transport/transport.go:479
#	0x4bb1a6	io.ReadAtLeast+0x86									/opt/go/src/io/io.go:328
#	0xb96f31	io.ReadFull+0xd1									/opt/go/src/io/io.go:347
#	0xb96eee	google.golang.org/grpc/internal/transport.(*Stream).Read+0x8e				/universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/internal/transport/transport.go:463
#	0xbc8522	google.golang.org/grpc.(*parser).recvMsg+0x62						/universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/rpc_util.go:510
#	0xbc914c	google.golang.org/grpc.recvAndDecompress+0x4c						/universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/rpc_util.go:641
#	0xbc9a5a	google.golang.org/grpc.recv+0x9a							/universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/rpc_util.go:709
#	0xbdd855	google.golang.org/grpc.(*serverStream).RecvMsg+0x155					/universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/stream.go:1473
#	0x135ce81	github.com/hashicorp/go-plugin/internal/plugin.(*gRPCBrokerStartStreamServer).Recv+0x61	/universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/go-plugin/internal/plugin/grpc_broker.pb.go:167
#	0x13701e9	github.com/hashicorp/go-plugin.(*gRPCBrokerServer).StartStream+0x109			/universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/go-plugin/grpc_broker.go:84
#	0x135cd4c	github.com/hashicorp/go-plugin/internal/plugin._GRPCBroker_StartStream_Handler+0xac	/universe/src/github.com/actiontech/dtle/vendor/github.com/hashicorp/go-plugin/internal/plugin/grpc_broker.pb.go:148
#	0xbd1657	google.golang.org/grpc.(*Server).processStreamingRPC+0xcd7				/universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/server.go:1329
#	0xbd2d27	google.golang.org/grpc.(*Server).handleStream+0xc67					/universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/server.go:1409
#	0xbe0dea	google.golang.org/grpc.(*Server).serveStreams.func1.1+0xaa				/universe/src/github.com/actiontech/dtle/vendor/google.golang.org/grpc/server.go:746

1 @ 0x43d0c5 0x471832 0x101eb8f 0x474801
#	0x471831	time.Sleep+0xd1							/opt/go/src/runtime/time.go:193
#	0x101eb8e	github.com/armon/go-metrics.(*Metrics).collectStats+0x2e	/universe/src/github.com/actiontech/dtle/vendor/github.com/armon/go-metrics/metrics.go:230

1 @ 0x46e31d 0x7b644e 0x7b6225 0x7b2db2 0x7c47c5 0x7c6017 0x77b024 0x77cead 0x25dc37f 0x25dc102 0x25d44bc 0x77e6a3 0x779b0d 0x474801
#	0x46e31c	runtime/pprof.runtime_goroutineProfileWithLabels+0x5c	/opt/go/src/runtime/mprof.go:716
#	0x7b644d	runtime/pprof.writeRuntimeProfile+0xcd			/opt/go/src/runtime/pprof/pprof.go:724
#	0x7b6224	runtime/pprof.writeGoroutine+0xa4			/opt/go/src/runtime/pprof/pprof.go:684
#	0x7b2db1	runtime/pprof.(*Profile).WriteTo+0x3f1			/opt/go/src/runtime/pprof/pprof.go:331
#	0x7c47c4	net/http/pprof.handler.ServeHTTP+0x384			/opt/go/src/net/http/pprof/pprof.go:253
#	0x7c6016	net/http/pprof.Index+0x8d6				/opt/go/src/net/http/pprof/pprof.go:371
#	0x77b023	net/http.HandlerFunc.ServeHTTP+0x43			/opt/go/src/net/http/server.go:2049
#	0x77ceac	net/http.(*ServeMux).ServeHTTP+0x1ac			/opt/go/src/net/http/server.go:2428
#	0x25dc37e	github.com/labstack/echo/v4.WrapHandler.func1+0x9e	/universe/src/github.com/actiontech/dtle/vendor/github.com/labstack/echo/v4/echo.go:901
#	0x25dc101	github.com/labstack/echo/v4.(*Echo).add.func1+0x61	/universe/src/github.com/actiontech/dtle/vendor/github.com/labstack/echo/v4/echo.go:544
#	0x25d44bb	github.com/labstack/echo/v4.(*Echo).ServeHTTP+0x2bb	/universe/src/github.com/actiontech/dtle/vendor/github.com/labstack/echo/v4/echo.go:654
#	0x77e6a2	net/http.serverHandler.ServeHTTP+0xa2			/opt/go/src/net/http/server.go:2867
#	0x779b0c	net/http.(*conn).serve+0x8cc				/opt/go/src/net/http/server.go:1932

asiroliu avatar Mar 24 '22 09:03 asiroliu

这个issue复现失败了

  1. 根据issue的步骤,插入的sql是4M的能够直接同步
  2. 我自己手动搞了个8G的事务,也同步成功了
drop table if exists big_tx;
create table if not exists big_tx (id int primary key auto_increment, val longtext);
set @a = repeat('a', 1*1024*1024);

DELIMITER //
CREATE PROCEDURE repeatCourse()
BEGIN
DECLARE num INT;
SET num = 1;
set @a = repeat('a', 1*1024*1024);
WHILE num < 2048 DO
	insert into big_tx values (0, @a);
	insert into big_tx values (0, @a);
SET num = num + 1;
END WHILE;
END //

CALL repeatCourse();

从日志看来似乎也没问题,看看有没有重现的步骤

LordofAvernus avatar Apr 01 '22 12:04 LordofAvernus

已知

  • 每行1MB
  • 2048行, 共2GB
  • 一次select完&发送&执行

分析日志得

# src: select 完毕
2022-08-30T12:39:06.073+0800 [INFO]  client.driver_mgr.dtle: nRows == 0. dump finished.: driver=dtle @module=dumper chunkSize=100000 job=big_tx_1_mb-migration nRows=0 timestamp=2022-08-30T12:39:06.073+0800
# src: 1.5小时后发送消息, 消息大小2.6MB (考虑压缩)
2022-08-30T14:16:12.419+0800 [DEBUG] client.driver_mgr.dtle: publish. msg: driver=dtle gno=0 msgLen=2.674695e+06 spanLen=2.674695e+06 nSeg=1 subject=big_tx_1_mb-migration_full @module=dtle.extractor job=big_tx_1_mb-migration timestamp=2022-08-30T14:16:12.419+0800
# dst: 收到消息
2022-08-30T14:16:43.314+0800 [DEBUG] client.driver_mgr.dtle: NatsMsgMerger.Handle found ordinary msg: driver=dtle nmm=full @module=dtle.applier job=big_tx_1_mb-migration lenData=2.674695e+06 timestamp=2022-08-30T14:16:43.313+0800
# src: 发送_full_complete信号
2022-08-30T14:16:43.320+0800 [DEBUG] client.driver_mgr.dtle: publish: driver=dtle subject=big_tx_1_mb-migration_full_complete gno=0 job=big_tx_1_mb-migration @module=dtle.extractor iSeg=0 partLen=133 timestamp=2022-08-30T14:16:43.320+0800
# dst: 收到_full_complete
2022-08-30T14:16:43.321+0800 [DEBUG] client.driver_mgr.dtle: recv _full_complete.: driver=dtle @module=dtle.applier job=big_tx_1_mb-migration timestamp=2022-08-30T14:16:43.320+0800
# dst: nDumpEntry is not zero应该是1秒1次, 但不知道为什么, 后来没了
2022-08-30T14:16:43.321+0800 [DEBUG] client.driver_mgr.dtle: nDumpEntry is not zero, waiting: driver=dtle job=big_tx_1_mb-migration nDumpEntry=1 @module=dtle.applier timestamp=2022-08-30T14:16:43.320+0800
...
2022-08-30T14:18:57.198+0800 [DEBUG] client.driver_mgr.dtle: nDumpEntry is not zero, waiting: driver=dtle nDumpEntry=1 @module=dtle.applier job=big_tx_1_mb-migration timestamp=2022-08-30T14:18:57.198+0800
# src: err: nats: connection closed, 应该是_full_complete subject上的报错
2022-08-30T14:25:17.773+0800 [ERROR] client.driver_mgr.dtle: unexpected error on publish: driver=dtle @module=dtle.extractor err="nats: connection closed" job=big_tx_1_mb-migration timestamp=2022-08-30T14:25:17.772+0800
# src: 重启
2022-08-30T14:25:35.480+0800 [INFO]  client.driver_mgr.dtle: StartTask: driver=dtle jobName=big_tx_1_mb-migration taskName=src @module=dtle ID=f9fb762f-229d-f48e-9a7d-faf554778315/src/c2d55da7 allocID=f9fb762f-229d-f48e-9a7d-faf554778315 timestamp=2022-08-30T14:25:35.480+0800
# dst: 跟随重启
2022-08-30T14:39:57.346+0800 [ERROR] client.driver_mgr.dtle: onError: driver=dtle job=big_tx_1_mb-migration @module=dtle.applier err="DstPutNats: NatsAddr changed to wait. will restart dst" timestamp=2022-08-30T14:39:57.346+0800

问题

  • select完毕到发送消息, 时间间隔1.5h, 是否正常?
  • nDumpEntry is not zero, waiting为何不打印了?
    • waiting结束的路径有两种, shutdown或Rows copy complete, 没有这些迹象.
  • no binlogEntry for 10s的打印也中断了

ghost avatar Sep 01 '22 03:09 ghost

潜在优化点

全量select数据量大时, 分块发送.

ghost avatar Sep 01 '22 03:09 ghost

测试机内存情况

$ free -h
              total        used        free      shared  buff/cache   available
Mem:           251G        219G        4.4G         15G         28G         15G
Swap:          8.0G        6.9G        1.1G

使用一个每次分配、填充128MB, 共分配16个的go程序, 出现了分配困难. 此时该程序CPU占用100%.

2022/09/02 17:56:16 alloc. before
2022/09/02 17:56:16 make. 0
2022/09/02 17:56:17 fill. 0
2022/09/02 17:56:17 make. 1
2022/09/02 17:56:17 fill. 1
2022/09/02 17:56:17 make. 2
2022/09/02 17:56:17 fill. 2
2022/09/02 17:56:17 make. 3
2022/09/02 17:59:41 fill. 3
2022/09/02 17:59:41 make. 4

将每次分配数量下调至4MB、2MB, 则一定程度上能继续分配.

ghost avatar Sep 02 '22 10:09 ghost

411d21d2: 增量(大事务)也使用该参数进行分割

ghost avatar Sep 07 '22 08:09 ghost