[BUG] 0.15.1-3, mydumper, thread not stopping
Describe the bug: mydumper has a thread that runs all the time without stopping.
To Reproduce
Commands executed:
- mydumper with all the parameters: --set-names=utf8mb4 --use-savepoints --less-locking --skip-tz-utc --complete-insert --trx-consistency-only --hex-blob --skip-definer --no-check-generated-fields --disk-limits=1024:10240 --rows=10000:100000:0 -c ZSTD -v 3 -R -G -E -e -x ^(?!(test.|information_schema.|performance_schema.|sys.)) -O /data/mysqldata/scripts/tablelist.skip -L dump_20230815.log -o 20230815023001
- myloader with all the parameters
What mydumper and myloader version has been used?
mydumper v0.15.1-3, built against MySQL 5.7.42-46 with SSL support
Expected behavior
After upgrading to version 0.15.1-3, there is always a thread running every day without stopping. Today it is thread 4, and there is not much useful information in the log:
Environment (please complete the following information):
OS version: CentOS 7
MyDumper version: mydumper v0.15.1-3, built against MySQL 5.7.42-46 with SSL support
Hi @peng19832
Can you share the table structure of `jwell_oms`.`t_acct_detail`? It is the latest table that thread 4 is trying to export.
Hi Boss @davidducos, it's here:
CREATE TABLE `t_acct_detail` (
`id` varchar(32) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NOT NULL,
`account_id` varchar(128) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NOT NULL,
`create_id` bigint(20) NOT NULL,
`create_name` varchar(16) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NOT NULL,
`create_time` datetime NOT NULL,
`update_id` bigint(20) DEFAULT NULL,
`update_name` varchar(128) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`update_time` datetime DEFAULT NULL,
`status` varchar(8) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NOT NULL,
`remark` varchar(2000) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`vouch_type` varchar(8) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`vouch_id` varchar(32) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`business_no` varchar(1000) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`detail_no` varchar(32) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`acct_term` varchar(6) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`vouch_date` varchar(8) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`direct_mark` varchar(2) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`debit_loan_mark` varchar(2) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`ar_type` varchar(3) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`vendor_code` varchar(32) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`vendor_name` varchar(50) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`digest` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`acct_title_code` varchar(12) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`currency` varchar(5) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`exchage_rate` decimal(10, 0) DEFAULT NULL,
`debit_amt_f` decimal(15, 2) DEFAULT NULL,
`debit_amt` decimal(15, 2) DEFAULT NULL,
`loan_amt_f` decimal(15, 2) DEFAULT NULL,
`loan_amt` decimal(15, 2) DEFAULT NULL,
`debit_qty` decimal(15, 3) DEFAULT NULL,
`loan_qty` decimal(15, 3) DEFAULT NULL,
`measure_unit` varchar(5) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`debit_wt` decimal(15, 3) DEFAULT NULL,
`loan_wt` decimal(15, 3) DEFAULT NULL,
`each_bill_id` varchar(32) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`use_type_code` varchar(32) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`post_status` varchar(2) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`voucher_date` varchar(8) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`voucher_no` varchar(32) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`company_code` varchar(10) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`nc_business_type` varchar(32) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`profit_center` varchar(32) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`bill_origin` varchar(32) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`bill_type` varchar(12) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`bill_no` varchar(32) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`from_bill_no` varchar(32) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`contract_no` varchar(32) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`order_num` varchar(32) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`order_item_num` varchar(32) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`xs_order_num` varchar(32) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`xs_order_item_num` varchar(32) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`cg_order_num` varchar(32) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`cg_order_item_num` varchar(32) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`product_code` varchar(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`business_type` varchar(4) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`detail_sort` varchar(2) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`show_mark` varchar(12) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`uuid` varchar(12) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`define1` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`define2` varchar(24) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`define3` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`sys_type` varchar(24) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`agreement_no` varchar(32) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`project_no` varchar(32) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`use_type_level` varchar(100) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
`message_id` varchar(32) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
PRIMARY KEY (`id`) USING BTREE,
KEY `acct_detail_index01` (`bill_no`, `vouch_id`),
KEY `acct_detail_index02` (`debit_loan_mark`) USING BTREE,
KEY `acct_detail_index03` (`each_bill_id`),
KEY `acct_detail_index06` (`account_id`),
KEY `acct_detail_index08` (`vouch_type`),
KEY `acct_detail_index09` (`acct_title_code`),
KEY `acct_detail_index10` (`vendor_code`),
KEY `status` (`status`),
KEY `acct_detail_index05` (`order_num`),
KEY `index_fund_acct_detail_11` (`cg_order_num`),
KEY `acct_detail_index04` (`vouch_type`, `vouch_id`) USING BTREE,
KEY `acct_detail_index07` (`contract_no`),
KEY `use_type_code` (`use_type_code`),
KEY `acct_detail_index20` (`vouch_id`) USING BTREE
) ENGINE = InnoDB DEFAULT CHARSET = utf8mb4 COLLATE = utf8mb4_general_ci ROW_FORMAT = DYNAMIC;
Hi @peng19832,
Honestly, I'm not sure what the problem is. However, I think I found a workaround: stop using the regex to skip databases, and instead use -B to list the databases that you want to back up.
I think mydumper is trying to export some structure from the sys schema, and it might be failing without leaving a message in the log.
Please let me know if you can test what I'm proposing, as I need to understand where the issue might be.
Hi, Boss @davidducos. Of course it can be tested. You mean I should just use the -B option to export the entire jwell_oms, right? --set-names=utf8mb4 --use-savepoints --less-locking --skip-tz-utc --complete-insert --trx-consistency-only --hex-blob --skip-definer --no-check-generated-fields --disk-limits=1024:10240 --rows=10000:100000:0 -c ZSTD -v 3 -B jwell_oms -R -G -E -e -L dump_20230815.log -o 20230815023001
Hi @peng19832
According to the log, these are your databases:
2023-08-15 02:40:49 [INFO] - Thread 3: dumping db information for `database_name`
2023-08-15 02:40:49 [INFO] - Thread 4: dumping db information for `jplat`
2023-08-15 02:40:49 [INFO] - Thread 2: dumping db information for `jplat_user_center`
2023-08-15 02:40:49 [INFO] - Thread 1: dumping db information for `jwell_oms`
2023-08-15 02:40:49 [INFO] - Thread 3: dumping db information for `jwell_oms_activiti`
2023-08-15 02:40:49 [INFO] - Thread 3: dumping db information for `jwell_oms_app`
2023-08-15 02:40:49 [INFO] - Thread 3: dumping db information for `jwell_oms_base`
2023-08-15 02:40:49 [INFO] - Thread 4: dumping db information for `jwell_oms_cc`
2023-08-15 02:40:49 [INFO] - Thread 2: dumping db information for `jwell_oms_fund`
2023-08-15 02:40:50 [INFO] - Thread 3: dumping db information for `jwell_oms_gd`
2023-08-15 02:40:50 [INFO] - Thread 3: dumping db information for `jwell_oms_mdm`
2023-08-15 02:40:50 [INFO] - Thread 4: dumping db information for `jwell_oms_oca`
2023-08-15 02:40:50 [INFO] - Thread 4: dumping db information for `jwell_oms_oss`
2023-08-15 02:40:50 [INFO] - Thread 4: dumping db information for `jwell_oms_pm`
2023-08-15 02:40:50 [INFO] - Thread 2: dumping db information for `jwell_oms_sd`
2023-08-15 02:40:50 [INFO] - Thread 3: dumping db information for `jwell_oms_sign`
2023-08-15 02:40:50 [INFO] - Thread 3: dumping db information for `jwell_oms_smp`
2023-08-15 02:40:50 [INFO] - Thread 4: dumping db information for `mysql`
2023-08-15 02:40:50 [INFO] - Thread 4: dumping db information for `seata_db`
2023-08-15 02:40:50 [INFO] - Thread 4: dumping db information for `sys`
2023-08-15 02:40:50 [INFO] - Thread 2: dumping db information for `xxl_job`
You should use this parameter instead of --regex:
-B database_name,jplat,jplat_user_center,jwell_oms,jwell_oms_activiti,jwell_oms_app,jwell_oms_base,jwell_oms_cc,jwell_oms_fund,jwell_oms_gd,jwell_oms_mdm,jwell_oms_oca,jwell_oms_oss,jwell_oms_pm,jwell_oms_sd,jwell_oms_sign,jwell_oms_smp,mysql,seata_db,xxl_job
Using rpm to install 0.13.1-2 has the same problem: when the amount of data is large enough, the process does not exit, but it will exit after importing with version 0.10.1.
The myloader process does not exit.
@Jolin-blank we have the same issue with the current version; a workaround is to set the threads to 1 through CLI, then it will work.
myloader -t 1
@davidducos OK, I'll try ...
@davidducos
Hi, Boss.
Looks like it's stuck on a view?
dump_202308311547.log
Hi, boss @davidducos:
In addition, when I obtained the copy (replication) location from the metadata file and used it in a script to restore the database, I found that a lot of table row information was appended to this file, causing the script to fail. What is the purpose of appending this row information? Could it be written to a different file?
The information in the metadata file helps myloader in different ways. We are not going to remove it or move it to another file; actually, the idea is to give as much information as possible to myloader. We are already sending partial metadata files when stream is used. If you need the copy location in the metadata file, we could append it.
About the log that you sent, it seems that the threads were still running.
Can you try again and share the SHOW PROCESSLIST output?
I have the same issue with the tool as well; it hangs for some reason.
I run 6 parallel instances of:
mydumper -h $prod -u $user -p $pass -o export/ -B data -L $exportLog -v 3 -F 500 --split-partitions -t 1 --stream | myloader -h $test -u $user -p $tpass -o -B data -d import/ -L $importLog -v 3 -t 16 --max-threads-per-table 8 --stream
Three processes finished normally; the other three hung, each failing to process one of the files [I split the database into files of 500MB each].
mydumper downloads the files as follows:
The logs of myloader:
I've noticed that one file, data.__entity_body.00000.00227.sql, was not processed. When I grep the logs for this table only, I find:
- export
- import
I tried to recover the file that was left by myloader, and it seems it is corrupted. The last line of data.__entity_body.00000.00227.sql is cut off in the middle of an INSERT.
Hi @szuras can you test with the latest prerelease?
I think multithreaded myloader hang may be related to this:
void *control_job_thread(struct configuration *conf){
  ...
  while(cont){
    ft=(enum file_type)GPOINTER_TO_INT(g_async_queue_pop(refresh_db_queue));
    switch (ft){
      ...
      case THREAD:
        // g_message("Thread is asking for job");
        giveup = give_me_next_data_job_conf(conf, TRUE, &rj);
        if (rj != NULL){
          ...
        }else{
          // g_message("Thread is asking for job again");
          giveup = give_any_data_job_conf(conf, &rj);
          if (rj != NULL){
            ...
          }else{
            ...
            if (intermediate_queue_ended_local){
              if (giveup){
                ...
              }else{ /* NON_SIGNALLING_BRANCH */
                // g_message("Ignoring");
                // g_async_queue_push(here_is_your_job, GINT_TO_POINTER(IGNORED));
                // g_message("Thread waiting");
                threads_waiting=threads_waiting<num_threads?threads_waiting+1:num_threads;
              }
            }else{
              ...
            }
          }
        }
        break;
      case INTERMEDIATE_ENDED:
        ...
    } /* switch (ft) */
  } /* while (cont) */
  g_message("Sending start_innodb_optimize_keys_all_tables");
  start_innodb_optimize_keys_all_tables();
  return NULL;
} /* control_job_thread() */

void *process_loader_thread(struct thread_data * td) {
  ...
  while (cont){
    // control job threads needs to know that I'm ready to receive another job
    g_async_queue_push(refresh_db_queue, GINT_TO_POINTER(THREAD));
    ft=(enum file_type)GPOINTER_TO_INT(g_async_queue_pop(here_is_your_job));
    ...
- process_loader_thread() signals refresh_db_queue;
- control_job_thread() goes inside NON_SIGNALLING_BRANCH;
- process_loader_thread() blocks on pulling here_is_your_job;
- control_job_thread() repeats the while (cont) loop and blocks on pulling refresh_db_queue;
It's not a workaround for me: I would love to load data simultaneously in multiple threads. I've also noticed a problem with the downloaded data: wrong values in some of the files.
@szuras Why did you quote my answer? I didn't give any workarounds.
I’ve also noticed problem with downloaded data - wrong values in some downloaded files.
Yes, there are a number of known bugs. Can you share an example where the data is dumped wrongly?
When it comes to the wrong data: from time to time the exported file contains wrong data; at other times I don't see this kind of problem.
The table structure looks like:
describe __entity_body;
+----------------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+----------------+-------------+------+-----+---------+-------+
| primname | varchar(64) | NO | PRI | NULL | |
| body | mediumblob | YES | | NULL | |
| last_update | datetime | NO | MUL | NULL | |
| operator_id | int | YES | | NULL | |
| perm_schemas | text | YES | | NULL | |
| aliases | text | YES | | NULL | |
| entity_type_id | smallint | NO | | NULL | |
| state | int | NO | | 0 | |
+----------------+-------------+------+-----+---------+-------+
Some downloaded files, like data.__entity_body.00000.00227.sql, contain data as follows:
INSERT INTO `__entity_body` VALUES(0x79676531356E72706E6366616D667474687262653462,0x1F8B0800000000000000AD56______LONG BLOB VALUE______C12DB90B0000,0x323032332D30332D32302031373A30333A3532,2,0x6573746174,NULL,1,0)
The primname/last_update and other columns are wrong: they contain blob values instead of varchar, text, or datetime.
So the first column is supposed to be varchar, but the dump contains a blob instead. I would expect values like:
body: 0x1F8B0800000000000000AD______LONG BLOB VALUE______89D8CFA0A0000
last_update: 2023-04-07 22:29:39
operator_id: 2
perm_schemas: some_value
aliases: NULL
entity_type_id: 1
state: 0
@szuras Curious! Which version/revision is this?
I'm using 0.15.1-3.
@szuras some changes have been made to myloader in the latest prerelease; can you test it?
Hi @szuras,
Can you test with latest release?