tidb-lightning icon indicating copy to clipboard operation
tidb-lightning copied to clipboard

tidb-lightning insert wrong value to boolean(tinyint) type with tidb backend

Open Jingyuma opened this issue 4 years ago • 3 comments

Bug Report

Please answer these questions before submitting your issue. Thanks!

  1. What did you do? If possible, provide a recipe for reproducing the error.
create table t2, the column type is boolean.
mysql>use test;
mysql> show create table t2;
+-------+---------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                  |
+-------+---------------------------------------------------------------------------------------------------------------+
| t2    | CREATE TABLE `t2` (
  `t` tinyint(1) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin |
+-------+---------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

prepare import csv, with true value.
vi test.t2.csv
true
1
false
0

use tidb-lightning import the csv
  1. What did you expect to see?
1
1
0
0
  1. What did you see instead?
0
1
0
0
  1. Versions of the cluster

    • TiDB-Lightning version (run tidb-lightning -V):
[root@10 bin]# ./tidb-lightning -V
Release Version: v4.0.6
Git Commit Hash: 12fe8b151119becc4cac0501806504e720e4e61d
Git Branch: heads/refs/tags/v4.0.6
UTC Build Time: 2020-09-16 03:45:37
Go Version: go version go1.13 linux/amd64
- TiKV-Importer version (run `tikv-importer -V`)
        use tidb backend
- TiKV version (run `tikv-server -V`):
[root@10 bin]# ./tikv-server -V
TiKV
Release Version:   4.0.1
Edition:           Community
Git Commit Hash:   78d7a854026962669ceb2ee0ac343a5e88faa310
Git Commit Branch: heads/refs/tags/v4.0.1
UTC Build Time:    2020-06-12 09:13:05
Rust Version:      rustc 1.42.0-nightly (0de96d37f 2019-12-19)
Enable Features:   jemalloc portable sse protobuf-codec
Profile:           dist_release
- TiDB cluster version (execute `SELECT tidb_version();` in a MySQL client):
| Release Version: v4.0.1
Edition: Community
Git Commit Hash: 689a6b6439ae7835947fcaccf329a3fc303986cb
Git Branch: heads/v4.0.1
UTC Build Time: 2020-06-15 03:54:18
GoVersion: go1.13
Race Enabled: false
TiKV Min Version: v3.0.0-60965b006877ca7234adaced7890d7b029ed1306
Check Table Before Drop: false |
- Other interesting information (system version, hardware config, etc):

    >
    >
  1. Operation logs tidb-lightning.log

  2. Configuration of the cluster and the task

  3. Screenshot/exported-PDF of Grafana dashboard or metrics' graph in Prometheus for TiDB-Lightning if possible When debug the code, I found the statement is as below, but It can't be insert via mysql client. I'm not sure why it can be inserted via mysqlConn.exec.

REPLACE INTO `test`.`t2` VALUES('true');

image image

Jingyuma avatar Sep 29 '20 12:09 Jingyuma

Looks the issue is caused by the default case of switch when running tidbEncoder.appendSQL. image

Jingyuma avatar Sep 29 '20 13:09 Jingyuma

By default Lightning runs in a non-strict SQL_Mode, which would reduce those errors to only warnings. You could reproduce as:

mysql> set sql_mode = '';
Query OK, 0 rows affected (0.02 sec)

mysql> REPLACE INTO `test`.`t2` VALUES('true');
Query OK, 1 row affected, 1 warning (0.01 sec)

mysql> show warnings;
+---------+------+---------------------------------------------------------+
| Level   | Code | Message                                                 |
+---------+------+---------------------------------------------------------+
| Warning | 1366 | Incorrect integer value: 'true' for column 'a' at row 1 |
+---------+------+---------------------------------------------------------+
1 row in set (0.00 sec)

mysql> select * from test.t2;
+------+
| a    |
+------+
|    0 |
+------+
1 row in set (0.00 sec)

CSV does not distinguish between strings and keywords, unlike SQL. In CSV, true and false are always considered as the literal strings 'true' and 'false'. This is also the behavior of MySQL's LOAD DATA. So the behavior may be undesirable but is totally in line with the standard.

We may consider adding keyword specialization for true and false, similar to null, if this is a pressing need.

kennytm avatar Sep 29 '20 14:09 kennytm

Many thanks for you elaboration, learn a lot from you.

By reference to https://dba.stackexchange.com/questions/80727/converting-falsestring-to-falseboolean-during-load-data-local-infile, it propose below solution. This is just for your reference.

LOAD DATA LOCAL INFILE 'file1' INTO table n1 FIELDS TERMINATED BY ',' enclosed by '\"' LINES TERMINATED BY '\n' (@var1,c2) SET c1 = IF(@var='True',1,0);

And if tidb-lightning could't support the boolean field for csv, I suggest to give a workaround such as replace the boolean value with 0 or 1 on official user guide.

Jingyuma avatar Sep 29 '20 14:09 Jingyuma