[MySQL] last_record with timestamp column stored wrong timestamp.
- Embulk: 0.8.39
- embulk-input-mysql: 0.8.2
- MySQL: 5.7.19
- Server and Client timezone: JST.
- sample_data: sample.txt
Those data is JST timezone.
mysql> select * from in_mysql_test;
+------+---------------------+-------+
| c0 | c1 | c2 |
+------+---------------------+-------+
| 1 | 2017-12-01 00:00:00 | dummy |
| 2 | 2017-12-01 00:01:00 | dummy |
...
| 1440 | 2017-12-01 23:59:00 | dummy |
| 1441 | 2017-12-02 00:00:00 | dummy |
+------+---------------------+-------+
in:
type: mysql
host: 127.0.0.1
user: {{ env.mysql_user }}
password: {{ env.mysql_password }}
table: in_mysql_test
database: embulk_test
options: { useLegacyDatetimeCode: false }
incremental_columns:
- c1
incremental: true
out:
type: stdout
Execute embulk run -o diff.yml input_mysql.yml.liquid
in:
last_record: ['2017-12-01T06:00:00.000000Z']
out: {}
1439,2017-12-01 14:58:00,dummy
1440,2017-12-01 14:59:00,dummy
1441,2017-12-01 15:00:00,dummy
2017-12-07 17:31:58.364 +0900 [INFO] (0001:transaction): {done: 1 / 1, running: 0}
2017-12-07 17:31:58.370 +0900 [INFO] (main): Committed.
2017-12-07 17:31:58.371 +0900 [INFO] (main): Next config diff: {"in":{"last_record":["2017-12-01T06:00:00.000000Z"]},"out":{}}
The last_record should be 2017-12-01T15:00:00.000000Z but It set 2017-12-01T06:00:00.000000Z
And second execution log
2017-12-07 20:05:33.310 +0900 [INFO] (0018:task-0000): SQL: SELECT * FROM `in_mysql_test` WHERE ((`c1` > ?)) ORDER BY `c1`
2017-12-07 20:05:33.311 +0900 [INFO] (0018:task-0000): Parameters: ["2017-12-01T06:00:00.000000Z"]
Reference
Hello, @ganchiku
Thank you for your comment, Does this mean the following change?
It seems to work well in my environment. @hito4t What do you think this change?
diff --git a/embulk-input-mysql/src/main/java/org/embulk/input/mysql/getter/MySQLTimestampTimestampIncrementalHandler.java b/embulk-input-mysql/src/main/java/org/embulk/input/mysql/getter/MySQLTimestampTimestampIncrementalHandler.java
index f128ee1..9033c8f 100644
--- a/embulk-input-mysql/src/main/java/org/embulk/input/mysql/getter/MySQLTimestampTimestampIncrementalHandler.java
+++ b/embulk-input-mysql/src/main/java/org/embulk/input/mysql/getter/MySQLTimestampTimestampIncrementalHandler.java
@@ -22,8 +22,8 @@ public class MySQLTimestampTimestampIncrementalHandler
@Override
public org.embulk.spi.time.Timestamp utcTimestampFromSessionTime(long epochSecond, int nano)
{
- long sec = sessionTimeZone.convertLocalToUTC(epochSecond * 1000, false) / 1000;
- return org.embulk.spi.time.Timestamp.ofEpochSecond(sec, nano);
+// long sec = sessionTimeZone.convertLocalToUTC(epochSecond * 1000, false) / 1000;
+ return org.embulk.spi.time.Timestamp.ofEpochSecond(epochSecond, nano);
}
@Override
@hito4t
After applying above change, The last_record result is 2017-12-01T15:00:00.000000Z.
It expects value.
I want to verify one thing.
Does this timestamp parameter change to local timezone before query?
(Change 2017-12-01T15:00:00.000000Z -> 2017-12-02T00:00:00+0900)
2017-12-08 18:48:06.486 +0900 [INFO] (0018:task-0000): SQL: SELECT * FROM `in_mysql_test` WHERE ((`c1` > ?)) ORDER BY `c1`
2017-12-08 18:48:06.487 +0900 [INFO] (0018:task-0000): Parameters: ["2017-12-01T15:00:00.000000Z"]
Because when I execute the following SQL, It returns the wrong result. But this plugin returns no data.(correct)
SELECT * FROM `in_mysql_test` WHERE ((`c1` > '2017-12-01T15:00:00.000000Z' )) ORDER BY `c1`;
+------+---------------------+-------+
| c0 | c1 | c2 |
+------+---------------------+-------+
| 902 | 2017-12-01 15:01:00 | dummy |
| 903 | 2017-12-01 15:02:00 | dummy |
| 904 | 2017-12-01 15:03:00 | dummy |
| 905 | 2017-12-01 15:04:00 | dummy |
@hiroyuki-sato
Thank you for your investigation, and I'm sorry that I have deleted my previous comment on this issue. I am still skeptical that my configuration might be bad.
But it was follows:
Thank you for raising this issue. I'm new to Embulk, and now trying to research how to use this software.
I looked into the embulk-input-mysql, and found something strange, and would like to share it.
https://github.com/embulk/embulk-input-jdbc/blob/master/embulk-input-mysql/src/main/java/org/embulk/input/mysql/getter/MySQLTimestampTimestampIncrementalHandler.java#L23
In `utcTimestampFromSessionTime` method, if the MySQL time_zone is JST, both `sessionTimeZone.convertLocalToUTC` and `org.embulk.spi.time.Timestamp.ofEpochSecond` do UTC calculation.
If I removed `sessionTimeZone.convertLocalToUTC` from the method, it works, but I don't know if it is right fix for this problem. I also tried if MySQL time_zone is UTC, and it also works.
I can make PR for the fix, or if you have better idea, please provide how.
Thanks,
I will look into this issue more.
Thank you.
@ganchiku @hiroyuki-sato Thank you for the information!
As you wrote, we shoudn't convert timestamp in utcTimestampFromSessionTime method because the timestamp returned from the server will be already valid.
If you make PR, I'll merge.