Insert into ClickHouse table with DateTime64 data type using character type time, the value is 8 hours more than the original value
Issue description
Hello!
Describe the unexpected behaviour When the time data of timestamp(n) or datetime(n) is converted into character format using Golang , inserted into DateTime64(n) of ClickHouse, the inserted data is 8 hours longer,n>0。
ClickHouse timezone is Asia/Shanghai。 ClickHouse Version is 22.6.2.12 and 22.6.3.35,they are upgraded from 19.x。
What's causing this?
Example code
import (
"context"
"fmt"
_ "github.com/ClickHouse/clickhouse-go"
"github.com/jmoiron/sqlx"
"log"
"time"
)
............
tx, err := allclient.Begin()
if err != nil {
log.Println("xxx", err)
}
log.Println("start Prepare")
stmt, err := tx.Prepare("INSERT INTO test(id,cretime,cretime1,cretime2,cretime3) VALUES (?,?,?,?,?)")
if err != nil {
log.Println("xxx2", err)
}
if _, err := stmt.Exec(
23,
"2022-07-20 17:42:48",
"2022-07-20 17:42:48",
"2022-07-20 17:42:48.129",
"2022-07-20 17:42:48.129",
); err != nil {
log.Fatal("2", err)
}
if err := tx.Commit(); err != nil {
log.Fatal("3", err)
}
Configuration
Driver version: <=1.5.4
Go version: 1.18 or 1.16
ClickHouse Server version: 22.6.3.35
Table Structure:
CREATE TABLE test
(
`id` Int64,
`cretime` Nullable(DateTime),
`cretime1` Nullable(DateTime),
`cretime2` Nullable(DateTime64(3)),
`cretime3` Nullable(DateTime64(3)),
)
ENGINE = ReplacingMergeTree
ORDER BY id
SETTINGS index_granularity = 8192;
The result:
[root@root ~]# clickhouse-client --user test --password
ClickHouse client version 22.6.3.35 (official build).
Password for user (test):
Connecting to localhost:9000 as user test.
Connected to ClickHouse server version 22.6.3 revision 54455.
root :) use test;
USE test
Query id: ba395487-6ee0-4f65-8ecf-aa01162a0f4f
Ok.
0 rows in set. Elapsed: 0.007 sec.
root :) select * from test where id = 23;
SELECT *
FROM test
WHERE id = 23
Query id: bcb6a4f8-8f52-47ac-b1ce-4b98a99cd539
┌─id─┬─────────────cretime─┬────────────cretime1─┬────────────────cretime2─┬────────────────cretime3─┐
│ 23 │ 2022-07-20 17:42:48 │ 2022-07-20 17:42:48 │ 2022-07-21 01:42:48.129 │ 2022-07-21 01:42:48.129 │
└────┴─────────────────────┴─────────────────────┴─────────────────────────┴─────────────────────────┘
1 row in set. Elapsed: 0.012 sec.
The now() and timezone() result:
root:) select now(),timezone();
SELECT
now(),
timezone()
Query id: 36ab82ce-5048-4a90-814c-22f8efaf024d
┌───────────────now()─┬─timezone()────┐
│ 2022-07-21 14:44:48 │ Asia/Shanghai │
└─────────────────────┴───────────────┘
1 row in set. Elapsed: 0.005 sec.
The timezone in config is:
[root@root ~]# cat /etc/clickhouse-server/config.xml|grep timezone
<timezone>Asia/Shanghai</timezone>
The server timezone:
[root@root~]# timedatectl
Local time: Thu 2022-07-21 14:46:27 CST
Universal time: Thu 2022-07-21 06:46:27 UTC
RTC time: Thu 2022-07-21 06:46:27
Time zone: Asia/Shanghai (CST, +0800)
NTP enabled: no
NTP synchronized: yes
RTC in local TZ: no
DST active: n/a
I'll put a test together for this to illustrate but I don't think clickhouse-go is considering the Locale of the column itself if its not explicitly set in the column defn.
From https://clickhouse.com/docs/en/sql-reference/data-types/datetime
You can explicitly set a time zone for DateTime-type columns when creating a table. Example: DateTime('UTC'). If the time zone isn’t set, ClickHouse uses the value of the timezone parameter in the server settings or the operating system settings at the moment of the ClickHouse server start.
Now if you set this in the column I suspect the behaviour will be as expected. I'll look if we can possibly read the tz here.
Ok, behaviour varies but v2 is correct I believe. v1 is broken but not as the issue suggests. This indeed seems to differ from ClickHouse-client.
The driver is parsing the string and sending as an epoch uint64. During encoding locale is not considered. The locale is either set on the server or in the column. In the following example, I have <timezone>Asia/Shanghai</timezone> on the server.
ClickHouse Version: 22.6.1.1985
Schema
CREATE TABLE test_date (
Id Int64,
Col1 DateTime,
Col2 DateTime('UTC'),
Col3 DateTime64(3),
Col4 DateTime64(3, 'UTC')
) Engine Memory
v1
func Test690(t *testing.T) {
conn, err := sql.Open("clickhouse", "clickhouse://127.0.0.1:9000")
require.NoError(t, err)
const ddl = `
CREATE TABLE test_date (
Id Int64,
Col1 DateTime,
Col2 DateTime('UTC'),
Col3 DateTime64(3),
Col4 DateTime64(3, 'UTC')
) Engine Memory
`
conn.Exec("DROP TABLE test_date")
_, err = conn.Exec(ddl)
require.NoError(t, err)
scope, err := conn.Begin()
require.NoError(t, err)
batch, err := scope.Prepare("INSERT INTO test_date")
require.NoError(t, err)
_, err = batch.Exec(
int64(23),
"2022-07-20 17:42:48.129",
"2022-07-20 17:42:48.129",
"2022-07-20 17:42:48.129",
"2022-07-20 17:42:48.129",
)
require.NoError(t, err)
require.NoError(t, scope.Commit())
}
DateTime64 is parsed with the pattern 2006-01-02 15:04:05.999 and it converts the uint64 epoch independent of the timezone. DateTime uses 2006-01-02 15:04:05 but critically it assumes Local timezone i.e. the client's timezone.
My local time is -1 UTC.
Reading this data with ClickHouse client, I get
SELECT *
FROM test_date
Query id: 57129d1c-e666-455e-b49c-8dccdd730204
┌─Id─┬────────────────Col1─┬────────────────Col2─┬────────────────────Col3─┬────────────────────Col4─┐
│ 23 │ 2022-07-21 00:42:48 │ 2022-07-20 16:42:48 │ 2022-07-21 01:42:48.129 │ 2022-07-20 17:42:48.129 │
└────┴─────────────────────┴─────────────────────┴─────────────────────────┴─────────────────────────┘
Col4 - is correct. This spec says epoch internally and its rendered according to UTC - the value of 1658338968129
Col3 - It is using the server's timezone and rendering in Shanhai time. i think this is also correct.
Col2 - The bug in adjusting to local has caused the date to be inserted incorrectly. Hence we loose an hr.
Col1 - same as Col2 - server is adjusting but insertion lost an hr.
v2
v2 doesn't support sending a string for DateTime yet. We need to close this gap. Insert code is therefore simpler.
func Test690(t *testing.T) {
conn, err := sql.Open("clickhouse", "clickhouse://127.0.0.1:9000")
require.NoError(t, err)
if err := std.CheckMinServerVersion(conn, 21, 9, 0); err != nil {
t.Skip(err.Error())
return
}
const ddl = `
CREATE TABLE test_date (
Id Int64,
Col3 DateTime64(3),
Col4 DateTime64(3, 'UTC')
) Engine Memory
`
conn.Exec("DROP TABLE test_date")
_, err = conn.Exec(ddl)
require.NoError(t, err)
scope, err := conn.Begin()
require.NoError(t, err)
batch, err := scope.Prepare("INSERT INTO test_date")
require.NoError(t, err)
_, err = batch.Exec(
int64(23),
"2022-07-20 17:42:48.129",
"2022-07-20 17:42:48.129",
)
require.NoError(t, err)
require.NoError(t, scope.Commit())
}
Behaviour at insert time for DateTime64 is identical for v2 and v1.
From clickhouse-client,
SELECT *
FROM test_date
Query id: 82ef27b8-daf1-4542-b840-b9cd798f228e
┌─Id─┬────────────────────Col3─┬────────────────────Col4─┐
│ 23 │ 2022-07-21 01:42:48.129 │ 2022-07-20 17:42:48.129 │
└────┴─────────────────────────┴─────────────────────────┘
1 row in set. Elapsed: 0.050 sec.
Both of these are correct.
ClickHouse Client
DROP TABLE IF EXISTS test_date;
CREATE TABLE test_date
(
Id Int64,
Col1 DateTime,
Col2 DateTime('UTC'),
Col3 DateTime64(3),
Col4 DateTime64(3, 'UTC')
)
ENGINE = Memory;
INSERT INTO test_date VALUES (23, '2022-07-20 17:42:48', '2022-07-20 17:42:48', '2022-07-20 17:42:48.129', '2022-07-20 17:42:48.129');
SELECT * FROM test_date;
clickhouse :) SELECT * FROM test_date;
SELECT *
FROM test_date
Query id: cd0b1b5f-051d-4171-b80e-16d13fca270d
┌─Id─┬────────────────Col1─┬────────────────Col2─┬────────────────────Col3─┬────────────────────Col4─┐
│ 23 │ 2022-07-20 17:42:48 │ 2022-07-20 17:42:48 │ 2022-07-20 17:42:48.129 │ 2022-07-20 17:42:48.129 │
└────┴─────────────────────┴─────────────────────┴─────────────────────────┴─────────────────────────┘
1 row in set. Elapsed: 0.021 sec.
This seems incorrect to me and I've reached out to the clickhouse-client team to understand the differences here.
Summary
- v1 - is take the local timezone of the client at insert time. This feels like a bug. I'm assuming @zhoufuke your client was on UTC.
- v2 - this is correct, I think.
What time do you expect when specifying 2022-07-20 17:42:48 on insert?
- Server time zone
- Client local time zone
- UTC
What time do you expect when specifying
2022-07-20 17:42:48on insert?
- Server time zone
- Client local time zone
- UTC
I want the Client local time zone.
Thanks.
- My client tz is UTC+8
- I think adding UTC on the column is just a temporary fix. Also, I think the data is presented in the time zone of UTC, which is wrong data.
So currently,
2022-07-20 17:42:48 is inserted as 1658338968129 i.e. UTC. When you read it back, it's rendered in UTC+8. We have a few options here at insert time:
- Use the tz of the string or time.Time{}
- Use the column tz
- Use the server timezone
- Assume UTC - this is what we do now.
- Use the local tz of the client.
Wrt to rendering - we have more options:
- Render according to column tz if available - we do this now.
- Render according to local tz - we do this if (1) isn't available.
- Use the server timezone
We need document and agree defaults and fall backs.
This is further complicated by the fact column TZ info is not returned over HTTP even if using the native format (and the column has a tz) https://github.com/ClickHouse/ClickHouse/issues/38209
Thanks. Currently, using a string column to solve the problem.
@zhoufuke we discussed and agreed on a specification for all clients - https://github.com/ClickHouse/clickhouse-go/issues/719
In summary, we will revert to the tz of the client - consistent with your desired behaviour. Thanks for the patience and for raising the issue - which prompted our discussions.
ok,thanks.