[Bug] With Stream Load partial column import, a column whose default is 'current_timestamp' is filled with the table creation time instead of the load time.
I also ran into this problem when loading data into Doris.
- Create a test table in Doris like this:
create table data_province
(
`run_date` date not null comment 'date',
data_type_id int not null comment 'data type',
data decimal(24, 8) not null comment 'data value',
create_time datetime not null default current_timestamp comment 'creation time'
)
engine = olap unique key(`run_date`,`data_type_id`)
comment "daily data table"
partition by range(`run_date`) ( )
distributed by hash(`data_type_id`) buckets 10
properties (
"storage_format" = "V2",
"enable_unique_key_merge_on_write" = "true",
"dynamic_partition.enable" = "true",
"dynamic_partition.time_unit" = "month",
"dynamic_partition.create_history_partition" = "true",
"dynamic_partition.history_partition_num" = "10",
"dynamic_partition.start" = "-6",
"dynamic_partition.end" = "3",
"dynamic_partition.prefix" = "p",
"dynamic_partition.replication_num" = "1",
"dynamic_partition.buckets" = "10"
);
- Insert data with Stream Load like this. The "create_time" field ends up as the time the table was created. However, when I insert data with INSERT INTO, everything is fine: "create_time" is the time the record was inserted.
# With Stream Load the default time is fixed to the table creation time; with INSERT INTO it is correctly the time the record was inserted
# vim /tmp/test.csv
# 2024-01-01,1,67200.00000000
curl --location-trusted -u root \
-H "partial_columns:true" \
-H "column_separator:," \
-H "columns:run_date,data_type_id,data" \
-H "two_phase_commit:false" \
-H "label:stream_load_test01" \
-T /tmp/test.csv http://127.0.0.1:8030/api/iotest/data_province/_stream_load
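The test file in the comments above is created by hand in vim. To make the reproduction easier to repeat, the same one-row file could be written from the shell instead. This is only a sketch: CSV_FILE is a variable name introduced here, not something from the report, and the path and row are the ones shown above.

```shell
#!/bin/sh
# Sketch: write the one-row CSV used by the stream load above.
# CSV_FILE is a hypothetical variable; it defaults to the path from the report.
CSV_FILE="${CSV_FILE:-/tmp/test.csv}"

# Columns: run_date,data_type_id,data
# create_time is intentionally left out so the table default is applied.
printf '%s\n' '2024-01-01,1,67200.00000000' > "$CSV_FILE"

# Show what will be uploaded.
cat "$CSV_FILE"
```

Running this before the curl command gives the same input on every attempt, so any difference in "create_time" comes from the load path and not from the file.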
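- If I set create_time explicitly in the "columns" header (create_time=current_timestamp(), as in the command below), the "create_time" field is the time the record was loaded. But when I load a new record with the same key, this field changes as well. I just want to keep the time the record was first created.
# Partial column import with an explicit expression produces the correct time, but every load of a record
# with the same key also overwrites create_time with the latest time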
curl --location-trusted -u root \
-H "partial_columns:true" \
-H "column_separator:," \
-H "columns:run_date,data_type_id,data,create_time=current_timestamp()" \
-H "two_phase_commit:false" \
-H "label:stream_load_test02" \
-T /tmp/test.csv http://127.0.0.1:8030/api/iotest/dwd_rd_data_province/_stream_load
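To compare the loads described above, the stored "create_time" can be read back after each load and checked against the table creation time and the load time. A minimal sketch follows. It assumes the mysql command-line client is installed, that the FE accepts MySQL-protocol connections on port 9030 (the usual Doris default, not stated in the report), and that root has an empty password. build_query and check_create_time are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: inspect create_time for one key after each load.
# Assumptions (not from the report): mysql client available, FE query
# port 9030, root with an empty password. Adjust for your cluster.
FE_HOST="${FE_HOST:-127.0.0.1}"
FE_QUERY_PORT="${FE_QUERY_PORT:-9030}"

# Print the SELECT used to inspect one row.
# Arguments: table name, run_date, data_type_id.
build_query() {
    printf "SELECT run_date, data_type_id, data, create_time FROM %s WHERE run_date = '%s' AND data_type_id = %s;\n" "$1" "$2" "$3"
}

# Run the query against the iotest database used in the report.
check_create_time() {
    mysql -h "$FE_HOST" -P "$FE_QUERY_PORT" -u root iotest -e "$(build_query "$@")"
}

# Example (needs a running cluster):
#   check_create_time data_province 2024-01-01 1
```

Calling check_create_time once after the first load and again after loading the same key a second time shows directly whether "create_time" was kept, overwritten, or set to the table creation time.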
Originally posted by @YS0mind in https://github.com/apache/doris-flink-connector/issues/191#issuecomment-1905753329