horaedb
horaedb copied to clipboard
Column order is not what user define when create table
Describe this problem
Table' columns order is not what user define when create table
Steps to reproduce
Execute those SQL
CREATE TABLE test_order (
host string tag,
v2 int,
ts timestamp NOT NULL,
v1 int,
timestamp KEY (ts)
)ENGINE = Analytic WITH (
enable_ttl = 'false'
);
desc test_order;
desc will output
ts timestamp 1 0 0
tsid uint64 1 0 0
host string 0 1 1
v1 int 0 1 1
v2 int 0 1 0
Expected behavior
The column order should be same with the order when created.
Additional Information
https://github.com/CeresDB/ceresdb/blob/3e826ffbb98d2c5351e63b28dd9dd4644e7d4c1e/sql/src/planner.rs#L265
The root cause is currently we use a BTreeMap when build create table plan, it should use a vector to keep the ordering
ref #154
After some thoughts, I find to fix this issue is a non-trivial task.
Currently CeresDB will put primary key columns at the front, and use a num_key_columns in Schema to indicate how many columns are in primary key, so if we keep the original column ordering, then we need more info to know which column belongs to primary key.
Can we reorder fields by subscripting the fields? like
https://github.com/CeresDB/ceresdb/blob/5c14f0055fb6816793b81bbc50c61b12b3079d5c/server/src/handlers/sql.rs#L201-L207
We can organize columns like what you said, but the more tricky issue is the order is used for primary key identification.
The first num_key_columns columns are primary key, if we keep the original order, then we need to keep some info to tell which columns are primary key
This looks like the BTree needs to be removed, will it infect the storage layer?
will it infect the storage layer?
I think it will.
The rows within one table will be ordered by primary key. For example, the key is composed of primary key + sequence in memtable
- https://github.com/CeresDB/ceresdb/blob/a74eb6b5cbb89ef03cf7cf42a464d584bdd0b744/analytic_engine/src/memtable/skiplist/mod.rs#L84
- https://github.com/CeresDB/ceresdb/blob/a74eb6b5cbb89ef03cf7cf42a464d584bdd0b744/analytic_engine/src/memtable/key.rs#L123
If we want to fix this, we can remove num_key_columns() method in schema, and replace all its usage with key_columns().
FYI, num_key_columns is only used in a few places, so this change may not be very large.

maybe I can try to fix this, can it be assigned to me?
Sure, go ahead.