greptimedb
greptimedb copied to clipboard
Binary data type for table creation
What type of enhancement is this?
Other
What does the enhancement do?
Internally we have a data type Binary
for bytes, blob data. But currently it's not supported when creating table using sql:
CREATE TABLE bytestest (
host STRING,
body BINARY,
ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY(host),
TIME INDEX(ts)
);
Encountered error as:
Query engine output error: Failed to execute query: CREATE TABLE bytestest ( host STRING, body BINARY, ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP, PRIMARY KEY(host), TIME INDEX(ts) );, source: Failed to parse SQL, source: SQL data type not supported yet: Custom(ObjectName([Ident { value: "BINARY", quote_style: None }]))
Implementation challenges
No response
Current sqlparser 0.15.0
will parse the sql into DataType::Custom
, for supporting BinaryType
, we need upgrade the sqlparser
's version bigger than ~~0.26.0~~ 0.24.0
But sqlparser 0.15.0
is datafusion
's dependency and it seems not easy to upgrade it :joy:
Could we use varbinary
? Looks like sqlparser
supports this type. BTW, it's able to parse the content inside DataType::Custom
, here is an example
I think you mean we can use DataType::Custom
for now? And the sqlparser
has been updated, we can adjust the code to DataType::Binary
or Keyword::VARBINARY
.
The DataType::Binary
and Keyword::VARBINARY
are added by https://github.com/sqlparser-rs/sqlparser-rs/pull/618/files#diff-4a04259da480a6b794a2e947e4cc03eff4d1aa9330836f5b91cac68c5398193fR3145-R3148 and released in 0.24.0.
If use DataType::Custom
to do this is acceptable, I'd like to give this a try(I'm not sure I can do this :joy:)
MySQL's binary
type is a fixed-size byte string, but our binary type should support variable length. Refer to https://dev.mysql.com/doc/refman/8.0/en/binary-varbinary.html
When BINARY values are stored, they are right-padded with the pad value to the specified length. The pad value is 0x00 (the zero byte). Values are right-padded with 0x00 for inserts, and no trailing bytes are removed for retrievals.
For me, using DataType::Custom
is acceptable, but sqlparser-rs 0.15
already has varbinary
and PostgreSQL's bytea
type:
Maybe we could try bytea
first? It looks similar to our binary type.
MySQL's binary type is a fixed-size byte string, but our binary type should support variable length.
I prefer to keep Binary(size) until we support them using Arrow's FixedSizeBinaryArray. For convenience, we can add support for Binary without size which works like Varbinary. But this exceeds the scope of this PR. We discuss this in the follow up issue.
(from https://github.com/GreptimeTeam/greptimedb/pull/767#issuecomment-1362789758)