greptimedb icon indicating copy to clipboard operation
greptimedb copied to clipboard

Binary data type for table creation

Open sunng87 opened this issue 2 years ago • 4 comments

What type of enhancement is this?

Other

What does the enhancement do?

Internally we have a data type Binary for bytes, blob data. But currently it's not supported when creating table using sql:

CREATE TABLE bytestest (
    host STRING,
    body BINARY,
    ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY(host),
    TIME INDEX(ts)
);

Encountered error as:

Query engine output error: Failed to execute query: CREATE TABLE bytestest ( host STRING, body BINARY, ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP, PRIMARY KEY(host), TIME INDEX(ts) );, source: Failed to parse SQL, source: SQL data type not supported yet: Custom(ObjectName([Ident { value: "BINARY", quote_style: None }]))

Implementation challenges

No response

sunng87 avatar Dec 06 '22 03:12 sunng87

Current sqlparser 0.15.0 will parse the sql into DataType::Custom, for supporting BinaryType, we need upgrade the sqlparser's version bigger than ~~0.26.0~~ 0.24.0

But sqlparser 0.15.0 is datafusion's dependency and it seems not easy to upgrade it :joy:

yfractal avatar Dec 11 '22 15:12 yfractal

Could we use varbinary? Looks like sqlparser supports this type. BTW, it's able to parse the content inside DataType::Custom, here is an example

evenyag avatar Dec 12 '22 03:12 evenyag

I think you mean we can use DataType::Custom for now? And the sqlparser has been updated, we can adjust the code to DataType::Binary or Keyword::VARBINARY.

The DataType::Binary and Keyword::VARBINARY are added by https://github.com/sqlparser-rs/sqlparser-rs/pull/618/files#diff-4a04259da480a6b794a2e947e4cc03eff4d1aa9330836f5b91cac68c5398193fR3145-R3148 and released in 0.24.0.

If use DataType::Custom to do this is acceptable, I'd like to give this a try(I'm not sure I can do this :joy:)

yfractal avatar Dec 12 '22 13:12 yfractal

MySQL's binary type is a fixed-size byte string, but our binary type should support variable length. Refer to https://dev.mysql.com/doc/refman/8.0/en/binary-varbinary.html

When BINARY values are stored, they are right-padded with the pad value to the specified length. The pad value is 0x00 (the zero byte). Values are right-padded with 0x00 for inserts, and no trailing bytes are removed for retrievals.

For me, using DataType::Custom is acceptable, but sqlparser-rs 0.15 already has varbinary and PostgreSQL's bytea type:

Maybe we could try bytea first? It looks similar to our binary type.

evenyag avatar Dec 13 '22 03:12 evenyag

MySQL's binary type is a fixed-size byte string, but our binary type should support variable length.

I prefer to keep Binary(size) until we support them using Arrow's FixedSizeBinaryArray. For convenience, we can add support for Binary without size which works like Varbinary. But this exceeds the scope of this PR. We discuss this in the follow up issue.

(from https://github.com/GreptimeTeam/greptimedb/pull/767#issuecomment-1362789758)

waynexia avatar Dec 26 '22 07:12 waynexia