[FEATURE] trino connector support more Iceberg partitions
Describe the feature
The Trino connector should support more Iceberg partition transforms.
Motivation
For a table created with a non-identity partition transform, Trino fails to query the data.
create table abcd(a int, b int) partitioned by (bucket(2,a)) TBLPROPERTIES ('format-version'='2', 'write.merge.mode'='merge-on-read', 'write.delete.mode'='merge-on-read');
Query 20240829_032202_00093_fk3q7 failed: class org.apache.gravitino.rel.expressions.transforms.Transforms$BucketTransform cannot be cast to class org.apache.gravitino.rel.expressions.transforms.Transform$SingleFieldTransform (org.apache.gravitino.rel.expressions.transforms.Transforms$BucketTransform and org.apache.gravitino.rel.expressions.transforms.Transform$SingleFieldTransform are in unnamed module of loader io.trino.server.PluginClassLoader @366fd3cb)
Describe the solution
No response
Additional context
No response
The table's metadata is:
{
  "code": 0,
  "table": {
    "name": "abcd",
    "columns": [
      {
        "name": "a",
        "type": "integer",
        "nullable": true,
        "autoIncrement": false
      },
      {
        "name": "b",
        "type": "integer",
        "nullable": true,
        "autoIncrement": false
      }
    ],
    "properties": {
      "owner": "root",
      "write.merge.mode": "merge-on-read",
      "current-snapshot-id": "3663013567918800433",
      "write.delete.mode": "merge-on-read",
      "provider": "iceberg",
      "write.parquet.compression-codec": "zstd",
      "format": "iceberg/parquet",
      "format-version": "2",
      "location": "hdfs://10.20.31.19:9000/user/iceberg-jdbc/warehouse/mydatabase/abcd",
      "write.distribution-mode": "none"
    },
    "audit": {
      "creator": "anonymous",
      "createTime": "2024-08-29T03:19:55.008771798Z"
    },
    "distribution": {
      "strategy": "none",
      "number": 0,
      "funcArgs": []
    },
    "sortOrders": [],
    "partitioning": [
      {
        "strategy": "bucket",
        "numBuckets": 2,
        "fieldNames": [
          [
            "a"
          ]
        ]
      }
    ],
    "indexes": []
  }
}
Trino only supports partitioning in this pattern: partitioning = ARRAY['c1', 'c2']
The table can be shown through the Iceberg connector catalog:
CREATE TABLE iceberg.mydatabase.abcd (
    a integer,
    b integer
)
WITH (
    format = 'PARQUET',
    format_version = 2,
    location = 'hdfs://10.20.31.19:9000/user/iceberg-jdbc/warehouse/mydatabase/abcd',
    partitioning = ARRAY['bucket(a, 2)']
)
@mchades @yuqi1129 How can we solve the problem of getting the Transform expression parser to handle the string bucket(a, 2) and Transform?
At the very least, we need to fix the query failure.
@mchades @yuqi1129 How can we solve the problem of the Transform expression parser to handle the string bucket(a, 2) and Transform
The Trino Iceberg connector supports this, so I think you can use its code as a reference. Here is the doc: https://trino.io/docs/current/connector/iceberg.html#partitioned-tables
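As a starting point for the parser question above, here is a minimal sketch of turning a Trino-style partitioning string such as bucket(a, 2) into a structured value, and printing it back in the same form. It is not the Gravitino or Trino implementation: PartitionFieldParser and ParsedField are hypothetical names invented for illustration, and the sketch assumes unquoted column names and only the transform names listed in the Trino Iceberg documentation (bucket, truncate, year, month, day, hour) plus a bare column for identity partitioning.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Gravitino or Trino.
final class PartitionFieldParser {

    // Hypothetical value type: transform name, source column, optional integer argument.
    static final class ParsedField {
        final String transform; // "identity", "bucket", "truncate", "year", ...
        final String column;
        final Integer width;    // bucket count or truncate width; null otherwise

        ParsedField(String transform, String column, Integer width) {
            this.transform = transform;
            this.column = column;
            this.width = width;
        }

        // Renders the field back in the Trino property form, e.g. "bucket(a, 2)".
        @Override
        public String toString() {
            if (transform.equals("identity")) {
                return column;
            }
            return width == null
                    ? transform + "(" + column + ")"
                    : transform + "(" + column + ", " + width + ")";
        }
    }

    // Matches name(column) or name(column, N); quoted identifiers are not handled.
    private static final Pattern FUNCTION =
            Pattern.compile("^(\\w+)\\s*\\(\\s*(\\w+)\\s*(?:,\\s*(\\d+)\\s*)?\\)$");
    // A bare column name means identity partitioning.
    private static final Pattern IDENTITY = Pattern.compile("^\\w+$");

    static ParsedField parse(String expr) {
        String s = expr.trim();
        if (IDENTITY.matcher(s).matches()) {
            return new ParsedField("identity", s, null);
        }
        Matcher m = FUNCTION.matcher(s);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unsupported partition expression: " + expr);
        }
        String name = m.group(1).toLowerCase(Locale.ROOT);
        String column = m.group(2);
        String arg = m.group(3);
        switch (name) {
            case "bucket":
            case "truncate":
                if (arg == null) {
                    throw new IllegalArgumentException(name + " needs an integer argument: " + expr);
                }
                return new ParsedField(name, column, Integer.valueOf(arg));
            case "year":
            case "month":
            case "day":
            case "hour":
                if (arg != null) {
                    throw new IllegalArgumentException(name + " takes no integer argument: " + expr);
                }
                return new ParsedField(name, column, null);
            default:
                throw new IllegalArgumentException("Unknown transform: " + name);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("bucket(a, 2)"));
        System.out.println(parse("c1"));
    }
}
```

A real fix would map the parsed value onto Gravitino's own transform classes instead of this stand-in type, and would dispatch on the transform kind rather than casting every transform to a single-field one, which is what the ClassCastException in the issue points to.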