Regression in reading Iceberg v2 table since v3.2.2
Querying a table in an external Iceberg/unified catalog in v3.2.6 fails with a nondescript NullPointerException (see the stack trace below).
I tried tracking down the issue by diffing v3.2.6 against v3.2.2, and I think this change is the cause: https://github.com/StarRocks/starrocks/commit/9ab128dcbfa3e464181374b6e493d0e91c62937c#diff-62955f4651c70b99d1f502bc07103c278a9d80ef7bfb9eba6fa90cafb8d943fbL155-R158
Note that running SHOW CREATE TABLE hive_catalog.db.table in StarRocks gives:
CREATE TABLE `table` (
`id` int(11) DEFAULT NULL,
`countrycode` varchar(1048576) DEFAULT NULL,
`department` varchar(1048576) DEFAULT NULL,
`name` varchar(1048576) DEFAULT NULL,
`postbox` varchar(1048576) DEFAULT NULL,
`postcode` varchar(1048576) DEFAULT NULL,
`searchstring` varchar(1048576) DEFAULT NULL,
`street` varchar(1048576) DEFAULT NULL,
`town` varchar(1048576) DEFAULT NULL,
`latitude` double DEFAULT NULL,
`longitude` double DEFAULT NULL,
`street2` varchar(1048576) DEFAULT NULL,
`province` varchar(1048576) DEFAULT NULL,
`_cdc` struct<op varchar(1048576), ts datetime, offset bigint(20), source varchar(1048576), target varchar(1048576), key struct<id int(11)>> DEFAULT NULL
)
PARTITION BY ( )
PROPERTIES ("location" = "s3a://hive_catalog/db/table");
While doing the same in Trino gives:
CREATE TABLE hive_catalog.db.table (
id integer NOT NULL,
countrycode varchar NOT NULL,
department varchar,
name varchar NOT NULL,
postbox varchar,
postcode varchar,
searchstring varchar NOT NULL,
street varchar,
town varchar,
latitude double,
longitude double,
street2 varchar,
province varchar,
_cdc ROW(op varchar, ts timestamp(6) with time zone, offset bigint, source varchar, target varchar, key ROW(id integer)) NOT NULL
)
WITH (
format = 'PARQUET',
format_version = 2,
location = 's3a://hive_catalog/db/table',
partitioning = ARRAY['day("_cdc.ts")']
)
Note especially the empty PARTITION BY ( ) in the StarRocks output, where Trino reports partitioning = ARRAY['day("_cdc.ts")']. I think this is the issue.
Real behavior (Required)
An error:
2024-05-17 21:39:52.730Z WARN (starrocks-mysql-nio-pool-9|10602) [StmtExecutor.execute():708] execute Exception, sql /* ApplicationName=DBeaver 24.0.4 - SQLEditor */ select id from hive_catalog.db.table
LIMIT 0, 200
java.lang.NullPointerException: null
at com.starrocks.sql.optimizer.operator.logical.LogicalIcebergScanOperator.lambda$new$0(LogicalIcebergScanOperator.java:49) ~[starrocks-fe.jar:?]
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195) ~[?:?]
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655) ~[?:?]
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484) ~[?:?]
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474) ~[?:?]
at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913) ~[?:?]
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:?]
at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578) ~[?:?]
at com.starrocks.sql.optimizer.operator.logical.LogicalIcebergScanOperator.<init>(LogicalIcebergScanOperator.java:49) ~[starrocks-fe.jar:?]
at com.starrocks.sql.optimizer.transformer.RelationTransformer.visitTable(RelationTransformer.java:562) ~[starrocks-fe.jar:?]
at com.starrocks.sql.optimizer.transformer.RelationTransformer.visitTable(RelationTransformer.java:144) ~[starrocks-fe.jar:?]
at com.starrocks.sql.ast.TableRelation.accept(TableRelation.java:182) ~[starrocks-fe.jar:?]
at com.starrocks.sql.ast.AstVisitor.visit(AstVisitor.java:68) ~[starrocks-fe.jar:?]
at com.starrocks.sql.ast.AstVisitor.visit(AstVisitor.java:64) ~[starrocks-fe.jar:?]
at com.starrocks.sql.optimizer.transformer.QueryTransformer.planFrom(QueryTransformer.java:172) ~[starrocks-fe.jar:?]
at com.starrocks.sql.optimizer.transformer.QueryTransformer.plan(QueryTransformer.java:87) ~[starrocks-fe.jar:?]
at com.starrocks.sql.optimizer.transformer.RelationTransformer.visitSelect(RelationTransformer.java:263) ~[starrocks-fe.jar:?]
at com.starrocks.sql.optimizer.transformer.RelationTransformer.visitSelect(RelationTransformer.java:144) ~[starrocks-fe.jar:?]
at com.starrocks.sql.ast.SelectRelation.accept(SelectRelation.java:242) ~[starrocks-fe.jar:?]
at com.starrocks.sql.ast.AstVisitor.visit(AstVisitor.java:68) ~[starrocks-fe.jar:?]
at com.starrocks.sql.ast.AstVisitor.visit(AstVisitor.java:64) ~[starrocks-fe.jar:?]
at com.starrocks.sql.optimizer.transformer.RelationTransformer.transform(RelationTransformer.java:213) ~[starrocks-fe.jar:?]
at com.starrocks.sql.optimizer.transformer.RelationTransformer.transformWithSelectLimit(RelationTransformer.java:181) ~[starrocks-fe.jar:?]
at com.starrocks.sql.StatementPlanner.createQueryPlan(StatementPlanner.java:194) ~[starrocks-fe.jar:?]
at com.starrocks.sql.StatementPlanner.plan(StatementPlanner.java:134) ~[starrocks-fe.jar:?]
at com.starrocks.sql.StatementPlanner.plan(StatementPlanner.java:91) ~[starrocks-fe.jar:?]
at com.starrocks.qe.StmtExecutor.execute(StmtExecutor.java:520) ~[starrocks-fe.jar:?]
at com.starrocks.qe.ConnectProcessor.handleQuery(ConnectProcessor.java:413) ~[starrocks-fe.jar:?]
at com.starrocks.qe.ConnectProcessor.dispatch(ConnectProcessor.java:607) ~[starrocks-fe.jar:?]
at com.starrocks.qe.ConnectProcessor.processOnce(ConnectProcessor.java:901) ~[starrocks-fe.jar:?]
at com.starrocks.mysql.nio.ReadListener.lambda$handleEvent$0(ReadListener.java:69) ~[starrocks-fe.jar:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
at java.lang.Thread.run(Thread.java:829) ~[?:?]
StarRocks version (Required)
3.2.6
Further investigation suggests this is caused by partitioning our Iceberg tables on a field nested inside a struct (ROW) column, as with day("_cdc.ts") in the Trino output above.
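To illustrate the suspected failure mode, here is a minimal, self-contained sketch. It is not the StarRocks code: the Column class, the helper names, and the abridged column list are all hypothetical. It assumes that the partition source name ("_cdc.ts") is looked up among top-level columns only, so the lookup returns null and the lambda inside the stream dereferences it. That would be consistent with the NullPointerException thrown from a lambda during collect in the trace above, but I have not confirmed that this is what LogicalIcebergScanOperator actually does.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

class NestedPartitionLookupSketch {
    // Hypothetical stand-in for a table column; NOT a StarRocks class.
    static final class Column {
        private final String name;
        Column(String name) { this.name = name; }
        String getName() { return name; }
    }

    // Top-level columns keyed by name (abridged from the schema in this issue).
    // The nested field "_cdc.ts" is deliberately absent: only "_cdc" is top-level.
    static Map<String, Column> topLevelColumns() {
        Map<String, Column> columns = new LinkedHashMap<>();
        for (String name : List.of("id", "countrycode", "_cdc")) {
            columns.put(name, new Column(name));
        }
        return columns;
    }

    // Unguarded lookup: Map.get returns null for "_cdc.ts", so getName()
    // throws NullPointerException from inside the stream lambda.
    static List<String> resolveUnguarded(Map<String, Column> columns, List<String> partitionSources) {
        return partitionSources.stream()
                .map(source -> columns.get(source).getName())
                .collect(Collectors.toList());
    }

    // Guarded lookup: drops sources that do not resolve to a top-level column.
    // Whether skipping is the right behavior for a real fix is an open question.
    static List<String> resolveGuarded(Map<String, Column> columns, List<String> partitionSources) {
        return partitionSources.stream()
                .map(columns::get)
                .filter(Objects::nonNull)
                .map(Column::getName)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Column> columns = topLevelColumns();
        List<String> partitionSources = List.of("_cdc.ts");
        try {
            resolveUnguarded(columns, partitionSources);
        } catch (NullPointerException e) {
            System.out.println("Unguarded lookup threw NullPointerException");
        }
        System.out.println("Guarded lookup resolved: " + resolveGuarded(columns, partitionSources));
    }
}
```

If the real cause is along these lines, an empty PARTITION BY ( ) in the SHOW CREATE TABLE output would also be plausible, since no partition column would resolve for the nested field.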