risingwave
risingwave copied to clipboard
SqlSmith Bug hunt: Aggregate of Subquery results in panic
There is 1 out of 512 query that show this error which previous attempt to run 512 test that doesn,t fail.
it is error occur around here. https://github.com/singularity-data/risingwave/blob/ae0388aeee470bb82b845e14df96220cf51e3d2a/src/frontend/src/expr/subquery.rs#L72 About hashing.
@xiangjinwu, Could you help me with this?
Seems hash is just not implemented for subquery. Maybe let's implement all the derive(Hash, PartialEq, Eq) for Subquery and handle detection of subquery not unnested elsewhere.
Similar to https://github.com/singularity-data/risingwave/issues/4404#issuecomment-1204725871, Hash
was purposely banned (https://github.com/singularity-data/risingwave/pull/1492#discussion_r840184919). It was not meant to be "detection of subquery not unnested", which should return error to user rather than panic.
As for Clone
, we have a reason (CTE
) to implement it as well as a way to avoid it (clone plan rather than bound query). For Hash
and Eq
, before deriving the traits, let's make sure we are clear on which situations it is needed, and whether there are better alternatives.
Here is a related minimal example. The subquery is an input to an aggregate.
create table t1 (x int);
select min((select x from t1)) from t1;
thread 'tokio-runtime-worker' panicked at 'internal error: entered unreachable code: Subquery Subquery { kind: Scalar, query: BoundQuery { body: Select(BoundSelect { distinct: false, select_items: [$1], aliases: [Some("x")], from: Some(BaseTable(BoundBaseTable { name: "t1", table_id: TableId { table_id: 1002 }, table_catalog: TableCatalog { id: TableId { table_id: 1002 }, associated_source_id: Some(TableId { table_id: 1001 }), name: "t1", columns: [ColumnCatalog { column_desc: ColumnDesc { data_type: Int64, column_id: #1000, name: "_row_id", field_descs: [], type_name: "" }, is_hidden: true }, ColumnCatalog { column_desc: ColumnDesc { data_type: Int32, column_id: #1001, name: "x", field_descs: [], type_name: "" }, is_hidden: false }], order_key: [$0 ASC], pk: [0], distribution_key: [0], is_index_on: None, appendonly: false, owner: 1, vnode_mapping: Some([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3]), properties: {}, read_pattern_prefix_column: 0 }, table_indexes: [] })), where_clause: None, group_by: [], having: None, schema: Schema { fields: [x:Int32] } }), order: [], limit: None, offset: None, extra_order_exprs: [] } } has not been unnested'
I'm not sure if https://github.com/singularity-data/risingwave/pull/4386 would help us unnest this case. Its running on a version of main before the Subquery::clone patch was applied, so its still producing the ::clone() error, but takes the same error producing code path as hash.
The code path is as follows:
41: risingwave_frontend::optimizer::plan_node::logical_project::LogicalProjectBuilder::add_expr
at ./src/frontend/src/optimizer/plan_node/logical_project.rs:47:28
42: <risingwave_frontend::optimizer::plan_node::logical_agg::LogicalAggBuilder as risingwave_frontend::expr::expr_rewriter::ExprRewriter>::rewrite_agg_call::{{closure}}
at ./src/frontend/src/optimizer/plan_node/logical_agg.rs:670:29
Originating in
risingwave_frontend::optimizer::plan_node::logical_agg::LogicalAgg::create
Btw, as a side note, @marvenlee2486 , is it possible for the tool to output the backtrace alongside the error so we know what codepath is triggering the errors?
We can run the output query starting from the definition of tables to the failing query when we compiled ./risedev d
, and then we can get the log file from frontend.
Alternatively, probably can try see if https://github.com/singularity-data/risingwave/issues/4415 are able to do if, if no then stick with the first one
I'd rather not to support subquery expr together with agg. It is far from just enabling Hash
and Eq
derivation. That being said, we should update binder/planner to report proper error rather than panic.
@xiangjinwu I think agg of subquery is more or less rewritable as a subquery with agg.
It may not always be true, however, for instance, agg of CTE, where we want to use CTE elsewhere.
I think for CTE case we'd actually be pretty ok to support this, as we never need to unnest correlated vars from CTE, and for impl, we can evaluate CTE once (batch) or as mview (stream).
Anw, I'll take a look if it's too difficult to support, if it is, I'll just emit error from bind_expr if agg expr contains_subquery
cc @jon-chuang Are you still looking at this?