datafusion
datafusion copied to clipboard
short-circuited expression should be evaluated one by one
Describe the bug
In our current implementation, the sub-expression within the short-circuited expression is always evaluated even when they don't need to be evaluated. We first evaluate all expressions recursively and then perform the corresponding operation or function.
The related code is below
the behavior of COALESCE function is wrong
https://github.com/apache/arrow-datafusion/blob/b7e13a0af711477ad41450566c14430089edd3f2/datafusion/physical-expr/src/scalar_function.rs#L152-L156
the behavior of And and Or operation is wrong
https://github.com/apache/arrow-datafusion/blob/b7e13a0af711477ad41450566c14430089edd3f2/datafusion/physical-expr/src/expressions/binary.rs#L262-L263
the behavior of CASE ... WHEN ... is correct
https://github.com/apache/arrow-datafusion/blob/b7e13a0af711477ad41450566c14430089edd3f2/datafusion/physical-expr/src/expressions/case.rs#L136-L140
To Reproduce
DataFusion CLI v34.0.0
❯ create table t(x int, y int) as values (1,1), (2,2), (3,3), (0,0), (4,0);
0 rows in set. Query took 0.004 seconds.
❯ select y > 0 and 1/y < 1 from t;
Arrow error: Divide by zero error
❯ select y = 0 or 1/y < 1 from t;
Arrow error: Divide by zero error
❯ select COALESCE(1, 1/y) from t;
Arrow error: Divide by zero error
❯ select case 1 when 2 then 1/y end from t;
+--------------------------------------------------------------------+
| CASE Int64(1) WHEN Int64(2) THEN Int64(1) / CAST(t.y AS Int64) END |
+--------------------------------------------------------------------+
| |
| |
| |
| |
| |
+--------------------------------------------------------------------+
5 rows in set. Query took 0.003 seconds.
Expected behavior
The short-circuited expressions should have the correct results. I also check in postgres
postgres=# create table t(x int, y int); insert into t values (1,1), (2,2), (3,3), (0,0), (4,0);
CREATE TABLE
INSERT 0 5
postgres=# select y > 0 and 1/y < 1 from t;
?column?
----------
f
t
t
f
f
(5 rows)
postgres=# select y = 0 or 1/y < 1 from t;
?column?
----------
f
t
t
t
t
(5 rows)
postgres=# select COALESCE(1, 1/y) from t;
coalesce
----------
1
1
1
1
1
(5 rows)
Additional context
No response
I prepare to handle it, If there is any progress, I'll record it under this issue
Hi @jackwener, I have some thoughts on #8959 about implementing short circuit functions(like coalesce), do you have a time to take a look?
I also find some very special cases when I meet the error. This is example sql:
SELECT
(
case
when SUM(col1) != 0 then SUM(col1) / SUM(col2)
else 0
end
) as metric
FROM
table
having
(
metric > 0
)
This a case when case, but the filter metric > 0 will changed to and filter in the simplify_expressions optimization rule
I also find some very special cases when I meet the error. This is example sql:
SELECT ( case when SUM(col1) != 0 then SUM(col1) / SUM(col2) else 0 end ) as metric FROM table having ( metric > 0 )This a case when case, but the filter
metric > 0will changed toandfilter in thesimplify_expressionsoptimization rule
In the simplify_expressions rule, we convert the CASE WHEN to filter but ignore the corner case.
The above plan will get the filter plan like:
filter: SUM(col1) != 0 and SUM(col1) / SUM(col2) > 0
Due to the bug: the behavior of And and Or operation is wrong https://github.com/apache/arrow-datafusion/blob/b7e13a0af711477ad41450566c14430089edd3f2/datafusion/physical-expr/src/expressions/binary.rs#L262-L263
We will get error in the runtime.
I also find error, that short-circuited logical is invalid
DataFusion CLI v39.0.0
> create table if not exists test_tb(c1 int, c2 int, c3 varchar);
0 row(s) fetched.
Elapsed 0.002 seconds.
> insert into test_tb(c1, c2, c3) values(1,0,'1'),(2,2,'2'),(3,2,'3');
+-------+
| count |
+-------+
| 3 |
+-------+
1 row(s) fetched.
Elapsed 0.003 seconds.
> select * from test_tb;
+----+----+----+
| c1 | c2 | c3 |
+----+----+----+
| 1 | 0 | 1 |
| 2 | 2 | 2 |
| 3 | 2 | 3 |
+----+----+----+
3 row(s) fetched.
Elapsed 0.002 seconds.
> select c1 != 1 and c1 / c2 = 1 from test_tb;
Arrow error: Divide by zero error
The c1 in the first row does not meet the condition c1 !=1 In theory, it is not necessary to execute the sub-expr c1 / c2 = 1
I think this idea was largely implemented by @acking-you in
- https://github.com/apache/datafusion/pull/15694
We merged a fix for this in the 51 release, but we have found as subtle problem so I am planning to revert it for 51 (the fix should appear in 52):
- https://github.com/apache/datafusion/pull/18567