Remove workaround for `COUNT(*)` in subquery decorrelation code

Open alamb opened this issue 1 year ago • 0 comments

Is your feature request related to a problem or challenge?

While working on https://github.com/apache/datafusion/pull/10500 I found a reference to "the count" bug in the code, but it wasn't clear whether it was tracked by any ticket.

@comphead figured out (https://github.com/apache/datafusion/pull/10500#discussion_r1603532658) that if the relevant workaround is disabled, the following query returns incorrect results:

Running "subquery.slt"
External error: query result mismatch:
[SQL] SELECT t1_id, (SELECT count(*) FROM t2 WHERE t2.t2_int = t1.t1_int) from t1
[Diff] (-expected|+actual)
    11 1
-   22 0
+   22 NULL
    33 3
-   44 0
+   44 NULL
at test_files/subquery.slt:763
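
For readers unfamiliar with the term, here is a minimal sketch of how a "count bug" of this shape can arise in general. It is not DataFusion's code: it uses Python's built-in `sqlite3` module, and the table contents are assumed values chosen only so that the correlated query reproduces the expected column from the diff above. The correlated subquery returns 0 for rows with no match, while a naive rewrite into a `LEFT JOIN` over a pre-aggregated subquery returns NULL for the same rows, because no group exists for them on the right side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed sample data (not taken from the issue text).
conn.executescript("""
CREATE TABLE t1 (t1_id INTEGER, t1_int INTEGER);
CREATE TABLE t2 (t2_id INTEGER, t2_int INTEGER);
INSERT INTO t1 VALUES (11, 1), (22, 2), (33, 3), (44, 4);
INSERT INTO t2 VALUES (11, 3), (22, 1), (44, 3), (55, 3);
""")

# The query from the issue: count(*) over an empty set is 0, never NULL.
correlated = conn.execute("""
SELECT t1_id, (SELECT count(*) FROM t2 WHERE t2.t2_int = t1.t1_int)
FROM t1 ORDER BY t1_id
""").fetchall()

# Naive decorrelation: aggregate first, then LEFT JOIN.
# Unmatched t1 rows have no group to join to, so cnt comes back as NULL.
naive = conn.execute("""
SELECT t1.t1_id, agg.cnt
FROM t1
LEFT JOIN (SELECT t2_int, count(*) AS cnt FROM t2 GROUP BY t2_int) AS agg
  ON agg.t2_int = t1.t1_int
ORDER BY t1.t1_id
""").fetchall()

# One common compensation: map the NULL from the outer join back to 0.
compensated = conn.execute("""
SELECT t1.t1_id, COALESCE(agg.cnt, 0)
FROM t1
LEFT JOIN (SELECT t2_int, count(*) AS cnt FROM t2 GROUP BY t2_int) AS agg
  ON agg.t2_int = t1.t1_int
ORDER BY t1.t1_id
""").fetchall()

print(correlated)
print(naive)
print(compensated)
```

The `COALESCE` form is only one generic way to restore the expected zeros; whether the existing workaround in DataFusion takes this form is not established here.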

Describe the solution you'd like

Remove the workaround / handle the issue correctly

I am not quite sure what this means (maybe @mingmwang can provide more details if he has time)

Describe alternatives you've considered

No response

Additional context

No response

alamb, May 17 '24 00:05