datafusion
Remove workaround for `COUNT(*)` in subquery decorrelation code
Is your feature request related to a problem or challenge?
While working on https://github.com/apache/datafusion/pull/10500 I found a reference to "the count bug" in the code, but it wasn't clear whether it was tracked by any ticket.
@comphead figured out (https://github.com/apache/datafusion/pull/10500#discussion_r1603532658) that if the relevant workaround is disabled, the following query returns an incorrect result:
Running "subquery.slt"
External error: query result mismatch:
[SQL] SELECT t1_id, (SELECT count(*) FROM t2 WHERE t2.t2_int = t1.t1_int) from t1
[Diff] (-expected|+actual)
11 1
- 22 0
+ 22 NULL
33 3
- 44 0
+ 44 NULL
at test_files/subquery.slt:763
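For anyone unfamiliar with the count bug, here is a minimal sketch of the general effect. It uses SQLite through Python's `sqlite3` module, not DataFusion, and it assumes the decorrelation turns the scalar subquery into a left join against a grouped aggregate. The table contents are hypothetical, picked only so the output matches the diff above; they are not copied from `subquery.slt`. The `COALESCE` at the end is one common way to patch the result and is not necessarily what DataFusion's workaround does.

```python
import sqlite3

# Hypothetical data, chosen so that t1_int values 2 and 4 have no match in t2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (t1_id INTEGER, t1_int INTEGER);
    CREATE TABLE t2 (t2_id INTEGER, t2_int INTEGER);
    INSERT INTO t1 VALUES (11, 1), (22, 2), (33, 3), (44, 4);
    INSERT INTO t2 VALUES (11, 1), (33, 3), (34, 3), (35, 3);
""")

# Original correlated form: count(*) over an empty set is 0, never NULL.
correlated = conn.execute("""
    SELECT t1_id, (SELECT count(*) FROM t2 WHERE t2.t2_int = t1.t1_int)
    FROM t1 ORDER BY t1_id
""").fetchall()

# Naive decorrelation: aggregate t2 once, then left join.
# t1 rows without a matching group get NULL from the outer join.
naive = conn.execute("""
    SELECT t1_id, s.cnt
    FROM t1 LEFT JOIN (
        SELECT t2_int, count(*) AS cnt FROM t2 GROUP BY t2_int
    ) AS s ON s.t2_int = t1.t1_int
    ORDER BY t1_id
""").fetchall()

# One possible patch: map the join's NULL back to 0.
fixed = conn.execute("""
    SELECT t1_id, COALESCE(s.cnt, 0)
    FROM t1 LEFT JOIN (
        SELECT t2_int, count(*) AS cnt FROM t2 GROUP BY t2_int
    ) AS s ON s.t2_int = t1.t1_int
    ORDER BY t1_id
""").fetchall()

print(correlated)  # [(11, 1), (22, 0), (33, 3), (44, 0)]
print(naive)       # [(11, 1), (22, None), (33, 3), (44, None)]
print(fixed)       # [(11, 1), (22, 0), (33, 3), (44, 0)]
```

The `None` values in `naive` correspond to the `NULL` rows in the diff: the join rewrite loses the fact that `count(*)` over zero rows is 0.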
Describe the solution you'd like
Remove the workaround / handle the issue correctly
I am not quite sure what this means (maybe @mingmwang can provide more details if he has time)
Describe alternatives you've considered
No response
Additional context
No response