datafusion icon indicating copy to clipboard operation
datafusion copied to clipboard

Add test to verify issue #9161

Open jonahgao opened this issue 1 year ago • 1 comments

Which issue does this PR close?

Closes #9161.

Rationale for this change

In issue #9161, the expression cast(a as int) in the select list generated a unqualified column named 't.a', which was then incorrectly referenced by the group by clause, leading to a schema error.

After merging PR #9228, the group by clause will now preferentially reference the column of the same name from the input, rather than the one in the select list, thereby resolving the issue.

I suspect there will still be similar issues, but we may need to find a new case to verify whether they truly exist.

What changes are included in this PR?

Add a test.

Are these changes tested?

Yes

Are there any user-facing changes?

No

jonahgao avatar Feb 19 '24 02:02 jonahgao

Thanks @jonahgao I still don't get how this PR closes the issue 🤔

Previously, to parse column references in the 'GROUP BY' expression, we would sequentially search for the column names in the output schema of the select list and then in the input plan.

However, the cast expression generated an output column named "t.a", which has the same name as the column reference in the group by, thus resulting in an incorrect match, which should not happen.

After PR #9228, we first searched from the input plan and then obtained the correct match, which is the column a of table t.

I think there are still bugs concerning the naming of cast expressions and the planning of 'GROUP BY' clauses. But in this case, they cannot be reproduced. I need to find a new one.

jonahgao avatar Feb 20 '24 02:02 jonahgao

Thank you @jonahgao -- sorry for the delay in reviewing

I think this PR is nice that it shows a problem doesn't exist and ensures it won't start causing problems in the future

Thank you for the review! ❤️

jonahgao avatar Feb 28 '24 12:02 jonahgao