Improve consistency of expression names
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
We have many different ways to create names from expressions, with duplicated and sometimes inconsistent code.
Logical Expression:
- Display trait
- Debug trait
- `Expr.name()` method (wrapper for the `create_name` function)
- `ExprIdentifierVisitor::desc_expr`
Physical Expression:
- Display trait
- Debug trait
- `create_physical_name` function
One example of confusion is that queries sometimes result in field names containing Divide and sometimes /. For example:
- `decimal_simple.c1 / CAST(Float64(0.00001) AS Decimal128(5, 5))` uses `/`
- `CAST(decimal_simple.c1 AS Decimal128(30, 19)) Divide CAST(decimal_simple.c5 AS Decimal128(30, 19))` uses `Divide`
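To illustrate how this kind of mismatch can arise, here is a minimal sketch. It assumes one code path formats the operator with `Display` and another with `Debug`; the `Operator` enum and the two helper functions are hypothetical simplifications for this example, not the actual DataFusion code.

```rust
use std::fmt;

// Hypothetical, simplified operator type (not DataFusion's real `Operator`).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Operator {
    Divide,
    Plus,
}

impl fmt::Display for Operator {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Display renders the SQL-style symbol.
        let symbol = match self {
            Operator::Divide => "/",
            Operator::Plus => "+",
        };
        write!(f, "{}", symbol)
    }
}

// Hypothetical naming path that formats the operator with Display.
fn name_via_display(left: &str, op: Operator, right: &str) -> String {
    format!("{} {} {}", left, op, right)
}

// Hypothetical naming path that formats the operator with Debug,
// so the derived variant name leaks into the field name.
fn name_via_debug(left: &str, op: Operator, right: &str) -> String {
    format!("{} {:?} {}", left, op, right)
}
```

With these helpers, the same expression is named `a / b` by one path and `a Divide b` by the other, which is the shape of the inconsistency described above.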
Describe the solution you'd like
Make names more consistent and avoid duplicate code.
Describe alternatives you've considered
None
Additional context
None
Here are some observations after reviewing the logical plan code.
- `Display` and `Debug` produce similar output. `Display` delegates to `Debug` for most cases.
- `create_name` accepts a schema argument, which is never used. If we remove that, then the logic looks to be about the same as `Display` and `Debug`.
It seems to me that create_name should be used to determine the name of the expression as it would be represented in a schema, whereas Display would be used to create a representation of the expression to show in the logical plan. In many cases, these will be the same.
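A rough sketch of that split, assuming a toy `Expr` type: the enum, its variants, and the `schema_name` method are hypothetical names chosen for this example and are not the existing DataFusion API. The alias case is one assumed place where the two representations would differ.

```rust
use std::fmt;

// Hypothetical, heavily simplified expression type for illustration only.
#[derive(Debug, Clone)]
enum Expr {
    Column(String),
    Divide(Box<Expr>, Box<Expr>),
    Alias(Box<Expr>, String),
}

// Display: how the expression is shown in a logical plan.
impl fmt::Display for Expr {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Expr::Column(name) => write!(f, "{}", name),
            Expr::Divide(l, r) => write!(f, "{} / {}", l, r),
            Expr::Alias(e, alias) => write!(f, "{} AS {}", e, alias),
        }
    }
}

impl Expr {
    // Name of the field this expression would produce in a schema.
    // Assumption: an alias replaces the name entirely.
    fn schema_name(&self) -> String {
        match self {
            Expr::Column(name) => name.clone(),
            Expr::Divide(l, r) => {
                format!("{} / {}", l.schema_name(), r.schema_name())
            }
            Expr::Alias(_, alias) => alias.clone(),
        }
    }
}
```

In this sketch the two agree for plain columns and binary expressions, and diverge only for the aliased expression, where the plan shows `c1 / c5 AS ratio` while the schema field is just `ratio`.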