Avoid copies in `TypeCoercion` via TreeNode API

Open alamb opened this issue 1 year ago • 1 comments

Draft as it has a failure in tpchds planning for some reason

Which issue does this PR close?

Closes https://github.com/apache/datafusion/issues/10210

Part of https://github.com/apache/arrow-datafusion/issues/9637 -- let's make DataFusion planning faster by not copying so much

Rationale for this change

Now that we have the nice TreeNode API, thanks to #8913 and @peter-toth, let's use it to both simplify the code and avoid copies.

What changes are included in this PR?

  1. Avoid copies in TypeCoercion via TreeNode API
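For readers unfamiliar with the idea, here is a rough sketch of what an owned, bottom-up rewrite looks like. It is a toy, not the code in this PR: `Expr`, `DataType`, `Transformed`, `transform_up`, and `coerce_add` below are simplified stand-ins written for illustration, and the real DataFusion types and signatures differ. The point it tries to show is that when the rewrite takes the tree by value, untouched subtrees are moved rather than cloned (note that the toy `Expr` does not even implement `Clone`).

```rust
// Toy types for illustration only; these are NOT DataFusion's definitions.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DataType {
    Int32,
    Int64,
}

// Deliberately not `Clone`: the rewrite below cannot copy subtrees.
#[derive(Debug, PartialEq)]
enum Expr {
    Int32(i32),
    Int64(i64),
    Cast(Box<Expr>, DataType),
    Add(Box<Expr>, Box<Expr>),
}

// Simplified stand-in for a "did anything change?" wrapper.
#[derive(Debug)]
struct Transformed<T> {
    data: T,
    transformed: bool,
}

impl<T> Transformed<T> {
    fn yes(data: T) -> Self {
        Self { data, transformed: true }
    }
    fn no(data: T) -> Self {
        Self { data, transformed: false }
    }
}

impl Expr {
    fn data_type(&self) -> DataType {
        match self {
            Expr::Int32(_) => DataType::Int32,
            Expr::Int64(_) => DataType::Int64,
            Expr::Cast(_, t) => *t,
            // The wider of the two sides.
            Expr::Add(l, r) => {
                if l.data_type() == DataType::Int64 || r.data_type() == DataType::Int64 {
                    DataType::Int64
                } else {
                    DataType::Int32
                }
            }
        }
    }

    /// Bottom-up rewrite that consumes `self`: children are rewritten first,
    /// then `f` is applied to the rebuilt node. Nothing is cloned.
    fn transform_up<F>(self, f: &mut F) -> Transformed<Expr>
    where
        F: FnMut(Expr) -> Transformed<Expr>,
    {
        match self {
            Expr::Add(l, r) => {
                let l = (*l).transform_up(f);
                let r = (*r).transform_up(f);
                let children_changed = l.transformed || r.transformed;
                let mut out = f(Expr::Add(Box::new(l.data), Box::new(r.data)));
                out.transformed |= children_changed;
                out
            }
            Expr::Cast(inner, t) => {
                let inner = (*inner).transform_up(f);
                let child_changed = inner.transformed;
                let mut out = f(Expr::Cast(Box::new(inner.data), t));
                out.transformed |= child_changed;
                out
            }
            leaf => f(leaf),
        }
    }
}

/// Toy coercion rule: if the two sides of an `Add` have different types,
/// wrap the narrower side in a cast to `Int64`.
fn coerce_add(expr: Expr) -> Transformed<Expr> {
    match expr {
        Expr::Add(l, r) => {
            if l.data_type() == r.data_type() {
                return Transformed::no(Expr::Add(l, r));
            }
            let widen = |e: Box<Expr>| -> Box<Expr> {
                if e.data_type() == DataType::Int64 {
                    e // already wide enough: the box is moved as-is
                } else {
                    Box::new(Expr::Cast(e, DataType::Int64))
                }
            };
            Transformed::yes(Expr::Add(widen(l), widen(r)))
        }
        other => Transformed::no(other),
    }
}
```

Calling `expr.transform_up(&mut coerce_add)` on a mixed `Int32`/`Int64` addition returns a tree in which every `Int32` operand is wrapped in a `Cast` to `Int64`, with the `transformed` flag set.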

Are these changes tested?

Existing CI

Are there any user-facing changes?

alamb avatar Apr 10 '24 22:04 alamb

An update here is twofold:

  1. There are some very subtle semantics going on
  2. As this pass actually changes the types of the plan (on purpose), we need some way to recalculate the schemas (which is what Plan::with_new_exprs does, but it also requires a copy of the inputs and exprs, which is not cool)

I am still putzing with how to make this better
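To make the schema problem above concrete, here is a small sketch of one possible direction, not what this PR or DataFusion actually does. `Projection`, `derive_schema`, and `map_exprs` are hypothetical names invented for this example. The idea is that a node which owns its expressions can rewrite them in place and then rebuild its cached schema from the new types, without first cloning the expressions or inputs.

```rust
// Toy types for illustration only; these are NOT DataFusion's definitions.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DataType {
    Int32,
    Int64,
}

// Deliberately not `Clone`, so the rewrite below cannot copy expressions.
#[derive(Debug, PartialEq)]
struct Expr {
    name: String,
    data_type: DataType,
}

// A plan node that caches the output types of its expressions.
#[derive(Debug, PartialEq)]
struct Projection {
    exprs: Vec<Expr>,
    schema: Vec<DataType>,
}

// Hypothetical helper: the schema is just the output type of each expression.
fn derive_schema(exprs: &[Expr]) -> Vec<DataType> {
    exprs.iter().map(|e| e.data_type).collect()
}

impl Projection {
    fn new(exprs: Vec<Expr>) -> Self {
        let schema = derive_schema(&exprs);
        Self { exprs, schema }
    }

    /// Hypothetical owned rewrite: consumes the node, rewrites each
    /// expression by value, then recomputes the cached schema so it
    /// reflects any type changes made by `f`.
    fn map_exprs(self, f: impl FnMut(Expr) -> Expr) -> Self {
        let exprs: Vec<Expr> = self.exprs.into_iter().map(f).collect();
        Self::new(exprs)
    }
}
```

With this shape, a coercion step such as `plan.map_exprs(|mut e| { e.data_type = DataType::Int64; e })` leaves the node with a schema of `Int64` columns, and the expression names are moved through untouched.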

alamb avatar Apr 23 '24 19:04 alamb

Superseded by https://github.com/apache/datafusion/pull/10356

alamb avatar May 02 '24 19:05 alamb