datafusion icon indicating copy to clipboard operation
datafusion copied to clipboard

Stop copying LogicalPlan and Exprs in `PushDownLimit`

Open alamb opened this issue 1 year ago • 1 comments

Is your feature request related to a problem or challenge?

Part of https://github.com/apache/datafusion/issues/9637

As part of making the planner faster, we are updating the optimizer passes to avoid copying LogicalPlan and Expr (see https://github.com/apache/datafusion/issues/9637)

Describe the solution you'd like

I would like to reduce the amount of copying in this pass (even though it doesn't appear in current profiling)

Describe alternatives you've considered

Apply the model from @Lordworms in https://github.com/apache/datafusion/pull/10166 to this pass 2. Update OptimizerRule::supports_rewrite` to return true

  1. Update OptimizerRule to use rewrite
  2. Update the pass itself to not copy the LogicalPlan (ideally using the TreeNode API) - it is implemented for LogicalPlan (API) and Expr (API)

Other examples: https://github.com/apache/datafusion/pull/10218

Additional context

alamb avatar Apr 29 '24 12:04 alamb

take

kavirajk avatar May 03 '24 19:05 kavirajk

@kavirajk -- have you had a chance to work on this issue? If not, do you mind if someone else does?

alamb avatar May 14 '24 11:05 alamb

I made https://github.com/apache/datafusion/pull/10501 to pull some of the code out into a function to maybe make this easier

alamb avatar May 14 '24 11:05 alamb

@kavirajk -- have you had a chance to work on this issue? If not, do you mind if someone else does?

Sorry @alamb didn't get a chance to work on it unfortunately. Agree, feel free to work on it.

kavirajk avatar May 14 '24 12:05 kavirajk

No worries. Thanks @kavirajk

alamb avatar May 14 '24 12:05 alamb

I bashed out a PR for this change this morning: https://github.com/apache/datafusion/pull/10508

alamb avatar May 14 '24 16:05 alamb