datafusion icon indicating copy to clipboard operation
datafusion copied to clipboard

refactor(joins::utils): Replace OnceAsync/OnceFut with tokio's OnceCell

Open ctsk opened this issue 8 months ago • 4 comments

Rationale for this change

I wrote this PR mostly to show that it can be done. Replacing OnceFut with a OnceCell has some advantages though:

  • It no longer stores a Future in the ExecutionPlan, but just the synchronization primitive. In theory this can allow to make the ExecutionPlan implement Clone too.
  • One fewer custom synchronization primitive to maintain

What changes are included in this PR?

  1. Introduce SharedResultOnceCell - A wrapper around tokio's OnceCell<SharedResult<Arc<T>>>. Arc's are used all over the place to avoid lifetime issues
  2. Replace usages of OnceAsync(OnceFut) with SharedResultOnceCell (SharedResultPending)
  3. Adjust variable names and doc comments in all users of OnceFut
  4. Remove OnceAsync + OnceFut

Are these changes tested?

This PR should not change the behaviour of the join implementations.

Are there any user-facing changes?

No.

ctsk avatar Mar 26 '25 08:03 ctsk

Thanks @ctsk -- this idea sounds good -- I think we should run the benchmarks to make sure they don't cause issues. I'll do so shortly

alamb avatar Mar 27 '25 19:03 alamb

I'm moving this back to the draft stage, because it will conflict with https://github.com/apache/datafusion/pull/15476 once I fix that PR.

I can't see a way to achieve the goals of this PR and https://github.com/apache/datafusion/pull/15476 at the same time with a single OnceCell. I suspect that if I pursue #15476, I can not avoid storing a SendableRecordBatchStream or some other Future in the ExecutionPlan.

So overall, it seems to me like this PR is less clear of a win.

ctsk avatar Apr 01 '25 13:04 ctsk

@ctsk thank you so much for this PR (and others) can you share more about what you are doing with DataFusion? If you would like to have a brief introduction video call, feel free to email me at [email protected] and we can setup something

alamb avatar Apr 01 '25 16:04 alamb

Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or this will be closed in 7 days.

github-actions[bot] avatar Jun 12 '25 02:06 github-actions[bot]