POC: Reduce `Arc` cloning on hashmap build side

Open jonathanc-n opened this issue 5 months ago • 2 comments

Which issue does this PR close?

  • Closes #.

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

jonathanc-n avatar Jun 11 '25 21:06 jonathanc-n

I've noticed that it is possible for `interleave` to perform worse than `take` despite the `Arc` clones from `take`. This happens twice as well for `equal_row_arr` and `build_batch_from_indices`.

jonathanc-n avatar Jun 11 '25 22:06 jonathanc-n

> I've noticed that it is possible for `interleave` to perform worse than `take` despite the `Arc` clones from `take`. This happens twice as well for `equal_row_arr` and `build_batch_from_indices`.

Yes, that will be the tricky part. We gain some speed during probing by concatenating the left side (which itself can be slow) into a single batch.

Dandandan avatar Jun 13 '25 05:06 Dandandan
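
To make the trade-off discussed above concrete, here is a minimal std-only sketch of the two gather strategies. It is an illustration under stated assumptions, not DataFusion's or arrow-rs's implementation: `Column`, `concat`, `take`, and `interleave` below are hypothetical stand-ins that only model the access pattern of the real kernels (one flat index into a single concatenated batch versus a `(batch, row)` pair into many batches).

```rust
use std::sync::Arc;

/// Hypothetical stand-in for an Arrow array: a shared, immutable column.
type Column = Arc<[i64]>;

/// Concatenate all build-side batches into one column.
/// This cost is paid once, before probing starts.
fn concat(batches: &[Column]) -> Column {
    let mut out = Vec::with_capacity(batches.iter().map(|b| b.len()).sum());
    for b in batches {
        out.extend_from_slice(b);
    }
    out.into()
}

/// `take`-style gather: one flat index per output row into a single column.
fn take(column: &Column, indices: &[usize]) -> Vec<i64> {
    indices.iter().map(|&i| column[i]).collect()
}

/// `interleave`-style gather: a (batch, row) pair per output row,
/// so no up-front concatenation is needed.
fn interleave(batches: &[Column], indices: &[(usize, usize)]) -> Vec<i64> {
    indices.iter().map(|&(b, r)| batches[b][r]).collect()
}

fn main() {
    let batches: Vec<Column> = vec![
        Arc::from(vec![10_i64, 11]),
        Arc::from(vec![20_i64, 21, 22]),
    ];

    // Strategy 1: concatenate once, then gather with flat indices.
    let single = concat(&batches);
    let via_take = take(&single, &[0, 3, 4]);

    // Strategy 2: keep the batches separate and gather with (batch, row) pairs.
    let via_interleave = interleave(&batches, &[(0, 0), (1, 1), (1, 2)]);

    // Both strategies select the same rows.
    assert_eq!(via_take, via_interleave);
    println!("{:?}", via_take);
}
```

Both paths return the same rows; what differs is where the cost lands. The first pays for a full copy of the build side up front and then does a single-level lookup per row, while the second skips the copy but does a two-level lookup per row. Which one is faster in practice is exactly the open question in this thread and would need benchmarking against the real kernels.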