datafusion icon indicating copy to clipboard operation
datafusion copied to clipboard

Consider using `Arc::clone` to clone Arcs to make it clear they aren't deep copies

Open alamb opened this issue 1 year ago • 1 comments

Is your feature request related to a problem or challenge?

in https://github.com/apache/datafusion/pull/11103#discussion_r1653071246 @comphead noted that it is sometimes hard to tell if .clone() is a deep clone (and thus expensive) or a clone of an Arc and thus much less so

when I see .clone() Im now thinking what if we comment it the similar way as .unwrap() back in the day. Like say clone is cheap here because it is Arc::clone so only reference gets cloned.... thats just an idea, its too cumbersome to make it happen

Describe the solution you'd like

It would be nice to make it clearer in the code when we had deep clones and when we were just cloning arcs

Describe alternatives you've considered

One thing we do in InfluxDB is use this pattern to make it exlicit

            Arc::clone(ctx.state().catalog_list())

There is a clippy lint https://rust-lang.github.io/rust-clippy/v0.0.212/#clone_on_ref_ptr we could turn on in datafusion to enable this

Here is how it is done in influxdb:

https://github.com/influxdata/influxdb3_core/blob/0f5ecbd6b17f83f7ad4ba55699fc2cd3e151cf94/Cargo.toml#L117-L118

Additional context

No response

alamb avatar Jun 27 '24 11:06 alamb

Wow, this clippy lint is so helpful

jayzhan211 avatar Jun 28 '24 00:06 jayzhan211

Thanks for this lint, i like it very much.

Would it be possible to convert #![deny(clippy::clone_on_ref_ptr)] scattered across the code base into a single workspace-level definition?

findepi avatar Aug 07 '25 15:08 findepi

Sounds like a good plan to me -- I recommend filing a new ticket (and marking it as "good first issue") -- I bet you'll find someone does it

alamb avatar Aug 07 '25 16:08 alamb

  • https://github.com/apache/datafusion/issues/17083

I think we can otherwise close this issue?

findepi avatar Aug 08 '25 09:08 findepi