discuss: re-export arrow types
Hi, I propose to re-export arrow types in iceberg. Want to hear your opinions @liurenjie1024 @Xuanwo
Rationale
When a crate uses a dependency's type in the public API, it would be better to re-export the dependency.
The largest benefit is ease of handling multiple versions:
[dependencies]
# arrow used in the project
arrow = "50"
# which depends on arrow 52
iceberg = "0.3"
In this case, if we want to pass arrow_array::RecordBatch to iceberg, we have to do something like this:
[dependencies]
# arrow used in the project
arrow-array = "50"
# which depends on arrow 52
iceberg = "0.3"
arrow-array-for-iceberg = { package="arrow", version="52"}
and then pass arrow_array_for_arrow_udf::RecordBatch to iceberg. If iceberg re-export it like iceberg::arrow::RecordBatch, then we don't need to do so.
Note: This assumes that multiple versions are to some extent unavoidable. It won't help if we want to pass arrow 50 data to arrow 52 data. But it's more like some part of the code only want to pass data to iceberg without affecting other parts of the code that uses arrow 50.
Finally, I don't think there are any drawbacks to re-export arrow in iceberg.
More discussion about this problem:
- https://github.com/arrow-udf/arrow-udf/issues/65
- lurklurk.org/effective-rust/re-export.html
- brokenco.de/2023/07/26/rust-re-export.html (This talks about why
delta-rsre-exportarrow_array::RecordBatch, highly similar to us here)
Example about this practice: https://docs.rs/deltalake/0.20.0/deltalake/arrow/index.html
https://github.com/delta-io/delta-rs/blob/b3b23be91b8eb8207473785d4b0c20c5d2c174b2/crates/core/src/lib.rs#L101-L102
+1 for this, which helps to decipher in compatible types with same name. cc @Xuanwo What do you think?
-0 on this.
I'm not in favor of re-exporting other crates based on my own experience. However, as @xxchan mentioned, there aren't many drawbacks, so I'm okay if someone wants to add it.
One case I think this helps is that when another project uses both arrow and iceberg, but with different versions, just as the case in the issue. When you pass them as arguments, you may see quite confusing compiler error like arrow::RecordBatch is not arrow::RecordBatch, since rust treat types from diffrent crate versions as different types. A more detailed example could be found here. Re-exporting makes things easier to understand.
This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.
This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'