
proposal: Roadmap of iceberg-rust v0.2

Open Xuanwo opened this issue 2 years ago • 5 comments

As discussed in our previous meeting, we plan to provide the following features in iceberg-rust v0.2:

Required

Features that this release must have.

  • specs: all the data types needed to represent an iceberg table in memory
  • arrow interop: allow interop with arrow
  • io: integrate with different storage services via opendal

Optional

Features that are good to have in this release.

  • compilation to wasm/wasi so users can use iceberg from a web browser.

Possibility

Features that we already know we will want in the future.

  • Integrate with datafusion
  • Provide a catalog trait so that vendors can implement a catalog.
  • Implement catalogs (REST, Glue, Databend, RisingWave, ...)
  • The standalone reader/writer api
  • Table transaction API.
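
To make the catalog trait item more concrete, here is a rough sketch of what such a trait could look like. Everything in it is hypothetical and not a proposed API: the names `Catalog`, `TableIdent`, `Table`, `CatalogError` and `MemoryCatalog` are made up for illustration, and a real catalog would most likely be async and carry full table metadata.

```rust
use std::collections::HashMap;

/// Hypothetical table identifier: a namespace plus a table name.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct TableIdent {
    pub namespace: String,
    pub name: String,
}

/// Hypothetical table handle; here it only records where the metadata lives.
#[derive(Debug, Clone, PartialEq)]
pub struct Table {
    pub ident: TableIdent,
    pub metadata_location: String,
}

/// Hypothetical error type for catalog operations.
#[derive(Debug, PartialEq)]
pub enum CatalogError {
    AlreadyExists(TableIdent),
    NotFound(TableIdent),
}

/// Hypothetical trait that a vendor (REST, Glue, ...) would implement.
pub trait Catalog {
    fn create_table(
        &mut self,
        ident: TableIdent,
        metadata_location: String,
    ) -> Result<Table, CatalogError>;

    fn load_table(&self, ident: &TableIdent) -> Result<Table, CatalogError>;

    fn list_tables(&self, namespace: &str) -> Vec<TableIdent>;
}

/// Toy in-memory implementation, only to show how a vendor plugs in.
#[derive(Default)]
pub struct MemoryCatalog {
    tables: HashMap<TableIdent, Table>,
}

impl Catalog for MemoryCatalog {
    fn create_table(
        &mut self,
        ident: TableIdent,
        metadata_location: String,
    ) -> Result<Table, CatalogError> {
        // Refuse to overwrite an existing table.
        if self.tables.contains_key(&ident) {
            return Err(CatalogError::AlreadyExists(ident));
        }
        let table = Table { ident: ident.clone(), metadata_location };
        self.tables.insert(ident, table.clone());
        Ok(table)
    }

    fn load_table(&self, ident: &TableIdent) -> Result<Table, CatalogError> {
        self.tables
            .get(ident)
            .cloned()
            .ok_or_else(|| CatalogError::NotFound(ident.clone()))
    }

    fn list_tables(&self, namespace: &str) -> Vec<TableIdent> {
        // Collect the tables of one namespace, sorted by name for stable output.
        let mut idents: Vec<TableIdent> = self
            .tables
            .keys()
            .filter(|ident| ident.namespace == namespace)
            .cloned()
            .collect();
        idents.sort_by(|a, b| a.name.cmp(&b.name));
        idents
    }
}
```

With a trait like this, code that uses a catalog can be written against `Catalog` alone, and each vendor ships its own implementation.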

Notes

  • iceberg v0.1 has already been taken, so we will take v0.2 as our first release.
  • opendal covers all storage services that are already implemented by iceberg and pyiceberg. (maybe more :rofl: )
  • opendal provides a compatibility layer that can be used in datafusion (the query engine we can integrate with in the future).

Any comments are welcome!

cc @liurenjie1024, @JanKaul, @ZENOTME for review. Also cc @Fokko for ideas.

Xuanwo avatar Jul 22 '23 04:07 Xuanwo

Agree with the proposal.

I think one missing part is the standalone reader/writer api, table transaction api. But I'm ok to postpone it to later releases. About datafusion integration, I think it would be better to put it in another crate, and an example usage of reader/writer api.

liurenjie1024 avatar Jul 24 '23 02:07 liurenjie1024

I think one missing part is the standalone reader/writer api, table transaction api. But I'm ok to postpone it to later releases.

Thanks for the comment. I will add them to the Possibility section so that we don't miss them in the next release.

About datafusion integration, I think it would be better to put it in another crate, and an example usage of reader/writer api.

I used to think we could hide them behind a feature, like delta-rs does. We can discuss the details after v0.2 is released.
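
For readers who have not used Cargo features, a minimal sketch of the "behind a feature" option follows. It assumes a hypothetical feature named `datafusion` declared in `Cargo.toml`; the function and module names are made up for illustration and say nothing about how delta-rs or iceberg-rust actually lay this out.

```rust
// Assumed Cargo.toml fragment (hypothetical):
//
//   [features]
//   datafusion = ["dep:datafusion"]

/// Lists the components compiled into this build.
pub fn enabled_components() -> Vec<&'static str> {
    // These are always compiled.
    let mut components = vec!["specs", "arrow", "io"];
    // `cfg!` evaluates to true only when built with `--features datafusion`.
    if cfg!(feature = "datafusion") {
        components.push("datafusion");
    }
    components
}

/// Hypothetical integration module; it is compiled only when the
/// `datafusion` feature is enabled, so default builds do not pay for it.
#[cfg(feature = "datafusion")]
pub mod datafusion_integration {
    pub fn describe() -> &'static str {
        "datafusion integration would live here"
    }
}
```

The separate-crate option keeps the same code out of the core crate entirely, at the cost of one more package to publish and version.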

Xuanwo avatar Jul 24 '23 02:07 Xuanwo

Hello from glaredb!

We have a WIP PR for reading iceberg tables here: https://github.com/GlareDB/glaredb/pull/1382. We lean pretty heavily on datafusion, so there are likely some bits we'd be able to contribute in that area.

There's still some exploratory work that we're doing with this, but we're definitely looking at the best way to integrate iceberg-rust.

About datafusion integration, I think it would be better to put it in another crate, and an example usage of reader/writer api.

I used to think we could hide them behind a feature, like delta-rs does. We can discuss the details after v0.2 is released.

For what it's worth, I'd prefer the feature over having a separate crate for datafusion integration.

scsmithr avatar Jul 25 '23 18:07 scsmithr

Hi there, thanks for joining the discussion. I would also put the datafusion integration in another crate.

Since many projects rely on arrow, I think arrow support deserves a spot in iceberg-rust. And this should make it easier to integrate datafusion.

JanKaul avatar Jul 25 '23 18:07 JanKaul

No objections to this proposal. I have created a tracking issue at https://github.com/apache/iceberg-rust/issues/18. Let's get started!

Xuanwo avatar Aug 02 '23 04:08 Xuanwo

Let's close this one since 0.2.0 has been out for a while :)

Fokko avatar Jun 24 '24 20:06 Fokko