
Tracking issues of 0.1.0 version for Apache Paimon Rust

Open Xuanwo opened this issue 1 year ago • 20 comments

Hello everyone, this is the tracking update for the paimon rust 0.1.0 release.

Goal

Before we outline the tasks for the 0.1.0 release, let me clarify the project's goal:

Developing a complete implementation of Paimon in pure Rust.

  • Users can read/write Paimon tables as they do with the Java API.
  • Users can read/write Paimon tables in Arrow format.
  • Native support for DataFusion
    • But also enable users to implement their own query engines based on this project.
  • Native WASM support (a.k.a paimon-wasm)
  • Native Python binding based on rust core (a.k.a paimon-py)
  • Hive catalog support

Tasks

This will be our initial release, and I aim to include basic read support in it.

  • Spec: Implement the types needed by Paimon.
    • [x] https://github.com/apache/paimon-rust/pull/5
    • [ ] Datatypes
    • [x] https://github.com/apache/paimon-rust/issues/6
    • [x] https://github.com/apache/paimon-rust/issues/14
    • [ ] https://github.com/apache/paimon-rust/issues/7
    • [x] https://github.com/apache/paimon-rust/issues/10
    • [ ] Global Index
    • [ ] Data File Index
  • Catalog
    • [ ] https://github.com/apache/paimon-rust/pull/62
    • [ ] https://github.com/apache/paimon-rust/issues/70 (leave hive catalog in next release)
  • Arrow Integration:
    • [ ] https://github.com/apache/paimon-rust/issues/71
  • Basic Read Process: No schema evolution, no merging, no deletion.
    • [ ] https://github.com/apache/paimon-rust/issues/72 (filter & project push down are not required in v0.1)
  • IO: Integrate with Apache OpenDAL for IO.
  • Release Utils: Implement scripts to help generate and verify ASF releases.

Once all these tasks are done, I expect users to be able to read an existing Paimon table from storage services.

Xuanwo avatar Jul 05 '24 15:07 Xuanwo

@Xuanwo I will do the Snapshot task.

QuakeWang avatar Jul 06 '24 07:07 QuakeWang

@Xuanwo I will do the Snapshot task.

Welcome, have fun!

Xuanwo avatar Jul 06 '24 07:07 Xuanwo

hello @Xuanwo , I would like to work on implementing Manifest.

dharanad avatar Jul 07 '24 11:07 dharanad

hello @Xuanwo , I would like to work on implementing Manifest.

Great, have fun!

Xuanwo avatar Jul 07 '24 12:07 Xuanwo

Hey, I'm new around here and this project sounds interesting. Can I jump in and help out?

crrow avatar Jul 07 '24 12:07 crrow

Manifest

Thanks a lot, feel free to pick up a part. I'll update the tracking issue for you.

Xuanwo avatar Jul 07 '24 12:07 Xuanwo

Manifest

Thanks a lot, feel free to pick up a part. I'll update the tracking issue for you.

okay, let me try data file

crrow avatar Jul 07 '24 13:07 crrow

I want to try the manifest list

Asura7969 avatar Jul 08 '24 13:07 Asura7969

Cool, it's a good project for getting familiar with Rust and big data infrastructure😆

thexiay avatar Jul 09 '24 02:07 thexiay

I will do the Changelog task; the corresponding issue has been created: #16.

QuakeWang avatar Jul 09 '24 03:07 QuakeWang

Cool, it's a good project for getting familiar with Rust and big data infrastructure😆

Welcome!

I will do the Changelog task; the corresponding issue has been created: #16.

Updated.

Xuanwo avatar Jul 09 '24 03:07 Xuanwo

Hi, I am a big fan of Rust, and I usually do data lake-related development. I'd be happy to join the development of paimon-rust in the future.

suxiaogang223 avatar Jul 20 '24 05:07 suxiaogang223

Does the FileSystemCatalog rely on the "Integrate with Apache OpenDAL" task? There are some IO operations in the FileSystemCatalog. @Xuanwo

Aitozi avatar Jul 23 '24 12:07 Aitozi

Does the FileSystemCatalog rely on the "Integrate with Apache OpenDAL" task?

Yes, my plan is to use OpenDAL for those IO operations.
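
A minimal sketch of that layering, not the actual paimon-rust or OpenDAL API: the `FileIO` trait below is a hypothetical stand-in for the OpenDAL-backed IO layer, `MemoryIO` is an in-memory stand-in so the example runs without external crates, and the `<warehouse>/<name>.db/` directory layout is assumed for illustration.

```rust
use std::collections::HashMap;
use std::io;

/// Hypothetical IO abstraction; the thread plans for OpenDAL to fill this role.
pub trait FileIO {
    /// Returns the full contents of the file at `path`.
    fn read(&self, path: &str) -> io::Result<Vec<u8>>;
    /// Returns every stored path that starts with `prefix`, sorted.
    fn list(&self, prefix: &str) -> io::Result<Vec<String>>;
}

/// In-memory backend, used only to keep the sketch runnable.
#[derive(Default)]
pub struct MemoryIO {
    files: HashMap<String, Vec<u8>>,
}

impl MemoryIO {
    pub fn put(&mut self, path: &str, data: &[u8]) {
        self.files.insert(path.to_string(), data.to_vec());
    }
}

impl FileIO for MemoryIO {
    fn read(&self, path: &str) -> io::Result<Vec<u8>> {
        self.files
            .get(path)
            .cloned()
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, path.to_string()))
    }

    fn list(&self, prefix: &str) -> io::Result<Vec<String>> {
        let mut paths: Vec<String> = self
            .files
            .keys()
            .filter(|p| p.starts_with(prefix))
            .cloned()
            .collect();
        paths.sort();
        Ok(paths)
    }
}

/// Hypothetical catalog that reaches storage only through `FileIO`.
pub struct FileSystemCatalog<I: FileIO> {
    warehouse: String,
    io: I,
}

impl<I: FileIO> FileSystemCatalog<I> {
    pub fn new(warehouse: &str, io: I) -> Self {
        Self {
            warehouse: warehouse.trim_end_matches('/').to_string(),
            io,
        }
    }

    /// Lists database names, assuming a `<warehouse>/<name>.db/...` layout.
    pub fn list_databases(&self) -> io::Result<Vec<String>> {
        let prefix = format!("{}/", self.warehouse);
        let mut names: Vec<String> = self
            .io
            .list(&prefix)?
            .iter()
            .filter_map(|p| p.strip_prefix(prefix.as_str()))
            .filter_map(|rest| rest.split('/').next())
            .filter_map(|segment| segment.strip_suffix(".db"))
            .map(str::to_string)
            .collect();
        names.sort();
        names.dedup();
        Ok(names)
    }
}
```

Because the catalog is generic over `FileIO`, swapping the in-memory backend for a real storage service would not change the catalog code, which is the point of routing its IO through one layer.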

Xuanwo avatar Jul 23 '24 13:07 Xuanwo

I will try to implement the File index part and try to follow the implementation logic of the Java version and use the Rust style as much as possible. @Xuanwo

devillove084 avatar Jul 29 '24 16:07 devillove084

I will try to implement the File index part and try to follow the implementation logic of the Java version and use the Rust style as much as possible. @Xuanwo

Thanks!

Xuanwo avatar Jul 29 '24 16:07 Xuanwo

Hi all, I'd like to help out with the project. I'm a bit new to Rust. Let me know where I can lend a hand!

ForeverAngry avatar Jul 31 '24 12:07 ForeverAngry

CC @Aitozi, do you have any suggestions regarding our current focus (reading the tables)? Which part do you think is the most important? Is it okay for us to ignore all the Index entries?

Xuanwo avatar Apr 20 '25 14:04 Xuanwo

CC @Aitozi, do you have any suggestions regarding our current focus (reading the tables)? Which part do you think is the most important? Is it okay for us to ignore all the Index entries?

IMO, the 0.1.0 version targets reading simple Paimon tables (no schema evolution, no merging, no deletion, such as COW) through paimon-rust. So the index is not a necessary part.

Aitozi avatar Apr 22 '25 02:04 Aitozi

What I believe is currently most needed is the ability to read Paimon data normally (including append tables and primary key tables).

Pandas886 avatar May 12 '25 09:05 Pandas886