paimon-rust
Tracking issues of 0.1.0 version for Apache Paimon Rust
Hello everyone, this is the tracking update for the paimon rust 0.1.0 release.
Goal
Before we outline the tasks for the 0.1.0 release, let me clarify the project's goal:
Developing a complete implementation of Paimon in pure Rust.
- Users can read and write Paimon tables as they do with the Java API.
- Users can read and write Paimon tables in Arrow format.
- Native support for DataFusion
- But also enable users to implement their own query engines based on this project.
- Native WASM support (a.k.a. paimon-wasm)
- Native Python binding based on the Rust core (a.k.a. paimon-py)
- Hive catalog support
Tasks
This will be our initial release, and I aim to include basic read support in it.
- Spec: Implement the types needed by Paimon.
- [x] https://github.com/apache/paimon-rust/pull/5
- [ ] Datatypes
- [x] https://github.com/apache/paimon-rust/issues/6
- [x] https://github.com/apache/paimon-rust/issues/14
- [ ] https://github.com/apache/paimon-rust/issues/7
- [x] https://github.com/apache/paimon-rust/issues/10
- [ ] Global Index
- [ ] Data File Index
- Catalog
- [ ] https://github.com/apache/paimon-rust/pull/62
- [ ] https://github.com/apache/paimon-rust/issues/70 (leave the Hive catalog for the next release)
- Arrow Integration:
- [ ] https://github.com/apache/paimon-rust/issues/71
- Basic Read Process: No schema evolution, no merging, no deletion.
- [ ] https://github.com/apache/paimon-rust/issues/72 (filter & project push down are not required in v0.1)
- IO: Integrate with Apache OpenDAL for IO.
- Release Utils: Implement scripts to help generate and verify ASF releases.
Once all those tasks are done, I expect users to be able to read an existing Paimon table from storage services.
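As a rough illustration of the first step of such a read, the sketch below picks the newest snapshot from a listing of a table's `snapshot/` directory. It assumes the usual Paimon layout, where snapshot files are named `snapshot-<id>` and sit next to hint files such as `LATEST` and `EARLIEST`. The helper names `parse_snapshot_id` and `latest_snapshot_id` are hypothetical and not part of paimon-rust. A real reader would get the listing from the storage layer and then follow the snapshot to its manifests.

```rust
/// Prefix assumed for snapshot files inside a table's `snapshot/` directory.
const SNAPSHOT_PREFIX: &str = "snapshot-";

/// Parse the numeric id out of a file name such as `snapshot-12`.
/// Returns `None` for hint files (`LATEST`, `EARLIEST`) or malformed names.
/// Hypothetical helper, for illustration only.
fn parse_snapshot_id(file_name: &str) -> Option<u64> {
    file_name.strip_prefix(SNAPSHOT_PREFIX)?.parse().ok()
}

/// Pick the largest snapshot id from a directory listing.
/// Returns `None` when the listing contains no snapshot files.
/// Hypothetical helper, for illustration only.
fn latest_snapshot_id<'a, I>(file_names: I) -> Option<u64>
where
    I: IntoIterator<Item = &'a str>,
{
    file_names.into_iter().filter_map(parse_snapshot_id).max()
}
```

Comparing parsed ids rather than raw file names matters here: a plain string comparison would rank `snapshot-2` above `snapshot-10`.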
@Xuanwo I will do the Snapshot task.
Welcome, have fun!
hello @Xuanwo , I would like to work on implementing Manifest.
Great, have fun!
Hey, I'm new around here and this project sounds interesting. Can I jump in and help out?
Manifest
Thanks a lot, you're welcome to pick up a part. I'm happy to update the tracking issue for you.
okay, let me try data file
I want to try the manifest list
Cool, it's a good project for getting familiar with Rust and big data infrastructure😆
I will do the Changelog task, and the corresponding issue has been created: #16.
Welcome!
Updated.
Hi, I am a big fan of Rust, and I usually do data lake-related development. I'd be happy to join the development of paimon-rust in the future.
Does the FileSystemCatalog rely on the "Integrate with Apache OpenDAL" task? There are some IO operations in the FileSystemCatalog. @Xuanwo
Yes, my plan is to use OpenDAL for those IO operations.
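To make that dependency concrete, here is a minimal sketch of a catalog that reaches storage only through a small IO trait. Everything in it is an assumption for illustration: the `FileIo` trait, the `MemoryIo` backend, and this `FileSystemCatalog` shape are hypothetical, and none of it uses the OpenDAL API. It also assumes the filesystem catalog layout `<warehouse>/<database>.db/<table>/schema/schema-<id>`, and simplifies the existence check to looking for `schema-0`.

```rust
use std::collections::HashSet;

/// Minimal IO abstraction the catalog needs. Hypothetical: in paimon-rust
/// this role is planned to be filled by Apache OpenDAL.
trait FileIo {
    fn exists(&self, path: &str) -> bool;
}

/// In-memory stand-in for a storage backend, used only for illustration.
struct MemoryIo {
    paths: HashSet<String>,
}

impl FileIo for MemoryIo {
    fn exists(&self, path: &str) -> bool {
        self.paths.contains(path)
    }
}

/// Hypothetical catalog that only talks to storage through `FileIo`.
struct FileSystemCatalog<IO: FileIo> {
    warehouse: String,
    io: IO,
}

impl<IO: FileIo> FileSystemCatalog<IO> {
    /// Assumed layout: `<warehouse>/<database>.db/<table>`.
    fn table_path(&self, database: &str, table: &str) -> String {
        format!(
            "{}/{}.db/{}",
            self.warehouse.trim_end_matches('/'),
            database,
            table
        )
    }

    /// Simplification: treat a table as present when `schema/schema-0` exists.
    fn table_exists(&self, database: &str, table: &str) -> bool {
        let schema = format!("{}/schema/schema-0", self.table_path(database, table));
        self.io.exists(&schema)
    }
}
```

With this split, swapping the in-memory backend for a real storage service would only require another `FileIo` implementation, leaving the catalog logic untouched.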
I will try to implement the File index part and try to follow the implementation logic of the Java version and use the Rust style as much as possible. @Xuanwo
Thanks!
Hi all, I'd like to help out with the project. I'm a bit new to Rust. Let me know where I can lend a hand!
CC @Aitozi, do you have any suggestions regarding our current focus (reading the tables)? Which part do you think is the most important? Is it okay for us to ignore all the Index entries?
IMO, the 0.1.0 version targets reading simple Paimon tables (no schema evolution, no merging, no deletion, such as COW) through paimon-rust, so the index is not a necessary part.
What I believe is currently most needed is the ability to read Paimon data normally (including append-only tables and primary key tables).