[discuss] PyIceberg Near-Term Roadmap
Feature Request / Improvement
This issue tracks some of the areas of focus for the PyIceberg project in the near term.
The previous roadmap (#736) was created right before Iceberg Summit 2024. The community has done a lot of work since then; see the 0.7.0, 0.8.0, and 0.9.0 release notes.
The roadmap will evolve depending on community contributions. Please feel free to discuss and add more.
Areas of focus
- V3 support (#1818)
- Table maintenance features (#1065 / #31)
- Documentation
- Connection to various catalog implementations
- Partitioned write
- pyiceberg-core (iceberg-rust integration)
- Integration with other engines
#1200: I'm already working on orphan files and will create a PR soon.
@kevinjqliu if no one has picked up the documentation tasks, I would like to help.
Some loose ideas for the Rust integration:
- Fancy CI work to test the Python bindings from the Rust repo directly against the tests in this repo (when and where it makes sense).
- Use py-spy to show the real performance impact of using the Python bindings from the Rust repo.
With the 0.10 release coming soon and the upcoming community brainstorm session (https://github.com/apache/iceberg-python/issues/2260), let's close this for now.
A lot of great work has been merged in already.
- Many parts of V3 support (tracking issue #1818)
- The first table maintenance feature: expire snapshots
- Integration with iceberg-rust
- Integration with DataFusion