feat: Add comprehensive snapshot expiration functionality for table maintenance
Summary
This PR adds reasonably comprehensive snapshot expiration functionality to iceberg-rust, enabling automatic cleanup of old snapshots and their associated files according to configurable retention policies.
Features Added
Core Functionality
- Snapshot Expiration API: New ExpireSnapshots builder with fluent configuration
- Multiple Retention Strategies:
  - Time-based expiration (expire_older_than)
  - Count-based retention (retain_last)
  - Combined criteria with precedence rules
- Safety Guarantees:
  - Never expire current snapshots
  - Preserve snapshots referenced by branches/tags
  - Atomic operations following Iceberg's commit model
- File Cleanup: Optional cleanup of orphaned manifest and data files
- Dry Run Mode: Preview functionality without making changes
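As a rough illustration of how the retention strategies and safety guarantees above could combine, here is a minimal sketch. It is not the code from this PR: `SnapshotInfo` and `select_expired` are hypothetical names, and the precedence rule shown (a snapshot is expired only if it is older than the cutoff and also not among the `retain_last` most recent) is an assumption based on the API example.

```rust
use std::collections::HashSet;

/// Hypothetical, minimal stand-in for a snapshot entry (not the crate's real type).
#[derive(Debug, Clone)]
pub struct SnapshotInfo {
    pub id: i64,
    pub timestamp_ms: i64,
}

/// Returns the ids of snapshots that may be expired, oldest first.
/// Assumed precedence: protection rules win over both criteria, and
/// `retain_last` wins over `older_than_ms`.
pub fn select_expired(
    snapshots: &[SnapshotInfo],
    current_id: Option<i64>,
    referenced_ids: &HashSet<i64>,
    older_than_ms: Option<i64>,
    retain_last: Option<usize>,
) -> Vec<i64> {
    // With no criteria configured, expire nothing.
    if older_than_ms.is_none() && retain_last.is_none() {
        return Vec::new();
    }

    // Sort newest first so the first `retain_last` entries are the ones to keep.
    let mut sorted: Vec<&SnapshotInfo> = snapshots.iter().collect();
    sorted.sort_by(|a, b| b.timestamp_ms.cmp(&a.timestamp_ms));

    let keep_recent: HashSet<i64> = sorted
        .iter()
        .take(retain_last.unwrap_or(0))
        .map(|s| s.id)
        .collect();

    let mut expired: Vec<&SnapshotInfo> = sorted
        .into_iter()
        // Never expire the current snapshot.
        .filter(|s| Some(s.id) != current_id)
        // Preserve anything a branch or tag still points at.
        .filter(|s| !referenced_ids.contains(&s.id))
        // Count-based retention.
        .filter(|s| !keep_recent.contains(&s.id))
        // Time-based expiration (skipped when no cutoff is set).
        .filter(|s| older_than_ms.map_or(true, |cutoff| s.timestamp_ms < cutoff))
        .collect();

    expired.sort_by_key(|s| s.timestamp_ms);
    expired.iter().map(|s| s.id).collect()
}
```

A dry run would amount to returning this list without committing anything.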
API Example
// Expire snapshots older than 7 days, keeping at least 5 snapshots
let result = table.expire_snapshots()
    .expire_older_than(chrono::Utc::now().timestamp_millis() - 7 * 24 * 60 * 60 * 1000)
    .retain_last(5)
    .clean_orphan_files(true)
    .execute()
    .await?;
println!("Expired {} snapshots", result.expired_snapshot_ids.len());
Files Added/Modified
New Files
- iceberg-rust/src/table/maintenance/ - New maintenance module
- iceberg-rust/src/table/maintenance/expire_snapshots.rs - Core implementation (1,007 lines)
- iceberg-rust/examples/expire_snapshots.rs - Usage examples
- iceberg-rust/tests/snapshot_expiration.rs - Comprehensive integration tests
Modified Files
- iceberg-rust/src/table/mod.rs - Added expire_snapshots() method to Table
- iceberg-rust/src/lib.rs - Exposed maintenance module
Testing
Comprehensive Test Coverage
- 9 Unit Tests: Core functionality, edge cases, validation logic
- 6 Integration Tests: End-to-end scenarios with real table metadata
- All Tests Passing: ✅ 15/15 tests pass (I included my supporting tests as well)
Test Categories
- Time-based expiration logic
- Count-based retention logic
- Current snapshot protection
- Reference-aware cleanup
- Combined criteria precedence
- Empty metadata handling
- File cleanup identification
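To make the "file cleanup identification" category concrete, the idea can be pictured as a set difference: a file is only safe to delete if it was reachable from an expired snapshot and is no longer reachable from any retained one. The sketch below assumes a simplified mapping from snapshot id to file paths; `orphaned_files` is a hypothetical helper, not a function from this PR.

```rust
use std::collections::{BTreeSet, HashMap, HashSet};

/// Returns the files referenced only by expired snapshots, in sorted order.
/// `files_by_snapshot` is an assumed, simplified view of which manifest and
/// data files each snapshot can reach.
pub fn orphaned_files(
    files_by_snapshot: &HashMap<i64, Vec<String>>,
    expired_ids: &HashSet<i64>,
) -> BTreeSet<String> {
    let mut still_referenced: HashSet<&str> = HashSet::new();
    let mut candidates: BTreeSet<&str> = BTreeSet::new();

    for (id, files) in files_by_snapshot {
        for f in files {
            if expired_ids.contains(id) {
                candidates.insert(f.as_str());
            } else {
                still_referenced.insert(f.as_str());
            }
        }
    }

    // Drop any candidate that a retained snapshot still needs.
    candidates
        .into_iter()
        .filter(|f| !still_referenced.contains(f))
        .map(|f| f.to_string())
        .collect()
}
```

A file shared between an expired and a retained snapshot stays out of the result, which is the case such a test is mainly meant to guard.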
Implementation Details
Documentation
- Module-level documentation explaining concepts
- Example code demonstrating common usage patterns
- Updated the main readme.md with the new features.
Compatibility
- ✅ All existing tests pass
- ✅ No breaking changes to existing APIs
- ✅ Compatible with latest upstream changes (merged 62 commits)
This implementation provides a solid foundation for table maintenance operations in iceberg-rust. It follows the Iceberg specification and is similar to the pyiceberg implementation (which I also worked on 😮‍💨), but in Rust!
@JanKaul let me know what you think :)
Wow, this is really cool stuff! Thanks a lot for your effort.
I think it would be best to integrate this with the current operation framework: https://github.com/JanKaul/iceberg-rust/blob/main/iceberg-rust/src/table/transaction/operation.rs#L105. You should be able to move the code from your execute method into the Operation::execute method.
Additionally, you could use TableUpdate::RemoveSnapshots, as the Iceberg REST catalog already supports it.
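A rough sketch of the shape this could take is below. The `Operation` and `TableUpdate` enums here are simplified stand-ins written for illustration, not the crate's actual definitions (the real types have more variants and different fields). The JSON layout follows the `remove-snapshots` update from the Iceberg REST catalog specification.

```rust
/// Simplified stand-in for the crate's table update type (illustrative only).
#[derive(Debug, PartialEq)]
pub enum TableUpdate {
    RemoveSnapshots { snapshot_ids: Vec<i64> },
}

/// Simplified stand-in for the transaction operation type (illustrative only).
#[derive(Debug)]
pub enum Operation {
    ExpireSnapshots { snapshot_ids: Vec<i64> },
}

impl Operation {
    /// Turns the operation into the updates a commit would send to the catalog.
    pub fn execute(self) -> Vec<TableUpdate> {
        match self {
            Operation::ExpireSnapshots { snapshot_ids } => {
                if snapshot_ids.is_empty() {
                    // Nothing to expire, so no update is needed.
                    Vec::new()
                } else {
                    vec![TableUpdate::RemoveSnapshots { snapshot_ids }]
                }
            }
        }
    }
}

impl TableUpdate {
    /// Renders the update in the shape of the REST catalog `remove-snapshots` action.
    pub fn to_json(&self) -> String {
        match self {
            TableUpdate::RemoveSnapshots { snapshot_ids } => {
                let ids: Vec<String> = snapshot_ids.iter().map(|id| id.to_string()).collect();
                format!(
                    "{{\"action\":\"remove-snapshots\",\"snapshot-ids\":[{}]}}",
                    ids.join(",")
                )
            }
        }
    }
}
```

The point of routing expiration through an operation like this is that the snapshot selection logic lives in one place and the commit goes through the same transaction path as every other table change.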
@JanKaul take a look and let me know if those were the changes you were looking for or not!
Yes, the changes look great! Thank you
It looks like the implementations in the maintenance module and in the ExpireSnapshots operation are completely independent. I would like to avoid duplicating code in the repo and keep only one implementation. In general, I would also like to stick to the transaction logic that is already present in the crate, which means I would like to keep the logic in operation.rs but not include the maintenance module.
Do you have any use cases where you would prefer to use the maintenance module?
hi @JanKaul, thanks for taking a look! This is my first Rust PR to an OSS repo, and I know it probably looks like a big ugly mess. It took me a good amount of time to put it together, and I changed directions a few times, so I figured there would be some duplicate code somewhere that I missed! I'm happy to make any changes you want (feel free to make changes as well)!
To answer your question, the use case I'm building toward is using the maintenance feature(s) in a job running on Ray. Ideally, the next things I would want to add to the repo are file compaction and manifest rewrites, since this is where the power of your framework would really shine for my use case.
Does that make sense? Let me know if you want to chat on Discord, Google Meet, or otherwise!
@JanKaul pushed some updates, let me know what you think.
Sorry, I haven't had time to look into it yet. I will do so in the coming days.