lance
feat: allow compacting files into new format version
A bootleg parallel migration tool that reuses the compaction task execution facility.
In the next PR, I'll add a `force_rewrite` option to rewrite files even when a file already matches the desired number of rows/size.
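As a rough sketch of what such an option could mean for candidate selection: the `FragmentInfo` struct, the `needs_rewrite` function, and the `target_rows` parameter below are hypothetical names for illustration, not Lance's actual API, and the follow-up PR may well look different.

```rust
/// Hypothetical summary of a fragment; not a Lance type.
#[derive(Debug, Clone, PartialEq)]
pub struct FragmentInfo {
    pub id: u64,
    pub num_rows: usize,
}

/// Decide whether a fragment should be rewritten (assumed logic).
/// Without `force_rewrite`, only fragments smaller than the target are picked.
/// With it, every fragment is picked, even ones already at the target size.
pub fn needs_rewrite(frag: &FragmentInfo, target_rows: usize, force_rewrite: bool) -> bool {
    force_rewrite || frag.num_rows < target_rows
}
```

With this predicate, a fragment that already holds `target_rows` rows is skipped by default and only picked up when `force_rewrite` is set.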
Codecov Report
Attention: Patch coverage is 25.00000% with 3 lines in your changes missing coverage. Please review.
Project coverage is 79.15%. Comparing base (7284521) to head (430bd81).
| Files | Patch % | Lines |
|---|---|---|
| rust/lance/src/dataset/optimize.rs | 25.00% | 1 Missing and 2 partials :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## main #2749 +/- ##
==========================================
+ Coverage 79.13% 79.15% +0.01%
==========================================
Files 227 227
Lines 67398 67402 +4
Branches 67398 67402 +4
==========================================
+ Hits 53338 53353 +15
+ Misses 10956 10948 -8
+ Partials 3104 3101 -3
| Flag | Coverage Δ | |
|---|---|---|
| unittests | 79.15% <25.00%> (+0.01%) | :arrow_up: |
Flags with carried forward coverage won't be shown.
We want to avoid datasets that are split between two different versions (e.g. some files in v1 and some in v2). I think this approach can cause that situation if only some files need to be compacted? That could be dangerous (although I think the write would fail in this situation).
I see. Let me make this option implicitly rewrite the whole dataset then.
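A minimal sketch of that idea, assuming a simplified model of the dataset: `FormatVersion`, `FileInfo`, `is_mixed`, and `plan_rewrite` are hypothetical names, not the code in `optimize.rs`. When any file is not yet in the target version, every file is selected, so the result cannot end up split between versions.

```rust
/// Hypothetical file format versions; not Lance's real type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FormatVersion {
    V1,
    V2,
}

/// Hypothetical per-file summary used only for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct FileInfo {
    pub id: u64,
    pub num_rows: usize,
    pub version: FormatVersion,
}

/// True if the files do not all share one format version.
pub fn is_mixed(files: &[FileInfo]) -> bool {
    match files.first() {
        Some(first) => files.iter().any(|f| f.version != first.version),
        None => false,
    }
}

/// Return the ids of the files to rewrite.
/// If any file differs from `target_version`, this is a migration and all
/// files are selected. Otherwise only undersized files are selected.
pub fn plan_rewrite(
    files: &[FileInfo],
    target_rows: usize,
    target_version: FormatVersion,
) -> Vec<u64> {
    let migrating = files.iter().any(|f| f.version != target_version);
    files
        .iter()
        .filter(|f| migrating || f.num_rows < target_rows)
        .map(|f| f.id)
        .collect()
}
```

Under these assumptions, compacting a v1 dataset with a v1 target touches only the small files, while asking for v2 selects everything.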
Thank you for your contribution. This PR has been inactive for a while, so we're closing it to free up bandwidth. Feel free to reopen it if you still find it useful.