ArcticDB
EPIC: Defragmentation improvements
Known remaining work. Replace checkbox text with link to individual issues when they are being actively worked on.
- [ ] Support column slicing, and re-slicing to a new columns-per-segment as well.
- [ ] The API is error-prone; redesign it into an easier-to-understand form. All we really need is rows-per-segment and columns-per-segment (defaulting to the library config options), plus "tolerances" for how far output segments are allowed to be from the ideal rows/columns per segment.
- [ ] `defragment_symbol_data` shouldn't raise when `is_symbol_fragmented` returns false. It should instead indicate whether work was done through the returned object, e.g. `Optional[VersionedItem]`.
- [ ] Never add symbol list entry - only live symbols can be defragmented.
- [ ] Possible rename (reslice? retile?) as we will support changing the column-slicing as well.
- [ ] Handle dynamic schema:
  - [ ] Missing columns should be sparsified
  - [ ] Columns whose type changes should be promoted to a common type
- [ ] Handle empty type segments by sparsifying column
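To illustrate the dynamic-schema items above, here is a minimal pure-Python sketch of merging per-segment schemas. It is not ArcticDB code: `promote`, `merge_schemas`, `align_rows`, and the toy promotion order are all hypothetical, and `None` merely stands in for a sparse gap.

```python
from typing import Any, Dict, List

# Hypothetical promotion order, for illustration only.
# ArcticDB's real type-promotion rules are not reproduced here.
_PROMOTION_ORDER = ["int32", "int64", "float64"]


def promote(a: str, b: str) -> str:
    """Return the wider of two type names under the toy ordering above."""
    return max(a, b, key=_PROMOTION_ORDER.index)


def merge_schemas(schemas: List[Dict[str, str]]) -> Dict[str, str]:
    """Union the columns of all segments, promoting types that differ."""
    merged: Dict[str, str] = {}
    for schema in schemas:
        for col, dtype in schema.items():
            merged[col] = promote(merged[col], dtype) if col in merged else dtype
    return merged


def align_rows(
    rows: List[Dict[str, Any]], merged: Dict[str, str]
) -> List[Dict[str, Any]]:
    """Fill columns a segment lacks with None (standing in for a sparse gap)."""
    return [{col: row.get(col) for col in merged} for row in rows]


merged = merge_schemas([{"a": "int32"}, {"a": "float64", "b": "int64"}])
aligned = align_rows([{"a": 1}], merged)
```

Here the first segment has no column `b`, so its rows carry a gap for `b`, and column `a` ends up with the wider of the two types seen.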
Out of scope: Compacting versions other than the latest. We will only support compacting the latest version, and this will always produce a new version.
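As a rough sketch of the redesigned API shape proposed in the list above (a target rows-per-segment plus a tolerance, returning nothing when no work is needed), the row-slicing decision might look like the following. `plan_row_slicing` and its parameters are hypothetical names, not part of ArcticDB; the `None` return mirrors the suggested `Optional[VersionedItem]`.

```python
from typing import List, Optional


def plan_row_slicing(
    segment_rows: List[int],
    rows_per_segment: int,
    tolerance: float = 0.25,
) -> Optional[List[int]]:
    """Return new per-segment row counts, or None if the layout is acceptable.

    Hypothetical helper: a segment is acceptable when its row count lies
    within `tolerance` (a fraction) of `rows_per_segment`. The final
    segment is allowed to be short.
    """
    if rows_per_segment <= 0:
        raise ValueError("rows_per_segment must be positive")
    if not 0 <= tolerance < 1:
        raise ValueError("tolerance must be in [0, 1)")

    low = rows_per_segment * (1 - tolerance)
    high = rows_per_segment * (1 + tolerance)
    body, last = segment_rows[:-1], segment_rows[-1:]
    if all(low <= n <= high for n in body) and all(n <= high for n in last):
        return None  # nothing to do; the caller would not write a new version

    # Otherwise re-slice into ideal-sized segments plus one remainder segment.
    full, remainder = divmod(sum(segment_rows), rows_per_segment)
    return [rows_per_segment] * full + ([remainder] if remainder else [])


already_fine = plan_row_slicing([100_000, 100_000, 5], 100_000)
fragmented = plan_row_slicing([10, 10, 10, 10, 10], 20)
```

Returning `None` instead of raising lets callers run the operation unconditionally and check afterwards whether a new version was written.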