EPIC: Defragmentation improvements

Open alexowens90 opened this issue 1 year ago • 0 comments

Known remaining work. Replace each checkbox's text with a link to the individual issue once it is being actively worked on.

  • [ ] Support column slicing, and re-slicing to a new columns-per-segment as well.
  • [ ] The API is error-prone; redesign it into an easier-to-understand form. All we really need is rows-per-segment and columns-per-segment (defaulting to the library config options), plus "tolerances" for how far output segments may deviate from the ideal rows/columns per segment.
  • [ ] defragment_symbol_data shouldn't raise when is_symbol_fragmented returns false. It should instead indicate whether work was done through the returned object, e.g. by returning Optional[VersionedItem].
  • [ ] Never add a symbol list entry, since only live symbols can be defragmented.
  • [ ] Possible rename (reslice? retile?) as we will support changing the column-slicing as well.
  • [ ] Handle dynamic schema:
    • [ ] Missing columns to be sparsified
    • [ ] Columns whose type changes between segments should be type-promoted
  • [ ] Handle empty-type segments by sparsifying the column
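
To make the proposed API shape more concrete, here is a minimal, self-contained sketch of the row-slicing half of the idea: a target rows-per-segment, a fractional tolerance, and an Optional return value that is None when no work is needed instead of raising. Everything here (`ReslicePlan`, `plan_reslice`, the `tolerance` parameter and its default) is hypothetical and for illustration only; it is not ArcticDB's current or planned API, and it ignores column slicing, dynamic schema, and empty types.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ReslicePlan:
    """Hypothetical result type: row counts of the segments that would be written."""
    new_row_counts: List[int]


def within_tolerance(row_count: int, target: int, tolerance: float) -> bool:
    """True if row_count deviates from target by at most tolerance * target."""
    return abs(row_count - target) <= tolerance * target


def plan_reslice(
    row_counts: List[int],
    rows_per_segment: int,
    tolerance: float = 0.25,  # assumed default, purely illustrative
) -> Optional[ReslicePlan]:
    """Return None if the existing segments are acceptable, else a plan.

    The final segment is allowed to be short, since the total row count is
    rarely an exact multiple of rows_per_segment.
    """
    if rows_per_segment <= 0:
        raise ValueError("rows_per_segment must be positive")
    if not row_counts:
        return None  # nothing stored, so nothing to do

    body, last = row_counts[:-1], row_counts[-1]
    body_ok = all(within_tolerance(n, rows_per_segment, tolerance) for n in body)
    last_ok = last <= rows_per_segment * (1 + tolerance)
    if body_ok and last_ok:
        return None  # already close enough to ideal: signal "no work done"

    # Otherwise re-tile all rows into full segments plus one remainder segment.
    full, remainder = divmod(sum(row_counts), rows_per_segment)
    new_counts = [rows_per_segment] * full + ([remainder] if remainder else [])
    return ReslicePlan(new_row_counts=new_counts)
```

With this shape, a caller checks the return value rather than catching an exception: `plan_reslice([100, 100, 40], 100)` yields None because every segment is within tolerance, while `plan_reslice([10, 20, 30, 100, 90], 100)` yields a plan that re-tiles the 250 rows.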

Out of scope: Compacting versions other than the latest. We will only support compacting the latest version, and this will always produce a new version.

alexowens90, Aug 11 '23 14:08