ducklake
Introduce an OPTIMIZE command on tables that rewrites data files to a target size
https://docs.databricks.com/aws/en/sql/language-manual/delta-optimize
Can we add a utility similar to Delta's OPTIMIZE so that:
- small files are merged to the target size (this is already available as part of the table compaction maintenance)
- large files are split to the target size
- OPTIMIZE runs only over the subset of the table's Parquet files relevant to the query (for example, the files matching a filter predicate), not the whole table
We would like to have something like this so that we can simply run a daily background OPTIMIZE over the data newly inserted that day.
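
To make the request more concrete, here is a rough Python sketch of the planning step such a command might perform. It is an illustration only, not DuckLake code: `DataFile`, `Rewrite`, `plan_optimize`, and the `in_scope` filter are all hypothetical names, and file sizes are treated as additive, which is only an approximation for compressed Parquet data.

```python
import math
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class DataFile:
    """Hypothetical description of one Parquet data file in a table."""
    path: str
    size_bytes: int


@dataclass(frozen=True)
class Rewrite:
    """One planned rewrite: 'merge' several inputs or 'split' a single input."""
    action: str
    inputs: List[str]
    output_sizes: List[int]  # estimated sizes of the files to be written


def plan_optimize(
    files: List[DataFile],
    target_bytes: int,
    in_scope: Callable[[DataFile], bool] = lambda f: True,
) -> List[Rewrite]:
    """Plan merges of small files and splits of large files toward target_bytes.

    Only files accepted by in_scope are considered (e.g. today's partition).
    Files already exactly at the target size are left untouched.
    """
    if target_bytes <= 0:
        raise ValueError("target_bytes must be positive")

    plan: List[Rewrite] = []
    small: List[DataFile] = []

    for f in files:
        if not in_scope(f):
            continue
        if f.size_bytes > target_bytes:
            # Split into the fewest near-equal pieces that each fit the target.
            n = math.ceil(f.size_bytes / target_bytes)
            base, extra = divmod(f.size_bytes, n)
            sizes = [base + 1 if i < extra else base for i in range(n)]
            plan.append(Rewrite("split", [f.path], sizes))
        elif f.size_bytes < target_bytes:
            small.append(f)

    # Greedy packing: take small files largest-first and close a group
    # as soon as the next file would push it past the target.
    group: List[DataFile] = []
    total = 0
    for f in sorted(small, key=lambda f: f.size_bytes, reverse=True):
        if group and total + f.size_bytes > target_bytes:
            if len(group) > 1:  # a lone file gains nothing from a rewrite
                plan.append(Rewrite("merge", [g.path for g in group], [total]))
            group, total = [], 0
        group.append(f)
        total += f.size_bytes
    if len(group) > 1:
        plan.append(Rewrite("merge", [g.path for g in group], [total]))

    return plan
```

For the daily job described above, `in_scope` would select only the files written that day (for instance by a path or partition prefix), so older files that have already been optimized are never rewritten again.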