frostdb L1 arrow compaction

It may be useful to have the option to compact L0 arrow records into L1 arrow records instead of Parquet.

Apr 27 '23 15:04 thorfour

This may only be worth pursuing once the REE (run-end encoding) support changes have landed in FrostDB and the record sorting implementation (https://github.com/apache/arrow/pull/34719) is completed.

May 01 '23 21:05 thorfour

Agreed. I think moving to arrow-only in memory would be the last step this quarter.

May 02 '23 06:05 asubiotto

I have been thinking about this and was wondering whether it is the same as arrowutils.MergeRecords(arrow_parts...) |> arrowutils.SortRecord |> parts.NewArrowPart?

Dec 18 '23 01:12 gernest

Yes, although given that the arrow parts should be merged on input, there probably isn't a need for the downstream sort. I'd also be interested in getting some L0 to L1 stats on how much memory arrow compaction saves compared with parquet compaction.
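
To illustrate why the downstream sort may be redundant, here is a minimal sketch of a k-way merge. It assumes each input part is already sorted by its key, and it uses plain int64 slices as a stand-in for Arrow records. The names cursor, cursorHeap, and mergeSorted are hypothetical and are not FrostDB or Arrow APIs.

```go
package main

import (
	"container/heap"
	"fmt"
)

// cursor tracks the read position within one already-sorted input part.
type cursor struct {
	part []int64
	pos  int
}

// cursorHeap is a min-heap of cursors ordered by their current value.
type cursorHeap []*cursor

func (h cursorHeap) Len() int           { return len(h) }
func (h cursorHeap) Less(i, j int) bool { return h[i].part[h[i].pos] < h[j].part[h[j].pos] }
func (h cursorHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *cursorHeap) Push(x any)        { *h = append(*h, x.(*cursor)) }
func (h *cursorHeap) Pop() any {
	old := *h
	n := len(old)
	c := old[n-1]
	*h = old[:n-1]
	return c
}

// mergeSorted performs a k-way merge of parts that are each sorted.
// The output is sorted without a separate sort pass.
func mergeSorted(parts ...[]int64) []int64 {
	h := &cursorHeap{}
	total := 0
	for _, p := range parts {
		if len(p) > 0 {
			*h = append(*h, &cursor{part: p})
			total += len(p)
		}
	}
	heap.Init(h)

	out := make([]int64, 0, total)
	for h.Len() > 0 {
		c := (*h)[0] // cursor holding the smallest current value
		out = append(out, c.part[c.pos])
		c.pos++
		if c.pos == len(c.part) {
			heap.Pop(h) // this part is exhausted
		} else {
			heap.Fix(h, 0) // restore heap order after advancing
		}
	}
	return out
}

func main() {
	merged := mergeSorted([]int64{1, 4, 9}, []int64{2, 3, 10}, []int64{5})
	fmt.Println(merged) // prints [1 2 3 4 5 9 10]
}
```

If the inputs were not sorted, the merged output would not be sorted either, and the extra sort step would still be needed.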

Dec 18 '23 09:12 asubiotto

@asubiotto can you expand a bit on the memory expectations for arrow vs parquet compaction?

I was always under the impression that parquet with compression gives better memory savings than arrow.

Dec 20 '23 04:12 gernest

Yes, this is why I'd be interested in getting some numbers so we are informed about the tradeoffs. Intuitively, dictionary encoding should go a long way. We've also been thinking about experimenting with run-end encoding in arrow.
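
As a rough illustration of the two encodings mentioned here, the sketch below applies dictionary encoding and run-end encoding to a small string column. It uses plain Go slices, not the Arrow library, and dictEncode and runEndEncode are hypothetical helper names. Actual savings depend on the data and would need to be measured.

```go
package main

import "fmt"

// dictEncode stores each distinct value once and replaces the column
// with int32 indices into that dictionary.
func dictEncode(col []string) (dict []string, indices []int32) {
	seen := make(map[string]int32)
	for _, v := range col {
		idx, ok := seen[v]
		if !ok {
			idx = int32(len(dict))
			seen[v] = idx
			dict = append(dict, v)
		}
		indices = append(indices, idx)
	}
	return dict, indices
}

// runEndEncode collapses consecutive equal values into one value per run.
// runEnds[i] is the exclusive logical index at which run i stops.
func runEndEncode(col []string) (values []string, runEnds []int32) {
	for i, v := range col {
		if i > 0 && v == col[i-1] {
			// Same value as the previous row: extend the current run.
			runEnds[len(runEnds)-1] = int32(i + 1)
			continue
		}
		values = append(values, v)
		runEnds = append(runEnds, int32(i+1))
	}
	return values, runEnds
}

func main() {
	col := []string{"cpu", "cpu", "cpu", "mem", "mem", "cpu"}

	dict, indices := dictEncode(col)
	fmt.Println(dict, indices) // prints [cpu mem] [0 0 0 1 1 0]

	values, runEnds := runEndEncode(col)
	fmt.Println(values, runEnds) // prints [cpu mem cpu] [3 5 6]
}
```

Dictionary encoding helps when a column has few distinct values, while run-end encoding helps when equal values are adjacent, which is more likely after the records are sorted.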

Dec 20 '23 08:12 asubiotto

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

Jan 20 '24 01:01 github-actions[bot]

I think it's still useful to keep this open.

Apr 15 '24 06:04 asubiotto
