
[Enhancement]: Compression job should process chunks in order of range_start

Open RobAtticus opened this issue 11 months ago • 3 comments

What type of enhancement is this?

User experience

What subsystems and features will be improved?

Compression

What does the enhancement do?

The compression job should process chunks in order of their range_start so that the experimental rollup functionality is more effective. Without a defined order, chunks can be processed in a sequence that prevents full rollups: the job may start rolling up a chunk "later" in the timeline, then go back in the timeline, but by then the partially rolled-up chunk is too large to roll up into the one further back.
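To illustrate why the order matters, here is a toy simulation. It is only a sketch under simplifying assumptions and does not reflect TimescaleDB internals: chunks are integer ranges, a chunk being compressed is merged into an already-compressed chunk that ends exactly where it starts, and the merge is allowed only if the combined width stays within a hypothetical `max_width` limit. The names `simulate_rollup` and `max_width` are made up for this example.

```python
def simulate_rollup(chunks, order, max_width):
    """Toy model of rollup during compression (not TimescaleDB's actual code).

    chunks:    list of (range_start, range_end) tuples
    order:     indexes into `chunks`, in the order the job processes them
    max_width: hypothetical upper bound on the width of a rolled-up chunk
    """
    compressed = {}  # range_start -> range_end of compressed chunks
    for idx in order:
        start, end = chunks[idx]
        # Look for a compressed chunk that ends exactly where this one starts.
        pred_start = next((s for s, e in compressed.items() if e == start), None)
        if pred_start is not None and end - pred_start <= max_width:
            # Roll this chunk up into its predecessor.
            compressed[pred_start] = end
        else:
            # No suitable predecessor: compress as a standalone chunk.
            compressed[start] = end
    return sorted(compressed.items())


chunks = [(0, 1), (1, 2), (2, 3), (3, 4)]

# Processing in range_start order yields a single full rollup.
in_order = simulate_rollup(chunks, order=[0, 1, 2, 3], max_width=4)

# Starting "later" in the timeline and then going back leaves two chunks,
# because the already rolled-up later chunk is never merged backwards.
skipping = simulate_rollup(chunks, order=[2, 3, 0, 1], max_width=4)

print(in_order)
print(skipping)
```

In this model, the in-order run ends with one chunk covering the whole range, while the run that skips around ends with two half-size chunks.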

Implementation challenges

No response

RobAtticus avatar Mar 08 '24 20:03 RobAtticus

@RobAtticus the current show_chunks logic sorts the returned chunks by their hypertable_id and table_id numbering values. For append-only data insertions, that order should typically be in sync with the time ranges.

We could return the chunks in dimension slice order instead, though.
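As a rough illustration of the difference between the two orderings (a sketch with made-up data, not the show_chunks implementation), consider a chunk created late by a backfill. The `Chunk` type and its field names are hypothetical.

```python
from typing import NamedTuple


class Chunk(NamedTuple):
    chunk_id: int      # hypothetical creation-order identifier
    range_start: int   # start of the chunk's time range
    range_end: int     # end of the chunk's time range


chunks = [
    Chunk(chunk_id=1, range_start=10, range_end=20),
    Chunk(chunk_id=2, range_start=20, range_end=30),
    Chunk(chunk_id=3, range_start=30, range_end=40),
    # Created last by a backfill, but covers the earliest time range.
    Chunk(chunk_id=4, range_start=0, range_end=10),
]

# ID order matches time order only while inserts are append-only.
by_id = sorted(chunks, key=lambda c: c.chunk_id)

# Sorting on the time range itself is robust to backfills.
by_range = sorted(chunks, key=lambda c: c.range_start)

print([c.chunk_id for c in by_id])
print([c.chunk_id for c in by_range])
```

With ID ordering the backfilled chunk comes last; with range ordering it comes first, which is the order a rollup would want.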

nikkhils avatar Mar 25 '24 08:03 nikkhils

Is show_chunks used as part of the compression policy job? What I've found is that the compression job will sometimes skip around in the set of chunks to be compressed, which leads to inefficient rollups. That is what this issue is about, although I also think show_chunks should enforce dimension slice order rather than rely on the IDs (given backfills, untiering a chunk, etc.).

RobAtticus avatar Mar 25 '24 15:03 RobAtticus

@RobAtticus yeah, show_chunks is used in the compression policy logic.

Yeah, maybe dimension_slice-based sorting is the way to go. We will also need documentation changes if we go this route.

nikkhils avatar Mar 26 '24 14:03 nikkhils