
Make Incremental CAgg Refresh Policy default

Open · fabriziomello opened this pull request 10 months ago • 1 comment

fabriziomello avatar Jun 13 '25 20:06 fabriziomello

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 81.99%. Comparing base (2f9c02c) to head (90dc70b). Report is 31 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8265      +/-   ##
==========================================
- Coverage   82.23%   81.99%   -0.25%     
==========================================
  Files         257      257              
  Lines       48776    48751      -25     
  Branches    12300    12297       -3     
==========================================
- Hits        40111    39973     -138     
- Misses       3745     3929     +184     
+ Partials     4920     4849      -71     

codecov[bot] avatar Jun 13 '25 21:06 codecov[bot]

> There is always an overhead for each batch to process the batch, so with a batch size of 2 there will be more work needed to commit the same number of buckets. Do we have any benchmarks on what the impact of splitting it up into smaller batches is?

Did some local benchmarks /cc @philkra:

| buckets_per_batch | total_time      | vs. old default |
|-------------------|-----------------|-----------------|
| 0 (old default)   | 00:00:57.838159 | baseline        |
| 2 (new default)   | 00:01:53.10466  | 1.96x           |
| 5                 | 00:01:43.674382 | 1.79x           |
| 10                | 00:01:38.617395 | 1.71x           |
| 20                | 00:01:29.420924 | 1.55x           |
| 100               | 00:01:04.206633 | 1.11x           |
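The slowdown factors in the last column follow directly from the raw timings. A small Python sketch to reproduce them (the timings are taken from the table above; nothing else is assumed):

```python
from datetime import timedelta

# Timings copied from the benchmark table (buckets_per_batch -> total_time).
timings = {
    0: timedelta(seconds=57, microseconds=838159),
    2: timedelta(minutes=1, seconds=53, microseconds=104660),
    5: timedelta(minutes=1, seconds=43, microseconds=674382),
    10: timedelta(minutes=1, seconds=38, microseconds=617395),
    20: timedelta(minutes=1, seconds=29, microseconds=420924),
    100: timedelta(minutes=1, seconds=4, microseconds=206633),
}

baseline = timings[0].total_seconds()
# Relative slowdown of each batched run versus the old (unbatched) default.
slowdown = {b: round(t.total_seconds() / baseline, 2)
            for b, t in timings.items() if b != 0}
print(slowdown)  # {2: 1.96, 5: 1.79, 10: 1.71, 20: 1.55, 100: 1.11}
```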

fabriziomello avatar Jun 26 '25 20:06 fabriziomello

Hmm... this shows that commit time with the new default would be almost twice the old default. I am not sure that 0 (the old default) is a good default, but picking a larger default batch size might be better: users will typically start off with low traffic, and hence less contention on table locks, while smaller batch sizes are more useful under high traffic where locks are more contested.
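The trade-off described above can be sketched with a toy cost model: a fixed scan cost plus a fixed per-batch setup/commit overhead. All constants here are hypothetical placeholders, not numbers measured in this PR:

```python
import math

# Toy model of the overhead argument: refreshing n_buckets in batches of
# batch_size costs a fixed scan time plus a fixed overhead per batch.
# scan_s and per_batch_s are illustrative placeholders only.
def modeled_refresh_seconds(n_buckets: int, batch_size: int,
                            scan_s: float = 58.0,
                            per_batch_s: float = 0.11) -> float:
    n_batches = math.ceil(n_buckets / batch_size)
    return scan_s + n_batches * per_batch_s

# Larger batches amortize the per-batch overhead, so modeled time shrinks.
times = {b: modeled_refresh_seconds(1000, b) for b in (2, 5, 10, 20, 100)}
print(times)
```

Under this model the cost of small batches grows with the number of batches, which matches the shape of the benchmark numbers above, while the right default depends on how contested the locks are.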

mkindahl avatar Jun 30 '25 06:06 mkindahl

New benchmarks with `work_mem=8MB` (2x the default of 4MB).

| buckets_per_batch | total_time      | increase | temp files | temp bytes | avg temp bytes |
|-------------------|-----------------|----------|------------|------------|----------------|
| 0                 | 00:01:03.968151 | 1        | 1          | 101 MB     | 101 MB         |
| 1                 | 00:01:45.048506 | 1.64     | 0          | 0 bytes    | 0 bytes        |
| 2                 | 00:01:44.569175 | 1.63     | 333        | 1844 MB    | 5670 kB        |
| 5                 | 00:01:36.170651 | 1.50     | 111        | 1541 MB    | 14 MB          |
| 10                | 00:01:33.843496 | 1.47     | 44         | 1227 MB    | 28 MB          |
| 20                | 00:01:32.31307  | 1.44     | 11         | 580 MB     | 53 MB          |
| 100               | 00:01:12.87014  | 1.14     | 7          | 76 MB      | 11 MB          |
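As a consistency check on the temp-file columns, the avg temp bytes figure is just temp bytes divided by the temp file count; a small sketch using the numbers from the table above:

```python
# Consistency check for the temp-file columns: avg temp bytes should equal
# temp bytes / temp files. Figures are taken from the work_mem=8MB table.
rows = {
    # buckets_per_batch: (temp_files, temp_bytes_kb)
    2: (333, 1844 * 1024),
    5: (111, 1541 * 1024),
    10: (44, 1227 * 1024),
    20: (11, 580 * 1024),
    100: (7, 76 * 1024),
}
for batch, (files, total_kb) in rows.items():
    avg_mb = total_kb / files / 1024
    print(f"{batch}: {avg_mb:.1f} MB avg per temp file")
```

The averages reproduce the table's rounded values (e.g. ~5.5 MB, i.e. 5670 kB, for batch size 2). The batch-size-1 run reports no temp files at all, presumably because a single bucket's aggregation fits within the enlarged `work_mem`.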

fabriziomello avatar Jun 30 '25 18:06 fabriziomello