Make Incremental CAgg Refresh Policy default
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 81.99%. Comparing base (2f9c02c) to head (90dc70b). Report is 31 commits behind head on main.
Additional details and impacted files
```diff
@@            Coverage Diff             @@
##             main    #8265      +/-   ##
==========================================
- Coverage   82.23%   81.99%   -0.25%
==========================================
  Files         257      257
  Lines       48776    48751      -25
  Branches    12300    12297       -3
==========================================
- Hits        40111    39973     -138
- Misses       3745     3929     +184
+ Partials     4920     4849      -71
```
There is always some overhead to process each batch, so with a batch size of 2 there will be more work needed to commit the same number of buckets. Do we have any benchmarks on the impact of splitting the refresh up into smaller batches?
Did some local benchmarks /cc @philkra:
| buckets_per_batch | total_time | increase |
|---|---|---|
| 0 (old default) | 00:00:57.838159 | |
| 2 (new default) | 00:01:53.10466 | 1.96x |
| 5 | 00:01:43.674382 | 1.79x |
| 10 | 00:01:38.617395 | 1.71x |
| 20 | 00:01:29.420924 | 1.55x |
| 100 | 00:01:04.206633 | 1.11x |
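As a sanity check, the relative-increase column can be recomputed from the raw timings in the table above (a minimal sketch; timings copied verbatim from the benchmark):

```python
from datetime import timedelta

# Total refresh time per buckets_per_batch setting, from the benchmark table.
timings = {
    0: timedelta(seconds=57.838159),
    2: timedelta(minutes=1, seconds=53.10466),
    5: timedelta(minutes=1, seconds=43.674382),
    10: timedelta(minutes=1, seconds=38.617395),
    20: timedelta(minutes=1, seconds=29.420924),
    100: timedelta(minutes=1, seconds=4.206633),
}

baseline = timings[0].total_seconds()
for batch_size, t in timings.items():
    # Slowdown factor relative to the unbatched (old default) run.
    factor = t.total_seconds() / baseline
    print(f"buckets_per_batch={batch_size:>3}: {factor:.2f}x")
```

This reproduces the 1.96x / 1.79x / 1.71x / 1.55x / 1.11x figures reported above.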
Hmm... this shows that refresh time with the new default would be almost twice the old default. I am not sure that 0 (the old default) is a good default either, but picking a larger batch size as the default might be better: users typically start off with low traffic, hence less congestion on table locks, while smaller batch sizes are most useful under high traffic where locks are more contested.
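If a larger default were chosen, users facing heavy lock contention could still opt in to finer-grained batching per policy. A hypothetical sketch, assuming the `buckets_per_batch` parameter on `add_continuous_aggregate_policy` discussed in this PR (the aggregate name, offsets, and batch value are illustrative only):

```sql
-- Illustrative only: refresh in batches of 20 buckets per transaction,
-- trading some per-batch overhead for shorter lock hold times.
SELECT add_continuous_aggregate_policy(
    'metrics_hourly',                        -- hypothetical continuous aggregate
    start_offset      => INTERVAL '1 month',
    end_offset        => INTERVAL '1 hour',
    schedule_interval => INTERVAL '1 hour',
    buckets_per_batch => 20
);
```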
New benchmarks with work_mem=8MB (2x the default 4MB size).
| buckets_per_batch | total_time | increase | temp files | temp bytes | avg temp bytes |
|---|---|---|---|---|---|
| 0 | 00:01:03.968151 | 1 | 101 MB | 101 MB | |
| 1 | 00:01:45.048506 | 1.64 | 0 | 0 bytes | 0 bytes |
| 2 | 00:01:44.569175 | 1.63 | 333 | 1844 MB | 5670 kB |
| 5 | 00:01:36.170651 | 1.50 | 111 | 1541 MB | 14 MB |
| 10 | 00:01:33.843496 | 1.47 | 44 | 1227 MB | 28 MB |
| 20 | 00:01:32.31307 | 1.44 | 11 | 580 MB | 53 MB |
| 100 | 00:01:12.87014 | 1.14 | 7 | 76 MB | 11 MB |
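The "avg temp bytes" column is consistent with temp bytes divided by the temp file count; a quick check over the rows above where all three values are present (values copied from the table, treating 1 MB as 1024 kB):

```python
# (temp files, total temp MB) per buckets_per_batch, from the work_mem=8MB table.
rows = {
    2: (333, 1844),
    5: (111, 1541),
    10: (44, 1227),
    20: (11, 580),
    100: (7, 76),
}

for batch_size, (files, total_mb) in rows.items():
    # Average size of a single temp file for this batch size.
    avg_mb = total_mb / files
    print(f"buckets_per_batch={batch_size:>3}: avg temp file ~ {avg_mb:.1f} MB")
```

For example, 1844 MB across 333 files is about 5.5 MB (5670 kB) per file, matching the reported average; larger batches spill fewer, larger temp files.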