More control over how Continuous Aggregate refreshes are scheduled

Open tyhoff opened this issue 2 years ago • 2 comments

We have one hypertable with three continuous aggregates defined on it. We've noticed that when all three aggregate refreshes happen to run at once, the refresh process takes ~30-40 minutes and CPU usage nears 80-90%. When they happen to run at different times, each takes 3-4 minutes and CPU usage stays at a happier 40% max.

I imagine there could be a global setting that prevents more than N continuous aggregate refreshes from running at the same time. The only alternative I currently see is to create our own "Job" that refreshes all three continuous aggregates sequentially, but others must be running into this problem as well.
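For illustration, the sequential workaround could look roughly like the Python sketch below. It is only a sketch under assumptions: `refresh_sequentially`, the `execute` callable, and the aggregate names are hypothetical, and the default `NULL` window bounds ask `refresh_continuous_aggregate` to refresh the whole aggregate, which may be more than you want. The wiring of `execute` to a real database connection is left out, and the existing per-aggregate refresh policies would presumably need to be removed so they do not run alongside it.

```python
from typing import Callable, Iterable, List


def refresh_sequentially(
    execute: Callable[[str], None],
    caggs: Iterable[str],
    window_start: str = "NULL",
    window_end: str = "NULL",
) -> List[str]:
    """Refresh each continuous aggregate one at a time, in the given order.

    `execute` is a hypothetical callable that runs one SQL statement and
    returns only after it has finished. The names in `caggs` are pasted
    into the SQL text, so they must come from a trusted source.
    """
    statements = []
    for name in caggs:
        sql = (
            f"CALL refresh_continuous_aggregate("
            f"'{name}', {window_start}, {window_end});"
        )
        execute(sql)  # the next refresh starts only after this one returns
        statements.append(sql)
    return statements


if __name__ == "__main__":
    # Dry run: print the statements instead of sending them to a database.
    # The aggregate names here are made up for the example.
    refresh_sequentially(print, ["cagg_hourly", "cagg_daily", "cagg_weekly"])
```

Because each call blocks until the previous refresh has completed, the three refreshes can never overlap, at the cost of the last aggregate being refreshed later than it would be under its own schedule.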

Slack Context: https://timescaledb.slack.com/archives/C4GT3N90X/p1627651902008100

tyhoff, Aug 12 '21 21:08

Hi @tyhoff! If you lower the number of background workers, it should not be possible for multiple jobs to start at the same time. It will generate warnings in the log, though.

mkindahl, Aug 17 '21 12:08

I think this is kinda bug-ish; ideally we shouldn't require the user to work around our suboptimal scheduling. I imagine the slowdown is due to IO concurrency, so we could have a setting like "max simultaneous cagg updates per tablespace" and schedule accordingly.
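As a rough illustration of what such a cap means (this is not an existing TimescaleDB setting, and not how the scheduler is implemented), the Python sketch below lets any number of jobs become due at once but allows only `max_concurrent` of them to actually run at a time. `run_with_limit` and its parameters are hypothetical names; a per-tablespace variant would presumably keep one such gate per tablespace.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def run_with_limit(
    jobs: Iterable[Callable[[], object]], max_concurrent: int
) -> List[object]:
    """Start every job, but let at most `max_concurrent` execute at once."""
    gate = threading.Semaphore(max_concurrent)
    job_list = list(jobs)

    def guarded(job: Callable[[], object]) -> object:
        # A job that is "due" waits here until a slot is free.
        with gate:
            return job()

    # One thread per job, so the semaphore is the only thing limiting
    # how many jobs run concurrently.
    with ThreadPoolExecutor(max_workers=max(1, len(job_list))) as pool:
        return list(pool.map(guarded, job_list))
```

With `max_concurrent=1` this degenerates into running the jobs one at a time, although not necessarily in the order they were submitted.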

akuzm, Oct 28 '21 10:10