timescaledb
More control over how Continuous Aggregate refreshes are scheduled
We have one hypertable with three continuous aggregates defined on it. When all three aggregate refreshes happen to run at once, the refresh process takes ~30-40 minutes and CPU usage nears 80-90%. When they run at different times, each takes 3-4 minutes and CPU usage peaks at a happier 40%.
I imagine there could be a global setting that prevents more than N continuous aggregate refreshes from running at the same time. The only alternative I currently see is to create our own "Job" that refreshes all three continuous aggregates sequentially, but others must be running into this problem as well.
Slack Context: https://timescaledb.slack.com/archives/C4GT3N90X/p1627651902008100
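For anyone wanting to try the sequential workaround from outside the database (for example from cron, with the built-in refresh policies removed), here is a minimal sketch. It assumes the three aggregates are named metrics_hourly, metrics_daily and metrics_weekly (hypothetical names) and that the caller supplies an execute(sql, params) callable, such as a psycopg2 cursor's execute method on an autocommit connection. NULL bounds are assumed to mean an open-ended refresh window, as the refresh_continuous_aggregate documentation describes; passing explicit bounds is probably wiser on a large hypertable.

```python
from datetime import datetime
from typing import Any, Callable, Iterable, List, Optional, Tuple

# Hypothetical continuous aggregate names; replace with your own.
CAGGS = ["metrics_hourly", "metrics_daily", "metrics_weekly"]

REFRESH_SQL = "CALL refresh_continuous_aggregate(%s, %s, %s)"


def refresh_sequentially(
    execute: Callable[[str, Tuple[Any, ...]], Any],
    caggs: Iterable[str],
    window_start: Optional[datetime] = None,
    window_end: Optional[datetime] = None,
) -> List[str]:
    """Refresh each continuous aggregate one after another.

    `execute` blocks until the statement finishes, so the next refresh
    only starts once the previous one is done. Returns the names that
    were refreshed, in order.
    """
    refreshed: List[str] = []
    for name in caggs:
        execute(REFRESH_SQL, (name, window_start, window_end))
        refreshed.append(name)
    return refreshed


# Example wiring (assumes psycopg2 is installed and DSN is defined):
#
#   import psycopg2
#   conn = psycopg2.connect(DSN)
#   conn.autocommit = True  # run each CALL as its own top-level statement
#   with conn.cursor() as cur:
#       refresh_sequentially(cur.execute, CAGGS)
```

Keeping the database call behind a plain callable makes the ordering logic easy to check without a live server. The same loop could in principle live inside a TimescaleDB user-defined job instead, which is the "Job" alternative mentioned above.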
Hi @tyhoff! If you lower the number of background workers (the timescaledb.max_background_workers setting), multiple jobs should no longer be able to start at the same time. It will generate warnings in the log, though.
I think this is kinda bug-ish, ideally we shouldn't require the user to work around our suboptimal scheduling. I imagine the slowdown is due to IO concurrency, so we could have a setting like "max simultaneous cagg updates per tablespace" and schedule accordingly.
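To make the proposed setting concrete, here is a rough sketch of the scheduling idea. It is not how the TimescaleDB scheduler is implemented, only an illustration using one semaphore per tablespace. TablespaceLimiter, run_all and the max_per_tablespace parameter are hypothetical names invented for this example.

```python
import threading
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple


class TablespaceLimiter:
    """Caps how many refreshes may run at once on each tablespace."""

    def __init__(self, max_per_tablespace: int) -> None:
        self._max = max_per_tablespace
        self._lock = threading.Lock()
        self._semaphores: Dict[str, threading.BoundedSemaphore] = {}
        self._running: Dict[str, int] = defaultdict(int)
        # Highest concurrency observed per tablespace, for inspection.
        self.peak: Dict[str, int] = defaultdict(int)

    def _semaphore(self, tablespace: str) -> threading.BoundedSemaphore:
        # Create the semaphore lazily, once per tablespace.
        with self._lock:
            if tablespace not in self._semaphores:
                self._semaphores[tablespace] = threading.BoundedSemaphore(self._max)
            return self._semaphores[tablespace]

    def run(self, tablespace: str, refresh: Callable[[], None]) -> None:
        # Blocks here while the tablespace already has the maximum
        # number of refreshes in flight.
        with self._semaphore(tablespace):
            with self._lock:
                self._running[tablespace] += 1
                self.peak[tablespace] = max(
                    self.peak[tablespace], self._running[tablespace]
                )
            try:
                refresh()
            finally:
                with self._lock:
                    self._running[tablespace] -= 1


def run_all(
    jobs: List[Tuple[str, Callable[[], None]]],
    limiter: TablespaceLimiter,
    workers: int = 8,
) -> None:
    """Submit every (tablespace, refresh) job; the limiter throttles them."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(limiter.run, ts, fn) for ts, fn in jobs]
        for future in futures:
            future.result()  # re-raise any refresh error
```

With max_per_tablespace set to 1, three refreshes against the same tablespace would run one at a time even if all three become due together, while refreshes on other tablespaces proceed independently.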