timescaledb
[Enhancement]: Allow partitioning in Continuous-Aggregates and refreshing specific partitions
What type of enhancement is this?
API improvement
What subsystems and features will be improved?
Continuous aggregate, Partitioning
What does the enhancement do?
In our system, we receive data from multiple devices (~200), and each device has a variable amount of historical data stored in a hypertable. However, the data from these devices is fairly often wrong, and corrections/updates are then made in the base hypertable.
Whenever this happens, we need to refresh all our continuous aggregates to pick up those updates. The only way to do this currently is to call refresh_continuous_aggregate over a time interval that covers the history of these changes. However, doing so also recomputes every other device over the same interval, which makes the process take much longer than it needs to.
Example of a Continuous-aggregate
CREATE MATERIALIZED VIEW active_users
WITH (timescaledb.continuous, timescaledb.materialized_only = true)
as
select
deviceid,
time_bucket('1 day', bucket) as date,
manufactorid,
countrycode,
count(distinct userid)
from devices_usage
group by
deviceid,
date,
manufactorid,
countrycode
order by date desc
WITH NO data;
Here, for example, deviceid = 'abc' needs to be refreshed because of changes in devices_usage.
Currently we would do this:
CALL refresh_continuous_aggregate('active_users', '2018-01-01', '2023-11-01');
With this feature implemented, we would hopefully be able to do something like this:
CALL refresh_continuous_aggregate('active_users', '2018-01-01', '2023-11-01', 'abc');
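Until something like this exists, a partial workaround is to shrink the refresh window to the time span the affected device actually covers. The sketch below assumes that bucket is the time column of devices_usage (as the view definition suggests) and that active_users is the continuous aggregate defined above. The dates in the CALL are placeholders, not real results. This still recomputes every device inside the narrowed window, so it only helps when the affected device has a short history.

```sql
-- Step 1: find the time span the affected device covers in the base hypertable.
-- Assumption: "bucket" is the time column of devices_usage.
SELECT min(bucket) AS window_start,
       max(bucket) AS window_end
FROM devices_usage
WHERE deviceid = 'abc';

-- Step 2: refresh only that span of the continuous aggregate.
-- Replace the placeholder dates with the values returned by step 1,
-- widened outward to whole-day boundaries so the first and last
-- 1-day buckets fall completely inside the window.
CALL refresh_continuous_aggregate('active_users', '2021-03-01', '2021-06-02');
```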
Implementation challenges
Since hypertables already support partitioning, and continuous aggregates are also stored as hypertables, it would be interesting if we could add partitions to continuous aggregates and extend the refresh_continuous_aggregate API to refresh a specific partition.
In my team, we have a similar use case, and such a feature would be a great improvement for TimescaleDB.
It seems to be a duplicate of #6008 ([Enhancement]: Refresh continuous aggregate with filter) as well as #6148 (Continuous Aggregates: support for secondary dimension)