timescaledb-toolkit
timescaledb-toolkit copied to clipboard
Workaround for continuous aggregates not supporting time_bucket_gapfill
Relevant system information:
- PostgreSQL 14.6 (Ubuntu 14.6-1.pgdg22.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0, 64-bit
- TimescaleDB Toolkit version: 2.8.1
- Installation method: Timescale Cloud
Describe the bug
I receive the error message "Invalid continuous aggregate view" when trying to create a continuous aggregate that uses time_bucket_gapfill.
To Reproduce
Table Definition:
CREATE TABLE
public.usage (
session_id bigint NOT NULL,
sim_profile_id bigint NOT NULL,
sim_card_id character varying(43) NULL,
org_id integer NULL,
event_time timestamp without time zone NOT NULL,
is_connected smallint NOT NULL,
n_seconds double precision NOT NULL,
zero_byte_session boolean NOT NULL,
bytes bigint NOT NULL,
bytes_per_second double precision NOT NULL,
tadig character varying(5) NULL,
country_iso3 character varying(3) NULL,
rat character varying(10) NULL,
lac bigint NULL,
cid bigint NULL
);
Query I would like to store as a continuous aggregate view:
WITH
t1 as(
SELECT
sim_card_id,
time_bucket_gapfill('10 minute', event_time) AS time,
locf(last(bytes_per_second, event_time)) as locf_value,
last(bytes_per_second, event_time) as indicator_empty_bucket,
time_weight('locf', event_time, bytes_per_second) AS bps
FROM public.usage
Where event_time BETWEEN '2021-12-01 00:00' AND '2022-01-01 00:00'
GROUP BY 1,2),
t2 as(
SELECT sim_card_id, time, time_weight('locf', time, locf_value) as bps
FROM t1
WHERE indicator_empty_bucket IS NULL
GROUP BY 1, 2),
t3 as(
SELECT sim_card_id, time, bps FROM t1 WHERE indicator_empty_bucket IS NOT NULL
UNION ALL
SELECT sim_card_id, time, bps FROM t2)
SELECT
sim_card_id,
time,
toolkit_experimental.interpolated_integral(
bps,
time,
'10 minute',
LAG(bps) OVER (Partition by sim_card_id ORDER BY time),
LEAD(bps) OVER (Partition by sim_card_id ORDER BY time),
'seconds'
) as bytes
FROM t3
Additional context Related issue with this query: https://github.com/timescale/timescaledb-toolkit/issues/548
Thanks @tlarrue , this is expected behavior, but it'd be great to have gapfill support in continuous aggregates. I'm going to add a workaround here soon...it doesn't fix everything, but it will allow you to use continuous aggregates and we'll work on something better in the future.
Hey @davidkohn88, any updates on your side? :smile:
The workaround unfortunately ended up being more complicated than we'd hoped and introduced some really hairy corner cases. At this point I'm afraid we don't have a good alternative to waiting for the fix to the underlying issue: https://github.com/timescale/timescaledb/issues/1324