timescaledb-toolkit
timescaledb-toolkit copied to clipboard
move time_bucket_range to core
Problem Statement
Currently when bucketing time-intervals to ranges using time_bucket(), the resulting column store the start of the range, and the fact that this represents a time range is implicit, and has to be derived from the query. It would be nice to have a version that returns a range type, so it's more obvious what timespan is being covered.
Describe the solution you'd like
A time_bucket_range function that returns a range of whatever type is input that spans the time_bucket
-- if the original time_bucket() would return something like
SELECT time_bucket(time, '1 hour') FROM data;
time_bucket
------------------
2020-01-01 01:00
2020-01-01 02:00
2020-01-01 04:00
2020-01-01 06:00
2020-01-01 07:00
-- then time_bucket_range() could return something like
SELECT time_bucket_range(time, '1 hour') FROM data;
time_bucket_range
--------------------------------------
[2020-01-01 01:00, 2020-01-01 02:00)
[2020-01-01 02:00, 2020-01-01 03:00)
[2020-01-01 04:00, 2020-01-01 05:00)
[2020-01-01 06:00, 2020-01-01 07:00)
[2020-01-01 07:00, 2020-01-01 08:00)
create or replace function time_bucket_range( bucket_width interval, ts timestamptz) RETURNS tstzrange as $$
select tstzrange(time_bucket(bucket_width, ts), time_bucket(bucket_width, ts + bucket_width), '[)');
$$
LANGUAGE SQL IMMUTABLE PARALLEL SAFE;
Should do the trick for the simple case...
This needs communication with the core DB team; if we ever want to do the timebucket optimizations for this function as well, we'll really want it to be there.
@davidkohn88 would you take point on this?
This could really help in cases where you dynamically choose the aggregate (e.g. raw data, day, month) to not distribute that implicit interval all over the place if we could return ranges as suggested.