timescaledb-toolkit
timescaledb-toolkit copied to clipboard
unexpected`delta`/`counter_agg` calculation on some `time_bucket` intervals
Relevant system information:
- OS: timescale/timescaledb-ha:pg13.4-ts2.4-latest
- PostgreSQL version (output of
SELECT version();
): PostgreSQL 13.4 (Debian 13.4-1.pgdg100+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit - TimescaleDB Toolkit version (output of
\dx timescaledb_toolkit
inpsql
): 1.1 - Installation method: docker
Describe the bug
Using toolkit_experimental.delta
and counter_agg
with time_bucket
with interval equal to 1 * sample interval or 2 * sample interval are incorrectly calculated:
- when
time_bucket
interval is equal 1 * sample interval it returns 0 - for 2 * sample interval it return smaller then expected values
To Reproduce
DROP TABLE IF EXISTS "octets";
CREATE TABLE IF NOT EXISTS "octets" (
"time" timestamp,
"in_octets" int8,
"out_octets" int8
);
SELECT create_hypertable("octets", "time");
INSERT INTO "octets" ("time", "in_octets", "out_octets") VALUES
('2021-09-09 18:20:00', 111, 221),
('2021-09-09 18:15:00', 101, 201),
('2021-09-09 18:10:00', 91, 181),
('2021-09-09 18:05:00', 81, 161),
('2021-09-09 18:00:00', 71, 141),
('2021-09-09 17:55:00', 61, 121),
('2021-09-09 17:50:00', 51, 101),
('2021-09-09 17:45:00', 41, 81),
('2021-09-09 17:40:00', 31, 61),
('2021-09-09 17:35:00', 21, 41),
('2021-09-09 17:30:00', 11, 21),
('2021-09-09 17:25:00', 1, 1);
Expected behavior first query:
SELECT
time_bucket ('5m'::interval, time) AS time,
toolkit_experimental.delta (toolkit_experimental.counter_agg (time, in_octets)) AS in,
toolkit_experimental.delta (toolkit_experimental.counter_agg (time, out_octets)) AS out
FROM octets
WHERE time BETWEEN '2021-09-09 17:25:00' AND '2021-09-09 18:20:00'
GROUP BY time_bucket('5 minutes'::interval, time)
ORDER BY time DESC;
should return:
2021-09-09 18:20:00 10 20
2021-09-09 18:15:00 10 20
2021-09-09 18:10:00 10 20
2021-09-09 18:05:00 10 20
2021-09-09 18:00:00 10 20
2021-09-09 17:55:00 10 20
2021-09-09 17:50:00 10 20
2021-09-09 17:45:00 10 20
2021-09-09 17:40:00 10 20
2021-09-09 17:35:00 10 20
2021-09-09 17:30:00 10 20
2021-09-09 17:25:00 0 0
second query:
SELECT
time_bucket ('10m'::interval, time) AS time,
toolkit_experimental.delta (toolkit_experimental.counter_agg (time, in_octets)) AS in,
toolkit_experimental.delta (toolkit_experimental.counter_agg (time, out_octets)) AS out
FROM octets
WHERE time BETWEEN '2021-09-09 17:25:00' AND '2021-09-09 18:20:00'
GROUP BY time_bucket ('10m'::interval, time)
ORDER BY time DESC;
should return:
2021-09-09 18:20:00 0 0
2021-09-09 18:10:00 20 40
2021-09-09 18:00:00 20 40
2021-09-09 17:50:00 20 40
2021-09-09 17:40:00 20 40
2021-09-09 17:30:00 20 40
2021-09-09 17:20:00 0 0
Actual behavior first query returns:
2021-09-09 18:20:00 0 0
2021-09-09 18:15:00 0 0
2021-09-09 18:10:00 0 0
2021-09-09 18:05:00 0 0
2021-09-09 18:00:00 0 0
2021-09-09 17:55:00 0 0
2021-09-09 17:50:00 0 0
2021-09-09 17:45:00 0 0
2021-09-09 17:40:00 0 0
2021-09-09 17:35:00 0 0
2021-09-09 17:30:00 0 0
2021-09-09 17:25:00 0 0
second query returns:
2021-09-09 18:20:00 0 0
2021-09-09 18:10:00 10 20
2021-09-09 18:00:00 10 20
2021-09-09 17:50:00 10 20
2021-09-09 17:40:00 10 20
2021-09-09 17:30:00 10 20
2021-09-09 17:20:00 0 0
This is the expected behavior for now, when you have a single value the delta within the bucket is zero. (or possibly it would be better to make it NULL? would that be preferred behavior?) With that said, there is an enhancement that we're looking into to use a window function so we can carry values forward / backward and then extend ranges to the end of the bucket. I'm going to open an issue or discussion around these sorts of lookahead/lookback functionality. Still we'd probably need to use some sort of extrapolated delta in order to make this work correctly...