timescaledb-toolkit
timescaledb-toolkit copied to clipboard
Wrong result when using duration_in with state_agg functions with small query windows
What type of bug is this?
Incorrect result
What subsystems and features are affected?
Query executor
What happened?
The issue stems from our pattern where we have entities that can be in a certain states for hours, and queries that want to know how much a certain entity was in a state for minutes
When using the experimental functions like duration_in and state_agg to get the duration of when a timescale table is in a certain state.
It works great if one is not narrowing down the query window, or filtering to only look between certain periods, this is probably due to it not being any events in the given period I am querying in.
I have added a diagram of explaining some issues im seeing
SELECT toolkit_experimental.duration_in('state 2', toolkit_experimental.state_agg(time, state)) FROM machines WHERE machine_id = '123'; this works fine, as there is no time range constraints
However
If asking for how long state-1 was active in the period between query window-1 one will get 0 If asking for how long state-0 was active in period between query window-2 one will get 0

Link to slack thread originally discussing the problem https://timescaledb.slack.com/archives/C4GT3N90X/p1669298851582689
TimescaleDB version affected
2.8.1
PostgreSQL version used
14.6
What operating system did you use?
Mac OS X 10.5 ARM
What installation method did you use?
Docker
What platform did you run on?
Timescale Cloud
Relevant log output and stack trace
No response
How can we reproduce the bug?
Use toolkit_experimental.duration_in toolkit_experimental.state_agg where the original query is not seeing any "states" or might not see the initial state.
I think this is not a bug, but an enhancement request, with some potential workarounds in the meantime. I'll post some workarounds and also I think there's some possible enhancements that would help this.
Any update on this? I feel like this is kind of the main point of state agg. I feel like interpolated_duration_in solution for this
@davidkohn88 any workararounds as this quite related to https://github.com/timescale/timescaledb/issues/1324
I've understood state_agg only calculates deltas within the bucket. So original post makes sense. However seems to me that even if state 0 is switched on at time 0 and there's never a change to another state in the bucket - duration_in will report 0 although it has been the whole bucket in that state.