timescaledb-toolkit icon indicating copy to clipboard operation
timescaledb-toolkit copied to clipboard

Wrong result when using duration_in with state_agg functions with small query windows

Open larseen opened this issue 2 years ago • 3 comments
trafficstars

What type of bug is this?

Incorrect result

What subsystems and features are affected?

Query executor

What happened?

The issue stems from our pattern where we have entities that can be in a certain states for hours, and queries that want to know how much a certain entity was in a state for minutes

When using the experimental functions like duration_in and state_agg to get the duration of when a timescale table is in a certain state.

It works great if one is not narrowing down the query window, or filtering to only look between certain periods, this is probably due to it not being any events in the given period I am querying in.

I have added a diagram of explaining some issues im seeing

SELECT toolkit_experimental.duration_in('state 2', toolkit_experimental.state_agg(time, state)) FROM machines WHERE machine_id = '123'; this works fine, as there is no time range constraints

However

If asking for how long state-1 was active in the period between query window-1 one will get 0 If asking for how long state-0 was active in period between query window-2 one will get 0

duration-in (1)

Link to slack thread originally discussing the problem https://timescaledb.slack.com/archives/C4GT3N90X/p1669298851582689

TimescaleDB version affected

2.8.1

PostgreSQL version used

14.6

What operating system did you use?

Mac OS X 10.5 ARM

What installation method did you use?

Docker

What platform did you run on?

Timescale Cloud

Relevant log output and stack trace

No response

How can we reproduce the bug?

Use toolkit_experimental.duration_in toolkit_experimental.state_agg where the original query is not seeing any "states" or might not see the initial state.

larseen avatar Dec 12 '22 17:12 larseen

I think this is not a bug, but an enhancement request, with some potential workarounds in the meantime. I'll post some workarounds and also I think there's some possible enhancements that would help this.

davidkohn88 avatar Dec 12 '22 18:12 davidkohn88

Any update on this? I feel like this is kind of the main point of state agg. I feel like interpolated_duration_in solution for this

KamalAman avatar Oct 23 '23 23:10 KamalAman

@davidkohn88 any workararounds as this quite related to https://github.com/timescale/timescaledb/issues/1324

I've understood state_agg only calculates deltas within the bucket. So original post makes sense. However seems to me that even if state 0 is switched on at time 0 and there's never a change to another state in the bucket - duration_in will report 0 although it has been the whole bucket in that state.

leppaott avatar Oct 24 '23 11:10 leppaott