kvrocks icon indicating copy to clipboard operation
kvrocks copied to clipboard

TimeSeries: Support `twa` aggregator for range queries

Open yezhizi opened this issue 2 months ago • 11 comments

Search before asking

  • [x] I had searched in the issues and found no similar issues.

Motivation

When using TS.RANGE/TS.MRANGE for range queries, it supports specifying twa (time-weighted average) as the aggregator

Redis reference: https://redis.io/docs/latest/commands/ts.range/ Twa details

Solution

Refer to others aggregator like sum, avg Refer to this code snippets

Are you willing to submit a PR?

  • [ ] I'm willing to submit a PR!

yezhizi avatar Oct 05 '25 06:10 yezhizi

I can work on this.

var-nan avatar Oct 27 '25 16:10 var-nan

Hi @var-nan. The twa aggregation might not be very clear in the Redis documentation, so I've drew up a quick diagram for the following example. Hope this helps a bit!

Examples in redis(4 samples):

127.0.0.1:6379> ts.range test11 - +
1) 1) (integer) 12
   2) 12
2) 1) (integer) 15
   2) 15
3) 1) (integer) 19
   2) 19
4) 1) (integer) 22
   2) 22
127.0.0.1:6379> ts.range test11 - + aggregation twa 10
1) 1) (integer) 10
   2) 16
2) 1) (integer) 20
   2) 21

Image

Note: right/left refers to the boundary timestamps of the bucket. first/last refers to the timestamp of the first/last sample within this bucket.

yezhizi avatar Nov 03 '25 08:11 yezhizi

Hi @yezhizi, Thanks. This diagram was helpful. I misunderstood the TWA and was doing Last Observation Carried Forward (LCOF) instead of computing the area under the curve.

var-nan avatar Nov 04 '25 03:11 var-nan

Hi @yezhizi, I'm having trouble implementing an edge case. From the above example, say, if this is the input, ts.range test11 13 20 aggregation twa 10. The twa needs information about 12 and 22 to compute the area from 13 to 15 and from 19 to 20. However, the ts.range filters out the data (from 13 to 20) prior to aggregation.

I can try modifying the TimeSeries::rangeCommon() to include the prev and next values in the res vector. Is there any other way to do this?

var-nan avatar Nov 09 '25 07:11 var-nan

The twa needs information about 12 and 22 to compute the area from 13 to 15 and from 19 to 20. However, the ts.range filters out the data (from 13 to 20) prior to aggregation.

I can try modifying the TimeSeries::rangeCommon() to include the prev and next values in the res vector. Is there any other way to do this?

Sorry for the delay. I understand this issue. Modifying TimeSeries::rangeCommon() to to additionally retrieve prev and next for TWA aggregator seems like a good solution.

yezhizi avatar Nov 09 '25 12:11 yezhizi

@var-nan Another noteworthy point is that when FILTER_BY_TS/FILTER_BY_VALUE is used togather with AGGREGATION, the samples are filtered before being aggregated. However, RedisTimeSeries might return unexpected results in such cases. I've already reported this issue to RedisTimeSeries, and they are currently testing and working on a fix.

If necessary, we can discuss the correctness of the test cases during testing. Refer to https://github.com/RedisTimeSeries/RedisTimeSeries/issues/1811

yezhizi avatar Nov 10 '25 02:11 yezhizi

Hi @yezhizi , I have a quick question about FILTER_BY_TS option.

Image

does the above answer makes senseif we filter timestamps [24,30,35] with FILTER_BY_TS option? I just want to make sure that I understood TWA with FILTER_BY_TS before I code it.

var-nan avatar Nov 16 '25 21:11 var-nan

does the above answer makes senseif we filter timestamps [24,30,35] with FILTER_BY_TS option? I just want to make sure that I understood TWA with FILTER_BY_TS before I code it.

Hi @var-nan ! I think this is reasonable. Maybe we can treat the samples after FILTER_BY_TS/FILTER_BY_VALUE as the 'complete series', and no longer consider out-of-boundary samples (like the prev and next you mentioned earlier). This makes sense and is easy to implement, even if it may not completely align with the results of redistimeseies. Since it is difficult for us to use redistimeseies to calibrate our results(for the reasons mentioned above), we can just move forward and refine the details in the future.

yezhizi avatar Nov 17 '25 07:11 yezhizi

Sounds good. I just pushed new commit, that explicitly ignores the prev_sample and next_sample values when values are filtered with FILTER_BY_TS/FILTER_BY_VALUE.

var-nan avatar Nov 17 '25 18:11 var-nan

Hi @yezhizi. I have a question. What happens to the result when we specify EMPTY flag, and there are no samples in the filtered range before aggregation?

For example, consider there exists only two samples in the series sample_1 = {10,34}, sample_2 = {95,34}. And ts.range series 20 90 AGGREGATION TWA 10 EMPTY

Since there are no samples in the filtered region, should it return a single NaN or should it return TWA for all empty buckets 20-30, 30-40, 40-50,......80-90. In this case all the TWA is same for all the buckets.

The Redis documentation doesn't clarify this case.

var-nan avatar Nov 24 '25 04:11 var-nan

Since there are no samples in the filtered region, should it return a single NaN or should it return TWA for all empty buckets 20-30, 30-40, 40-50,......80-90. In this case all the TWA is same for all the buckets.

It should return TWA for all empty buckets, which is different from other aggregators.

127.0.0.1:6379> ts.range test1 - +
1) 1) (integer) 10
   2) 34
2) 1) (integer) 95
   2) 34
127.0.0.1:6379> ts.range test1 20 90 AGGREGATION TWA 10 EMPTY
1) 1) (integer) 20
   2) 34
2) 1) (integer) 30
   2) 34
3) 1) (integer) 40
   2) 34
4) 1) (integer) 50
   2) 34
5) 1) (integer) 60
   2) 34
6) 1) (integer) 70
   2) 34
7) 1) (integer) 80
   2) 34
8) 1) (integer) 90
   2) 34
127.0.0.1:6379> ts.range test1 20 90 AGGREGATION SUM 10 EMPTY
(empty array)

yezhizi avatar Nov 24 '25 10:11 yezhizi