TimeSeries: Support `twa` aggregator for range queries
Search before asking
- [x] I had searched in the issues and found no similar issues.
Motivation
When using TS.RANGE/TS.MRANGE for range queries, it supports specifying twa (time-weighted average) as the aggregator
Redis reference: https://redis.io/docs/latest/commands/ts.range/ Twa details
Solution
Refer to others aggregator like sum, avg
Refer to this code snippets
Are you willing to submit a PR?
- [ ] I'm willing to submit a PR!
I can work on this.
Hi @var-nan. The twa aggregation might not be very clear in the Redis documentation, so I've drew up a quick diagram for the following example. Hope this helps a bit!
Examples in redis(4 samples):
127.0.0.1:6379> ts.range test11 - +
1) 1) (integer) 12
2) 12
2) 1) (integer) 15
2) 15
3) 1) (integer) 19
2) 19
4) 1) (integer) 22
2) 22
127.0.0.1:6379> ts.range test11 - + aggregation twa 10
1) 1) (integer) 10
2) 16
2) 1) (integer) 20
2) 21
Note: right/left refers to the boundary timestamps of the bucket. first/last refers to the timestamp of the first/last sample within this bucket.
Hi @yezhizi, Thanks. This diagram was helpful. I misunderstood the TWA and was doing Last Observation Carried Forward (LCOF) instead of computing the area under the curve.
Hi @yezhizi, I'm having trouble implementing an edge case. From the above example, say, if this is the input, ts.range test11 13 20 aggregation twa 10. The twa needs information about 12 and 22 to compute the area from 13 to 15 and from 19 to 20. However, the ts.range filters out the data (from 13 to 20) prior to aggregation.
I can try modifying the TimeSeries::rangeCommon() to include the prev and next values in the res vector. Is there any other way to do this?
The twa needs information about 12 and 22 to compute the area from 13 to 15 and from 19 to 20. However, the
ts.rangefilters out the data (from 13 to 20) prior to aggregation.I can try modifying the
TimeSeries::rangeCommon()to include theprevandnextvalues in theresvector. Is there any other way to do this?
Sorry for the delay. I understand this issue. Modifying TimeSeries::rangeCommon() to to additionally retrieve prev and next for TWA aggregator seems like a good solution.
@var-nan Another noteworthy point is that when FILTER_BY_TS/FILTER_BY_VALUE is used togather with AGGREGATION, the samples are filtered before being aggregated. However, RedisTimeSeries might return unexpected results in such cases. I've already reported this issue to RedisTimeSeries, and they are currently testing and working on a fix.
If necessary, we can discuss the correctness of the test cases during testing. Refer to https://github.com/RedisTimeSeries/RedisTimeSeries/issues/1811
Hi @yezhizi , I have a quick question about FILTER_BY_TS option.
does the above answer makes senseif we filter timestamps [24,30,35] with FILTER_BY_TS option? I just want to make sure that I understood TWA with FILTER_BY_TS before I code it.
does the above answer makes senseif we filter timestamps
[24,30,35]withFILTER_BY_TSoption? I just want to make sure that I understood TWA with FILTER_BY_TS before I code it.
Hi @var-nan ! I think this is reasonable. Maybe we can treat the samples after FILTER_BY_TS/FILTER_BY_VALUE as the 'complete series', and no longer consider out-of-boundary samples (like the prev and next you mentioned earlier). This makes sense and is easy to implement, even if it may not completely align with the results of redistimeseies. Since it is difficult for us to use redistimeseies to calibrate our results(for the reasons mentioned above), we can just move forward and refine the details in the future.
Sounds good. I just pushed new commit, that explicitly ignores the prev_sample and next_sample values when values are filtered with FILTER_BY_TS/FILTER_BY_VALUE.
Hi @yezhizi. I have a question. What happens to the result when we specify EMPTY flag, and there are no samples in the filtered range before aggregation?
For example, consider there exists only two samples in the series sample_1 = {10,34}, sample_2 = {95,34}. And ts.range series 20 90 AGGREGATION TWA 10 EMPTY
Since there are no samples in the filtered region, should it return a single NaN or should it return TWA for all empty buckets 20-30, 30-40, 40-50,......80-90. In this case all the TWA is same for all the buckets.
The Redis documentation doesn't clarify this case.
Since there are no samples in the filtered region, should it return a single
NaNor should it return TWA for all empty buckets20-30, 30-40, 40-50,......80-90. In this case all the TWA is same for all the buckets.
It should return TWA for all empty buckets, which is different from other aggregators.
127.0.0.1:6379> ts.range test1 - +
1) 1) (integer) 10
2) 34
2) 1) (integer) 95
2) 34
127.0.0.1:6379> ts.range test1 20 90 AGGREGATION TWA 10 EMPTY
1) 1) (integer) 20
2) 34
2) 1) (integer) 30
2) 34
3) 1) (integer) 40
2) 34
4) 1) (integer) 50
2) 34
5) 1) (integer) 60
2) 34
6) 1) (integer) 70
2) 34
7) 1) (integer) 80
2) 34
8) 1) (integer) 90
2) 34
127.0.0.1:6379> ts.range test1 20 90 AGGREGATION SUM 10 EMPTY
(empty array)