heroic
Aggregating buckets include datapoints from before requested data range
Hi!
When aggregating, heroic sets each bucket's timestamp to the start time of the next bucket.
This can be reproduced with the following sequence:
- Start the heroic docker image:
docker run --rm -p 8080:8080 -p 9091:9091 spotify/heroic
- Index the following data points:
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/write --data-binary @- << EOF
{
"series": {
"key": "foo",
"tags": {
"foo": "bar"
}
},
"data": {
"type": "points",
"data": [
[1591005600001, 1.1],
[1591020000001, 1.2],
[1591092000001, 2.1],
[1591092000002, 2.2],
[1591092000003, 2.3]
]
}
}
EOF
This will index the following datapoints:
2020-06-01 10:00:00.001: 1.1
2020-06-01 14:00:00.001: 1.2
2020-06-02 10:00:00.001: 2.1
2020-06-02 10:00:00.002: 2.2
2020-06-02 10:00:00.003: 2.3
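For anyone who wants to double-check the mapping from epoch milliseconds to the dates listed above, here is a minimal Python sketch. The helper name `millis_to_utc` is made up for illustration, and the conversion assumes the dates in this report are in UTC.

```python
from datetime import datetime, timezone


def millis_to_utc(ms: int) -> str:
    """Format epoch milliseconds as 'YYYY-MM-DD HH:MM:SS.mmm' in UTC."""
    seconds, millis = divmod(ms, 1000)
    dt = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return f"{dt:%Y-%m-%d %H:%M:%S}.{millis:03d}"


# The five points written in the curl request above.
POINTS = [
    (1591005600001, 1.1),
    (1591020000001, 1.2),
    (1591092000001, 2.1),
    (1591092000002, 2.2),
    (1591092000003, 2.3),
]

for ts, value in POINTS:
    print(f"{millis_to_utc(ts)}: {value}")
```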
- Query heroic:
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/query/metrics --data-binary @- << EOF
{
"range": {"type": "absolute", "start": 1591012800000, "end": 1591142400000},
"filter": ["and", ["key", "foo"]],
"aggregation": {
"type": "group",
"of": null,
"each": [
{
"type": "count",
"sampling": {
"unit": "seconds",
"value": 86400
}
}
]
}
}
EOF
Where (times in UTC):
start = 1591012800000, 2020-06-01 12:00:00.000
end = 1591142400000, 2020-06-03 00:00:00.000
Observed result:
1591056000000 = 2.0 (2020-06-02 00:00:00.000)
1591142400000 = 3.0 (2020-06-03 00:00:00.000)
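The observed numbers are consistent with a simple model in which each bucket is labeled by its end time and covers the whole sampling interval before it, without being clipped to the range start. The sketch below is only that model, not heroic's actual implementation: the function name `counts_by_bucket_end` and the interval convention `(label - size, label]` are assumptions made for illustration.

```python
from collections import Counter

SIZE_MS = 86_400 * 1000      # sampling: 86400 seconds
START_MS = 1591012800000     # 2020-06-01 12:00:00.000
END_MS = 1591142400000       # 2020-06-03 00:00:00.000
TIMESTAMPS = [
    1591005600001,
    1591020000001,
    1591092000001,
    1591092000002,
    1591092000003,
]


def counts_by_bucket_end(timestamps, size_ms, start_ms, end_ms):
    """Count points per bucket, labeling each bucket by its END time.

    Assumed model: a bucket labeled t covers (t - size_ms, t], and the
    first bucket is NOT clipped to start_ms, so it can pick up points
    that are older than the requested range.
    """
    counts = Counter()
    for ts in timestamps:
        # Round the timestamp up to the next multiple of size_ms.
        label = -(-ts // size_ms) * size_ms
        # Keep only buckets whose label falls inside (start_ms, end_ms].
        if start_ms < label <= end_ms:
            counts[label] += 1
    return dict(sorted(counts.items()))


print(counts_by_bucket_end(TIMESTAMPS, SIZE_MS, START_MS, END_MS))
```

Under this model the point at 2020-06-01 10:00:00.001 lands in the bucket labeled 1591056000000 even though it is older than the range start, which matches the count of 2.0 reported there.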
Expected result: the datapoint at 2020-06-01 10:00:00.001 should be excluded, since the range start is after its time.
1590969600000 = 1.0 (2020-06-01 00:00:00.000, count of points from 2020-06-01 12:00:00 onward, which excludes 1 point)
1591056000000 = 3.0 (2020-06-02 00:00:00.000, count of points during 2020-06-02)
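To make the expected behavior concrete, here is a rough Python sketch that first drops points outside the requested range and then labels each bucket by its start time. The function name `counts_by_bucket_start` is hypothetical, and treating both range ends as inclusive is an assumption (no point in this data set sits exactly on a boundary, so it does not affect the result).

```python
from collections import Counter

SIZE_MS = 86_400 * 1000      # sampling: 86400 seconds
START_MS = 1591012800000     # 2020-06-01 12:00:00.000
END_MS = 1591142400000       # 2020-06-03 00:00:00.000
TIMESTAMPS = [
    1591005600001,
    1591020000001,
    1591092000001,
    1591092000002,
    1591092000003,
]


def counts_by_bucket_start(timestamps, size_ms, start_ms, end_ms):
    """Count in-range points per bucket, labeling buckets by START time."""
    counts = Counter()
    for ts in timestamps:
        # Skip anything outside the requested range (inclusive ends assumed).
        if not (start_ms <= ts <= end_ms):
            continue
        # Round the timestamp down to a multiple of size_ms.
        label = ts // size_ms * size_ms
        counts[label] += 1
    return dict(sorted(counts.items()))


print(counts_by_bucket_start(TIMESTAMPS, SIZE_MS, START_MS, END_MS))
```

With the five points indexed above, this gives 1 for the 2020-06-01 bucket (only the 14:00:00.001 point is inside the range) and 3 for the 2020-06-02 bucket, since three points were written at 2020-06-02 10:00.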
Please note that the day-shift bug is described separately in https://github.com/spotify/heroic/issues/664