influxdb
influxdb copied to clipboard
Why is the query so slow after adding the timezone?
In influxdb2.2 OSS:
for example this query is fast: from(bucket: "test") |> range(start: v.timeRangeStart, stop: v.timeRangeStop) |> filter(fn: (r) => r["_measurement"] == "record_raw_data") |> filter(fn: (r) => r["_field"] == "accessTime") |> aggregateWindow(every: 24h, fn: count, createEmpty: false) |> yield(name: "count")
but this query is too slow(More than 300 times slower): import "timezone" option location = timezone.location(name:"Asia/Shanghai") from(bucket: "test") |> range(start: v.timeRangeStart, stop: v.timeRangeStop) |> filter(fn: (r) => r["_measurement"] == "record_raw_data") |> filter(fn: (r) => r["_field"] == "accessTime") |> aggregateWindow(every: 24h, fn: count, createEmpty: false) |> yield(name: "count")
How can I split the query time in the correct timezone in an acceptable time?
me too
@davidby-influx excuse me
me too, For this reason, we are forcing all data to be entered in UTC. Adding timezone makes queries with aggregateWindow tens of times slower
This is a known issue to the Flux team. Thanks for reporting it. You will see it in the release notes when it is fixed.
me too
Me too. When I run my queries without the timezone it gets the data instantly without any delay. If I add the timezone conversion, my query times out.
Same problem here. Is there maybe any update on a fix or a workaround other than not importing "timezone"?
Same problem here. Is there maybe any update on a fix or a workaround other than not importing "timezone"?
I was fed up with waiting but also with this strange flux query language I couldn’t wrap my head around. If I was able to write my query in a different way, hoping to circumvent importing “timezone”, I could continue and be done with it.
But the best advice I can give you: if you’re not too deep in InfluxDB yet, switch to QuestDB. More complex timezone specific queries run now in < 0.5 seconds and I know what query I’m writing because it’s the old, “boring” SQL syntax everyone knows.
What works for me: remove global option location, aggregate to 1h* resolution, then aggregate sum to 1d resolution with timezone:
import "timezone"
from(bucket: "test")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r["_measurement"] == "record_raw_data")
|> filter(fn: (r) => r["_field"] == "accessTime")
|> aggregateWindow(every: 1h, fn: count, createEmpty: false, timeSrc: "_start") // <-- Intermediate results
|> aggregateWindow(every: 24h, fn: sum, createEmpty: false, location: timezone.location(name:"Asia/Shanghai"))
|> yield(name: "count")
(*) 15m/30m for time zones with 30/45-minute offsets