influxdb icon indicating copy to clipboard operation
influxdb copied to clipboard

Why is the query so slow after adding the timezone?

Open liuguozhi4628 opened this issue 3 years ago • 11 comments

In influxdb2.2 OSS:

for example this query is fast: from(bucket: "test") |> range(start: v.timeRangeStart, stop: v.timeRangeStop) |> filter(fn: (r) => r["_measurement"] == "record_raw_data") |> filter(fn: (r) => r["_field"] == "accessTime") |> aggregateWindow(every: 24h, fn: count, createEmpty: false) |> yield(name: "count")

but this query is too slow(More than 300 times slower): import "timezone" option location = timezone.location(name:"Asia/Shanghai") from(bucket: "test") |> range(start: v.timeRangeStart, stop: v.timeRangeStop) |> filter(fn: (r) => r["_measurement"] == "record_raw_data") |> filter(fn: (r) => r["_field"] == "accessTime") |> aggregateWindow(every: 24h, fn: count, createEmpty: false) |> yield(name: "count")

How can I split the query time in the correct timezone in an acceptable time?

liuguozhi4628 avatar Jun 16 '22 02:06 liuguozhi4628

me too

oliver225 avatar Jul 02 '22 08:07 oliver225

@davidby-influx excuse me

oliver225 avatar Jul 02 '22 08:07 oliver225

me too, For this reason, we are forcing all data to be entered in UTC. Adding timezone makes queries with aggregateWindow tens of times slower

rioald avatar Aug 19 '22 14:08 rioald

This is a known issue to the Flux team. Thanks for reporting it. You will see it in the release notes when it is fixed.

davidby-influx avatar Aug 19 '22 16:08 davidby-influx

me too

zwl-sun avatar Aug 28 '22 02:08 zwl-sun

Me too. When I run my queries without the timezone it gets the data instantly without any delay. If I add the timezone conversion, my query times out.

MS699 avatar Jan 04 '23 08:01 MS699

Same problem here. Is there maybe any update on a fix or a workaround other than not importing "timezone"?

boredomwontgetus avatar Apr 16 '23 07:04 boredomwontgetus

Same problem here. Is there maybe any update on a fix or a workaround other than not importing "timezone"?

I was fed up with waiting but also with this strange flux query language I couldn’t wrap my head around. If I was able to write my query in a different way, hoping to circumvent importing “timezone”, I could continue and be done with it.

But the best advice I can give you: if you’re not too deep in InfluxDB yet, switch to QuestDB. More complex timezone specific queries run now in < 0.5 seconds and I know what query I’m writing because it’s the old, “boring” SQL syntax everyone knows.

SanderLedegen avatar Apr 16 '23 10:04 SanderLedegen

What works for me: remove global option location, aggregate to 1h* resolution, then aggregate sum to 1d resolution with timezone:

import "timezone"

from(bucket: "test")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r["_measurement"] == "record_raw_data")
|> filter(fn: (r) => r["_field"] == "accessTime")

|> aggregateWindow(every: 1h, fn: count, createEmpty: false, timeSrc: "_start") // <-- Intermediate results

|> aggregateWindow(every: 24h, fn: sum, createEmpty: false, location: timezone.location(name:"Asia/Shanghai"))
|> yield(name: "count")

(*) 15m/30m for time zones with 30/45-minute offsets

nescafe2002 avatar Apr 29 '23 09:04 nescafe2002