
(Expected?) Slower basic queries compared to InfluxQL

Open zarbis opened this issue 6 years ago • 4 comments

I have two basic queries, one in InfluxQL and one in Flux, that produce the same result from the end user's standpoint and use none of the new Flux capabilities, yet their execution times differ drastically:

InfluxQL: 1.28s

SELECT sum("value") AS "sum_value" FROM "analytics"."autogen"."dmg"
WHERE time > now()-7d AND
      "action"='foo' AND
      "domain"='bar' AND
      "type"='baz'
GROUP BY time(1d), segment, block, block_type FILL(null)

Flux: 68s

actions = from(bucket: "analytics")
|> range(start: -7d)
|> filter(fn: (r) => 
  r._field == "value" and 
  r._measurement == "dmg" and 
  r.action == "foo" and 
  r.domain == "bar" and 
  r.type == "baz")
|> group(columns: ["segment", "block", "block_type"])
|> aggregateWindow(every: 1d, fn: sum)
|> yield()

There is a total of 6M points in the time interval, and the Flux query is ~50 times slower. Am I doing something wrong in my Flux query, or is this a confirmed performance issue? Has it perhaps been fixed in later versions?

I'm using Flux 0.24 bundled with InfluxDB 1.7.7.
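
For anyone trying to narrow this down, here is a minimal sketch of the same Flux query with `aggregateWindow` spelled out as explicit steps, which may help show which stage dominates the run time. It assumes that `aggregateWindow(every: 1d, fn: sum)` is roughly equivalent to windowing, summing, and re-stamping `_time` from `_stop`; the exact definition may differ between Flux versions, and the yield name `manual_window` is made up for illustration.

```flux
// Same source, range, and filter as the original query.
from(bucket: "analytics")
    |> range(start: -7d)
    |> filter(fn: (r) =>
        r._measurement == "dmg" and
        r._field == "value" and
        r.action == "foo" and
        r.domain == "bar" and
        r.type == "baz")
    |> group(columns: ["segment", "block", "block_type"])
    // Assumed expansion of aggregateWindow(every: 1d, fn: sum):
    |> window(every: 1d)                         // split each group into 1-day windows
    |> sum()                                     // one row per window
    |> duplicate(column: "_stop", as: "_time")   // sum() drops _time; restore it from the window end
    |> window(every: inf)                        // merge the windows back into one table per group
    |> yield(name: "manual_window")
```

Timing the query with the pipeline truncated after `filter`, after `group`, and after `sum` should give a rough idea of where the time goes. This is only a diagnostic aid and is not expected to be faster than the original.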

zarbis avatar Jul 23 '19 10:07 zarbis

Yes, it happens, and if you try to join two measurements it becomes a major performance issue. Try it for fun and post your findings here.

saiyam1814 avatar Jul 24 '19 11:07 saiyam1814

@zarbis Thanks for the detailed report. This is not unexpected; we are still working out some known performance issues. My guess is that you are running into this issue specifically: #1243

We will investigate.

nathanielc avatar Jul 24 '19 19:07 nathanielc

I'm glad I'm not the only one seeing severe performance differences between InfluxQL and Flux. I have a relatively small DB on InfluxDB 2.0 alpha18, which has no trouble ingesting around 1000 data points per second, and is also very responsive when executing aggregating queries that contain a substantially filtered input.

However, if I want to produce a combined sum of all points in a measurement with an aggregateWindow, the query is extremely slow and usually runs out of memory, despite there being about 100 times more RAM than the actual size of the DB.

I have used InfluxDB for over 5 years (even pre-1.x versions), and this kind of poor performance on relatively simple "map-reduce"-style queries really surprised me.

dswarbrick avatar Oct 21 '19 09:10 dswarbrick

Is this still an issue on the latest builds of InfluxDB v2?

wnasich avatar Jul 22 '20 04:07 wnasich

This issue has had no recent activity and will be closed soon.

github-actions[bot] avatar Jun 24 '24 01:06 github-actions[bot]