aw-server-rust icon indicating copy to clipboard operation
aw-server-rust copied to clipboard

List of possible query2 optimizations

Open johan-bjareholt opened this issue 4 years ago • 4 comments

Processing optimizations:

  • Have the list of events sorted by timestamp, then we could binary search the events intersecting. Doing so would significantly improve performance of transforms like filter_period_intersect and the union_event_split in #179 .

Memory optimizations:

  • Wrap Vec<Event> or whole queries in an allocation arena
  • Reduce the amount of clones in our transforms (already been improved quite a bit the past few months)
  • Add proper lifetimes on variables in query2 to avoid a clone on each variable assignment and reference to it (See https://github.com/ActivityWatch/aw-server-rust/issues/119)

Something else?

johan-bjareholt avatar Jul 08 '20 06:07 johan-bjareholt

@ErikBjare After I benchmarked #183 I realized that memory copying was incredibly fast already and the improvement was still barely noticeable in an extreme benchmark scenario I made up.

I started working on adding tracing instead and got something working quickly, heres the result from a month view (it's running in debug mode though, so significanly slower than usual)

Nov 05 23:31:57.034 TRACE query::function_call::query_bucket: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=256ms
Nov 05 23:31:57.138 TRACE query::function_call::flood: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=103ms
Nov 05 23:31:57.179 TRACE query::function_call::query_bucket: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=4.33ms
Nov 05 23:31:57.181 TRACE query::function_call::flood: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=1.48ms
Nov 05 23:31:57.183 TRACE query::function_call::filter_keyvals: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=1.07ms
Nov 05 23:31:57.730 TRACE query::function_call::filter_period_intersect: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=517ms
Nov 05 23:31:58.503 TRACE query::function_call::categorize: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=677ms
Nov 05 23:31:59.007 TRACE query::function_call::merge_events_by_keys: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=400ms
Nov 05 23:31:59.069 TRACE query::function_call::sort_by_duration: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=14.3ms
Nov 05 23:31:59.121 TRACE query::function_call::merge_events_by_keys: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=39.2ms
Nov 05 23:31:59.128 TRACE query::function_call::sort_by_duration: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=397µs
Nov 05 23:31:59.485 TRACE query::function_call::merge_events_by_keys: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=317ms
Nov 05 23:31:59.531 TRACE query::function_call::sort_by_duration: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=305µs
Nov 05 23:31:59.532 TRACE query::function_call::limit_events: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=364µs
Nov 05 23:31:59.558 TRACE query::function_call::limit_events: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=20.2ms
Nov 05 23:31:59.753 TRACE query::function_call::sum_durations: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=144ms
Nov 05 23:31:59.799 TRACE query::function_call::query_bucket: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=698µs
Nov 05 23:31:59.800 TRACE query::function_call::flood: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=150µs
Nov 05 23:32:00.003 TRACE query::function_call::filter_keyvals: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=152ms
Nov 05 23:32:00.062 TRACE query::function_call::filter_period_intersect: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=10.9ms
Nov 05 23:32:00.066 TRACE query::function_call::split_url_events: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=139µs
Nov 05 23:32:00.066 TRACE query::function_call::concat: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=129µs
Nov 05 23:32:00.066 TRACE query::function_call::sort_by_timestamp: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=113µs
Nov 05 23:32:00.269 TRACE query::function_call::query_bucket: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=203ms
Nov 05 23:32:00.338 TRACE query::function_call::flood: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=68.9ms
Nov 05 23:32:00.532 TRACE query::function_call::filter_keyvals: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=133ms
Nov 05 23:32:07.393 TRACE query::function_call::filter_period_intersect: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=6.78s
Nov 05 23:32:07.740 TRACE query::function_call::split_url_events: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=261ms
Nov 05 23:32:08.030 TRACE query::function_call::concat: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=181ms
Nov 05 23:32:08.150 TRACE query::function_call::sort_by_timestamp: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=120ms
Nov 05 23:32:08.568 TRACE query::function_call::split_url_events: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=317ms
Nov 05 23:32:09.043 TRACE query::function_call::merge_events_by_keys: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=323ms
Nov 05 23:32:09.125 TRACE query::function_call::sort_by_duration: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=18.5ms
Nov 05 23:32:09.176 TRACE query::function_call::limit_events: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=24.5ms
Nov 05 23:32:09.505 TRACE query::function_call::merge_events_by_keys: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=266ms
Nov 05 23:32:09.565 TRACE query::function_call::sort_by_duration: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=2.17ms
Nov 05 23:32:09.570 TRACE query::function_call::limit_events: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=2.04ms
Nov 05 23:32:09.788 TRACE query::function_call::sum_durations: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=175ms
Nov 05 23:32:10.049 DEBUG interpret: aw_query::interpret: close time.busy=0.00ns time.idle=13.3s

There is one huge outlier here and that's one call to filter_period_intersect which takes 6.78 seconds out of 13.3 seconds total.

The previous filter_period_intersects are much faster so I assume that this is when we do an intersection between the window events and the browser events as that's when we have the most amount of events.

EDIT: Tried running the tracing with my refcounting too, the same query took 13.0 seconds instead of 13.3 seconds so it's probably not worth it considering the increased complexity.

johan-bjareholt avatar Nov 05 '20 22:11 johan-bjareholt

Apparently the calculate_endtime() function is incredibly slow, I was able to do a 5x optimization by caching the endtime for the filter events in each loop. Will push a PR soon.

johan-bjareholt avatar Nov 07 '20 14:11 johan-bjareholt

Same query on exact same data in the web-ui in the above example went from 13.3 seconds to 7.53 seconds where the big filter_period_intersect went from 6.78 seconds to 1.24 seconds.

I'd consider that a huge win, changes are in #184

johan-bjareholt avatar Nov 07 '20 15:11 johan-bjareholt

Wrapping our event lists in allocation arenas should significantly improve memory usage, would probably completely fix the "leak-like" behavior of the allocator keeping a lot more memory after big queries than it needs to.

Exactly how to do this in a nice manner in Rust though I have to investigate a bit more.

johan-bjareholt avatar Feb 13 '21 21:02 johan-bjareholt