frostdb
frostdb copied to clipboard
#88 execute queries in parallel
https://github.com/polarsignals/arcticdb/issues/88#issuecomment-1148997165
This PR adds parallelism in the table iterator. After the rowgroups are collected, a number of goroutines are spawned to process the rowgroups in parallel by calling the passed iterator (e.g. the physical plan).
It might be the case that the iterator keeps internal state in a way that is not thread-safe (e.g. physical plan's HashAggregate). In this case, it makes sense for the iterator to be created per thread (i.e. one instance of physical plan per thread). To accommodate this, the interface to table iterator has changed to accept an interface IteratorProvider
with a method Iterator()
that generates the iterator.
For aggregation queries, the aggregations happen in parallel and a Finisher has been added to combine the results of the multiple aggregations.
It's also possible that the final query callback could now be called concurrently. A new parameter is added to the query builder called Options which has a field to control this behaviour.
query.NewEngine(memory.DefaultAllocator, db.TableProvider()).
ScanTable("test_table").
// NEW:
Options(&query.Options{SyncCallback: false}) // default is 'true'
Execute(context.Background(), func(ar arrow.Record) error {
// controls if this function can execute is synchronized or concurrent
hey @metalmatze , following up from your comments on #88. I have this as draft because I had a few outstanding work items:
- The current code can call the query callback in parallel, which could lead to unexpected behaviour if the callback isn't written to be threadsafe. I was going to try to add some synchronization around that
- fixing the issues mentioned in code review :)
I'll get those cleaned up asap and move this out of draft status
@brancz the changes discussed in the comments above have been made. Would you mind reviewing again?
Closing this since https://github.com/polarsignals/frostdb/pull/202 has merged. Thank you @albertlockett for your contributions! All your initial work helped to inform the final design!
Very cool thanks nice work! @thorfour @metalmatze