go-datastore
Improve performance of Query method
In ipfs/go-ipfs#2760 @whyrusleeping said in a line comment:
Yeah, using a channel as an iterator sucks. If one of you wants to work on improving the perf of query that would be great.
We could change the interface to not use a channel and instead have it return the next value directly. On top of that, we could provide a method for turning the direct query result into a buffered channel for use cases that need it.
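As a rough illustration of the proposal above, here is a minimal sketch of a direct iterator with a channel adapter layered on top. The names `Result`, `Iterator`, `sliceIterator`, and `ToChan` are hypothetical and are not taken from go-datastore; the in-memory iterator only stands in for a real datastore query.

```go
package main

import "fmt"

// Result is a hypothetical stand-in for a single query result.
type Result struct {
	Key string
}

// Iterator is a hypothetical direct-iteration interface: Next returns
// the next result, and ok is false once the results are exhausted.
type Iterator interface {
	Next() (r Result, ok bool)
	Close() error
}

// sliceIterator is a toy in-memory Iterator used only for illustration.
type sliceIterator struct {
	keys []string
	pos  int
}

func (it *sliceIterator) Next() (Result, bool) {
	if it.pos >= len(it.keys) {
		return Result{}, false
	}
	r := Result{Key: it.keys[it.pos]}
	it.pos++
	return r, true
}

func (it *sliceIterator) Close() error { return nil }

// ToChan adapts an Iterator into a buffered channel for callers that
// prefer channel semantics. The caller is expected to drain the channel;
// otherwise the producing goroutine blocks once the buffer is full.
func ToChan(it Iterator, bufSize int) <-chan Result {
	out := make(chan Result, bufSize)
	go func() {
		defer close(out)
		defer it.Close()
		for {
			r, ok := it.Next()
			if !ok {
				return
			}
			out <- r
		}
	}()
	return out
}

func main() {
	// Channel-based use, layered on the direct iterator.
	it := &sliceIterator{keys: []string{"/a", "/b", "/c"}}
	for r := range ToChan(it, 128) {
		fmt.Println(r.Key)
	}
}
```

Callers that care about throughput could call `Next` in a loop and skip the goroutine and channel entirely, while existing channel-based callers would go through the adapter.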
@whyrusleeping I will be happy to look into this and determine where the bottleneck is. It may be as simple as increasing the buffer size. I will also try a direct iterator approach and see if that helps.
Here are some performance numbers for doing a key-only query on the leveldb datastore:

The buffer size is the channel buffer size; "direct" is the result of querying LevelDB directly.
And here are some results from the flatfs datastore:

It seems that, at least for key-only queries, 128 is the optimal buffer size.
@kevina thanks for these graphs. I think you're right, we should buffer the channels at 128 for now. And if we need more perf later, give the option for direct iteration.
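For anyone who wants to get a feel for how the channel buffer size affects throughput, a minimal sketch along these lines may help. It is not the benchmark code used for the graphs in this thread; `countViaChan` is a hypothetical helper that pushes dummy keys through a channel with no datastore involved, so the absolute numbers will differ from real queries.

```go
package main

import (
	"fmt"
	"time"
)

// countViaChan sends n dummy keys through a channel with the given
// buffer size and returns how many were received and the elapsed time.
func countViaChan(n, bufSize int) (int, time.Duration) {
	start := time.Now()
	ch := make(chan string, bufSize)
	go func() {
		defer close(ch)
		for i := 0; i < n; i++ {
			ch <- "key"
		}
	}()
	count := 0
	for range ch {
		count++
	}
	return count, time.Since(start)
}

func main() {
	const n = 1000000
	// Buffer sizes chosen for illustration only.
	for _, size := range []int{0, 1, 16, 128, 1024} {
		count, elapsed := countViaChan(n, size)
		fmt.Printf("buffer=%4d received=%d elapsed=%v\n", size, count, elapsed)
	}
}
```

Timings depend on the machine and the Go runtime version, so this is only useful for comparing buffer sizes against each other on the same system.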
I updated the graph for flatfs queries. It seems there is enough overhead in filepath.Walk that, once the buffer is large enough, the overhead of channels and goroutines is insignificant.
I pushed the (somewhat hackish) code to create the graphs to kevina/query-benchmarks, for lack of a better place.