flurry
Support rayon parallel iterators
We should implement rayon parallel iterators. The notes and code in iter/plumbing may be helpful, and the hashbrown implementation too.
I'm happy to answer rayon questions on this.
@cuviper Since you're offering, one thing I was actually wondering was whether we can somehow take advantage of the fact that the map supports fully concurrent access. Since non-concurrent maps like hashbrown are also able to implement the parallel traits, it wasn't immediately clear to me how you get a win from flurry in rayon, even though it really feels like it should be possible. I suspect maybe the answer has to do with some of the traits in iter/plumbing, but I'm not sure?
Concurrent insert will make a naive ParallelExtend and FromParallelIterator trivial, just something like par_iter.for_each(|(key, value)| { map.insert(key, value); }). Maybe there's a more advanced approach that could improve performance, like folding into separate bins and then reducing into the final map, but I don't know your data structure enough to evaluate that.
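To make the two strategies concrete, here is a rough std-only sketch. It is not flurry's or rayon's API: `SharedMap` is a hypothetical stand-in for a map whose `insert` only needs `&self` (here just a `Mutex` around a `HashMap`), and scoped threads stand in for rayon's `for_each`. The function names and the `workers` parameter are also made up for illustration.

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use std::thread;

/// Hypothetical stand-in for a concurrent map: `insert` takes `&self`.
struct SharedMap(Mutex<HashMap<u32, u32>>);

impl SharedMap {
    fn new() -> Self {
        SharedMap(Mutex::new(HashMap::new()))
    }
    fn insert(&self, key: u32, value: u32) {
        self.0.lock().unwrap().insert(key, value);
    }
    fn get(&self, key: u32) -> Option<u32> {
        self.0.lock().unwrap().get(&key).copied()
    }
    fn len(&self) -> usize {
        self.0.lock().unwrap().len()
    }
}

/// Chunk length so that at most `workers` chunks are produced (never zero).
fn chunk_len(total: usize, workers: usize) -> usize {
    let workers = workers.max(1);
    ((total + workers - 1) / workers).max(1)
}

/// Naive extend: every worker inserts straight into the shared map.
fn extend_shared(map: &SharedMap, items: &[(u32, u32)], workers: usize) {
    thread::scope(|s| {
        for part in items.chunks(chunk_len(items.len(), workers)) {
            s.spawn(move || {
                for &(key, value) in part {
                    map.insert(key, value);
                }
            });
        }
    });
}

/// Fold then reduce: each worker builds a private map, and the private
/// maps are merged on the calling thread afterwards.
fn collect_by_merging(items: &[(u32, u32)], workers: usize) -> HashMap<u32, u32> {
    thread::scope(|s| {
        let handles: Vec<_> = items
            .chunks(chunk_len(items.len(), workers))
            .map(|part| s.spawn(move || part.iter().copied().collect::<HashMap<_, _>>()))
            .collect();
        let mut merged = HashMap::new();
        for handle in handles {
            merged.extend(handle.join().expect("worker thread panicked"));
        }
        merged
    })
}
```

In the first function no merge step is needed because all workers share one map; in the second, the final merge runs on a single thread. Which one is faster for a real concurrent map would have to be measured.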
Parallel iterators are a bit harder -- you need a strategy for splitting the map into separate "slices"/"views" of some sort. The hashbrown implementation should be a good reference if you look at how they use RawIterRange::split internally.
Ah, I see, hashbrown is forced to first reduce and then use one core to build the map, whereas we don't have to do that. Neat!
I would like to give this a try.
All yours!
I need some help. What's the best way to split a Table into two halves?
The best way is probably to just split the list of bins in two: the "high" bins and the "low" bins.
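As a rough illustration of that idea, and not flurry's actual Table layout, the sketch below models the bins as a plain slice and halves it with `split_at`. The `Bin` type, the `sum_values` function, and the `min_len` cutoff are all assumptions made up for this example; a real implementation would hand the two halves to rayon instead of spawning scoped threads.

```rust
use std::thread;

/// Hypothetical stand-in for one bin of the table: a list of key/value pairs.
type Bin = Vec<(u32, u32)>;

/// Sums all values by recursively splitting the bins into a "low" half
/// and a "high" half until a range has at most `min_len` bins.
fn sum_values(bins: &[Bin], min_len: usize) -> u64 {
    if bins.len() <= min_len.max(1) {
        // Small enough: walk this range of bins sequentially.
        return bins.iter().flatten().map(|&(_, value)| u64::from(value)).sum();
    }
    // The split itself is just an index: low bins on the left, high on the right.
    let (low, high) = bins.split_at(bins.len() / 2);
    thread::scope(|s| {
        // Process the high half on another thread and the low half here.
        let high_sum = s.spawn(|| sum_values(high, min_len));
        let low_sum = sum_values(low, min_len);
        low_sum + high_sum.join().expect("worker thread panicked")
    })
}
```

Because each half is itself a contiguous range of bins, it can be split again in the same way, which is the property a splittable parallel iterator needs.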
I have no idea how to approach this. I think it's better if I let someone else do it.