hashbrown
rayon parallel iterators execute serially
I tried to use HashMap::par_iter() and was surprised to see that it does not execute in parallel. Is this something that can be fixed, or documented somewhere? As it stands, I'm not sure why I would ever use par_iter over par_bridge if I wanted parallel iteration.
My repro case:
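use std::{thread, time::Duration};
use hashbrown::HashSet;
use rayon::prelude::*;

// Baseline: a Vec splits per element, so the spans overlap.
let container: Vec<i32> = (0..5).collect();
container.par_iter().for_each(|_| {
    let _span = tracy_client::span!("vec");
    thread::sleep(Duration::from_millis(10));
});

// hashbrown's par_iter: the spans run one after another.
let container: HashSet<i32> = (0..5).collect();
container.par_iter().for_each(|_| {
    let _span = tracy_client::span!("hashbrown");
    thread::sleep(Duration::from_millis(10));
});

// par_bridge over the sequential iterator: the spans overlap again.
container.iter().par_bridge().for_each(|_| {
    let _span = tracy_client::span!("hashbrown par_bridge");
    thread::sleep(Duration::from_millis(10));
});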
This is on 0.13.1 with the rayon feature flag enabled.
Due to the structure of the table, the parallel iterator won't split groups of 16 elements across threads. If you increase the number of elements in the table then you will see parallel execution.
Interesting, is the granularity always 16 consecutive elements in iteration order?
Is this a hard limitation, or a consequence of the current implementation? It would be great if it could be made more granular, unless there are trade-offs I'm not aware of.
The granularity comes from the group width, which is 16 elements on x86 because that is the number of bytes in a 128-bit SSE register. On other platforms it is 8 elements.
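To make the granularity concrete, here is a rough back-of-the-envelope model, not hashbrown's actual splitting code. It assumes the bucket range can only be divided on group boundaries, so the number of independent pieces is bounded by the bucket count divided by the group width. The function name `max_parallel_units` and the bucket counts are hypothetical, chosen only for illustration.

```rust
/// Rough upper bound on how many pieces a table's bucket range can be
/// divided into when splits may only fall on group boundaries.
/// This is a simplified model, not code taken from hashbrown.
fn max_parallel_units(buckets: usize, group_width: usize) -> usize {
    assert!(group_width > 0, "group width must be non-zero");
    // Ceiling division: a partially filled trailing group still counts as one unit.
    (buckets + group_width - 1) / group_width
}

fn main() {
    // Illustrative power-of-two bucket counts, not values read from a real table.
    for buckets in [8usize, 16, 64, 1024] {
        println!(
            "{buckets:>5} buckets: at most {} unit(s) at width 16, {} at width 8",
            max_parallel_units(buckets, 16),
            max_parallel_units(buckets, 8),
        );
    }
}
```

Under this model, a table too small to span more than one group gives a single unit of work, which matches the serial execution seen in the small repro. Larger tables give more units, though how evenly the elements are spread across groups will still affect the balance between threads.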
I was affected by this issue and ended up doing something like this to work around it:
if map.len() < THRESHOLD {
    Either::Left(map.iter().par_bridge())
} else {
    Either::Right(map.par_iter())
}
Please consider having something similar built into the IntoParallelIterator implementation.