fxhash
Slow deserialization compared to stock `HashMap`
I tried using `FxHashMap` to improve the performance of internal hash maps, but I noticed that it takes several times longer to deserialize a large hash map that we need to persist. I couldn't find anything about it on Google, so I created a minimal example and am reporting it here.
The behavior can be reproduced using a simple set of random elements. For example, on my machine this code takes 0.08s to serialize and 1.2s to deserialize a `HashSet`, as reported by `cargo run --release`:
```rust
use std::collections::HashSet;
use std::time::Instant;
use rand::Rng;

fn main() {
    let mut rnd = rand::thread_rng();
    let h: HashSet<u64> = (0..10_000_000).map(|_| rnd.gen::<u64>()).collect();

    let t0 = Instant::now();
    let mut out = vec![];
    bincode::serialize_into(&mut out, &h).unwrap();
    let t1 = Instant::now();
    println!("serialize: {}", (t1 - t0).as_secs_f64());

    let h2: HashSet<u64> = bincode::deserialize_from(&*out).unwrap();
    let t2 = Instant::now();
    println!("deserialize: {}", (t2 - t1).as_secs_f64());
    println!("{}", h2.len());
}
```
Trivially changing `HashSet` to `fxhash::FxHashSet` increases the deserialization time to 5.95s (almost 5x slower), while serialization is unchanged. In our actual use case the original deserialization takes on the order of 2 minutes, so the slowdown visibly affects our total processing times. Code:
```rust
use fxhash::FxHashSet;
use std::time::Instant;
use rand::Rng;

fn main() {
    let mut rnd = rand::thread_rng();
    let h: FxHashSet<u64> = (0..10_000_000).map(|_| rnd.gen::<u64>()).collect();

    let t0 = Instant::now();
    let mut out = vec![];
    bincode::serialize_into(&mut out, &h).unwrap();
    let t1 = Instant::now();
    println!("serialize: {}", (t1 - t0).as_secs_f64());

    let h2: FxHashSet<u64> = bincode::deserialize_from(&*out).unwrap();
    let t2 = Instant::now();
    println!("deserialize: {}", (t2 - t1).as_secs_f64());
    println!("{}", h2.len());
}
```
I thought that perhaps the issue is pathological behavior when rebuilding a hash map from elements extracted in the same order, so I modified the code to serialize a `Vec` and deserialize by building an `FxHashSet` out of the `Vec`. That results in slightly slower serialization of 0.12s, but deserialization takes just 0.15s, which includes both deserializing the vector and collecting it into a new `FxHashSet`. (Applying the same trick to the ordinary `HashSet` didn't speed it up; it takes 0.11s to serialize and 1.3s to deserialize.) Code:
```rust
use fxhash::FxHashSet;
use std::time::Instant;
use rand::Rng;

fn main() {
    let mut rnd = rand::thread_rng();
    let h: FxHashSet<u64> = (0..10_000_000).map(|_| rnd.gen::<u64>()).collect();

    let t0 = Instant::now();
    let mut out = vec![];
    let hack = h.iter().copied().collect::<Vec<u64>>();
    bincode::serialize_into(&mut out, &hack).unwrap();
    let t1 = Instant::now();
    println!("serialize: {}", (t1 - t0).as_secs_f64());

    let hack2: Vec<u64> = bincode::deserialize_from(&*out).unwrap();
    let h2: FxHashSet<u64> = hack2.into_iter().collect();
    let t2 = Instant::now();
    println!("deserialize: {}", (t2 - t1).as_secs_f64());
    println!("{}", h2.len());
}
```
Is this expected behavior for an `FxHashMap`? Is there a way to fix it without going through a custom (and space-inefficient) serialization/deserialization?
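For context on what such a custom route would look like: one middle ground I could imagine, sketched below and untested, keeps the normal on-disk format but routes only the decode path through a `Vec` via serde's `deserialize_with`. The `Persisted` struct and `ids` field are made up for illustration, and this assumes the encoded data is a plain length-prefixed sequence, which is what bincode emits for both a set and a `Vec`:

```rust
use fxhash::FxHashSet;
use serde::{Deserialize, Deserializer};

// Decode a length-prefixed sequence into a Vec first, then collect it into
// an FxHashSet, mirroring the manual workaround above.
fn fxhashset_via_vec<'de, D>(deserializer: D) -> Result<FxHashSet<u64>, D::Error>
where
    D: Deserializer<'de>,
{
    let v: Vec<u64> = Vec::deserialize(deserializer)?;
    Ok(v.into_iter().collect())
}

// Hypothetical persisted struct, for illustration only.
#[derive(Deserialize)]
struct Persisted {
    #[serde(deserialize_with = "fxhashset_via_vec")]
    ids: FxHashSet<u64>,
}
```

Collecting from a `Vec` gives the set an exact size hint, which is presumably why the manual workaround above is fast, but it is exactly the kind of boilerplate I was hoping to avoid.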
It turns out that the issue can be reproduced without serde and boils down to the performance of growing the hash map from scratch. This code builds the 10M-element std `HashSet` in 1.1s:
```rust
use std::collections::HashSet;
use std::time::Instant;
use rand::Rng;

fn main() {
    let mut rnd = rand::thread_rng();
    let h: HashSet<u64> = (0..10_000_000).map(|_| rnd.gen::<u64>()).collect();

    let t0 = Instant::now();
    let mut h2 = HashSet::<u64>::default();
    for &n in h.iter() {
        h2.insert(n);
    }
    let t1 = Instant::now();
    println!("rebuild: {}", (t1 - t0).as_secs_f64());
    println!("{}", h2.len());
}
```
But building an `FxHashSet` the same way takes 5.4s:
```rust
use fxhash::FxHashSet;
use std::time::Instant;
use rand::Rng;

fn main() {
    let mut rnd = rand::thread_rng();
    let h: FxHashSet<u64> = (0..10_000_000).map(|_| rnd.gen::<u64>()).collect();

    let t0 = Instant::now();
    let mut h2 = FxHashSet::<u64>::default();
    for &n in h.iter() {
        h2.insert(n);
    }
    let t1 = Instant::now();
    println!("rebuild: {}", (t1 - t0).as_secs_f64());
    println!("{}", h2.len());
}
```
The slowdown only happens if the new hash map is fed the keys in the iteration order of a hash map built with the same hash function, which is reminiscent of this bug. If the original map is a std `HashMap` and the newly built one an `FxHashMap`, the rebuild time is 0.9s, faster than the original.
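For completeness, the mixed variant described in the previous sentence is a straightforward combination of the two snippets above: the source set uses the std hasher, so its iteration order is unrelated to FxHash bucket placement.

```rust
use fxhash::FxHashSet;
use std::collections::HashSet;
use std::time::Instant;
use rand::Rng;

fn main() {
    let mut rnd = rand::thread_rng();
    // Source set built with the std hasher.
    let h: HashSet<u64> = (0..10_000_000).map(|_| rnd.gen::<u64>()).collect();

    let t0 = Instant::now();
    // Rebuilt set uses FxHash; this is the fast (0.9s) case.
    let mut h2 = FxHashSet::<u64>::default();
    for &n in h.iter() {
        h2.insert(n);
    }
    let t1 = Instant::now();
    println!("rebuild: {}", (t1 - t0).as_secs_f64());
    println!("{}", h2.len());
}
```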
The issue got exposed by serde because serde caps the size hint at a very conservative 4096 elements, so the table is grown from scratch during deserialization.
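One untested variation of the rebuild snippet above that might separate the two effects is to reserve the final capacity up front, so the table never regrows while the keys arrive in this particular order; if that alone restores normal speed, the small size hint (and the resulting repeated regrowth) is the dominant cost:

```rust
use fxhash::FxHashSet;
use std::time::Instant;
use rand::Rng;

fn main() {
    let mut rnd = rand::thread_rng();
    let h: FxHashSet<u64> = (0..10_000_000).map(|_| rnd.gen::<u64>()).collect();

    let t0 = Instant::now();
    // Reserve room for all keys before inserting, so the table is never
    // resized while the keys are fed in FxHash iteration order.
    let mut h2 = FxHashSet::<u64>::default();
    h2.reserve(h.len());
    for &n in h.iter() {
        h2.insert(n);
    }
    let t1 = Instant::now();
    println!("rebuild with reserve: {}", (t1 - t0).as_secs_f64());
    println!("{}", h2.len());
}
```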
It's not obvious that the underlying issue is fixable in this crate, but maybe the documentation should warn of the performance degradation when rebuilding from an existing hash table, particularly when deserializing with serde.