rpds
Question: How to optimize this scenario?
I have one writer thread that inserts entries into a hash map, and multiple reader threads that look up values in it. Instead of using a RwLock, which suffers from heavy contention, I decided to use rpds and arc_swap to publish clones of the map to the readers periodically, so each reader can read without blocking the writer.
Here is my code:
use std::{
    sync::{Arc, LazyLock},
    time::{Duration, Instant},
};

use arc_swap::ArcSwap;
use rand::Rng;
use tikv_jemallocator::Jemalloc;

#[global_allocator]
static GLOBAL: Jemalloc = Jemalloc;

fn rand_string(n: usize) -> String {
    rand::rng()
        .sample_iter(rand::distr::Alphanumeric)
        .map(|c| c as char)
        .take(n)
        .collect::<String>()
}

use rpds::HashTrieMapSync as PersistantHashMap;

static GLOBAL_MAP: LazyLock<ArcSwap<PersistantHashMap<String, i64>>> =
    LazyLock::new(|| ArcSwap::from_pointee(PersistantHashMap::default()));

fn main() {
    std::thread::spawn(|| {});
    let mut my_map: PersistantHashMap<String, i64> = PersistantHashMap::default();
    let mut cnt = 0;
    let mut last_cnt = 0;
    let mut last_instant = Instant::now();
    for i in 1..=2 {
        std::thread::spawn(move || {
            let mut cnt = 0;
            let mut last_cnt = 0;
            let mut last_instant = Instant::now();
            let mut result = 0;
            loop {
                let key = rand_string(3);
                let load = GLOBAL_MAP.load();
                let value = load.get(&key);
                result += value.copied().unwrap_or(0);
                cnt += 1;
                if cnt % 1024 == 0 {
                    let current_instant = Instant::now();
                    if current_instant - last_instant > Duration::from_secs(1) {
                        println!("get {i} {} --- {result}", cnt - last_cnt);
                        last_instant = current_instant;
                        last_cnt = cnt;
                    }
                }
            }
        });
    }
    loop {
        let key = rand_string(3);
        let value = rand::rng().random_range(100..1000);
        my_map.insert_mut(key, value);
        cnt += 1;
        if cnt % 1024 == 0 {
            GLOBAL_MAP.store(Arc::new(my_map.clone()));
            let current_instant = Instant::now();
            if current_instant - last_instant > Duration::from_secs(1) {
                println!("insert {}", cnt - last_cnt);
                last_instant = current_instant;
                last_cnt = cnt;
            }
        }
    }
}
Now the readers scale up easily without degrading overall performance, but the writer's performance is not great. Even without any reader threads, the write rate is ~200_000 inserts per second on my machine; when I remove the GLOBAL_MAP.store line it goes over ~2_000_000, a 10x difference.
Is it possible to improve the write rate? Maybe by offloading some of the work to the reader threads? The read rate is ~1_500_000 lookups per second, and I don't need it to be that fast.
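One direction I could experiment with is publishing snapshots on a time budget instead of every 1024 inserts. My guess (not verified) is that the slowdown comes from structural sharing: once a clone is published, later insert_mut calls have to copy shared trie nodes instead of mutating them in place. Publishing less often would then let more inserts hit unshared nodes, at the cost of readers seeing staler data. Below is a minimal sketch of such a throttle using only std. `PublishThrottle` and `should_publish` are hypothetical names of my own, not part of rpds or arc_swap.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper (not part of rpds or arc_swap): tells the writer when
/// enough time has passed to publish a fresh snapshot of the map.
pub struct PublishThrottle {
    interval: Duration,      // minimum time between two publishes
    check_every: u64,        // read the clock only once per this many writes
    writes_since_check: u64, // writes seen since the last clock read
    last_publish: Instant,   // time of the most recent publish
}

impl PublishThrottle {
    pub fn new(interval: Duration, check_every: u64, now: Instant) -> Self {
        Self {
            interval,
            check_every: check_every.max(1), // avoid a zero check period
            writes_since_check: 0,
            last_publish: now,
        }
    }

    /// Call once per insert. Returns `true` when the caller should publish.
    /// The `now` closure is invoked only on every `check_every`-th write,
    /// which keeps clock reads off the hot path.
    pub fn should_publish(&mut self, now: impl FnOnce() -> Instant) -> bool {
        self.writes_since_check += 1;
        if self.writes_since_check < self.check_every {
            return false;
        }
        self.writes_since_check = 0;
        let t = now();
        if t.duration_since(self.last_publish) >= self.interval {
            self.last_publish = t;
            true
        } else {
            false
        }
    }
}

// Intended use in the writer loop above (assumption: same GLOBAL_MAP and my_map):
//
//     let mut throttle =
//         PublishThrottle::new(Duration::from_millis(10), 1024, Instant::now());
//     ...
//     my_map.insert_mut(key, value);
//     if throttle.should_publish(Instant::now) {
//         GLOBAL_MAP.store(Arc::new(my_map.clone()));
//     }
```

The 10 ms interval is an arbitrary placeholder; the right value depends on how stale the readers are allowed to be, and I would have to measure whether this actually recovers the write rate.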