
Question: How to optimize this scenario?

Open HKalbasi opened this issue 8 months ago • 0 comments

I have one writer thread that inserts entries into a hash map, and multiple reader threads that look up values in it. Instead of using an RwLock with massive contention, I decided to use rpds and arc_swap to occasionally publish clones of the map to the readers, so each reader can read without blocking the writer.

Here is my code:

```rust
use std::{
    sync::{Arc, LazyLock},
    time::{Duration, Instant},
};

use arc_swap::ArcSwap;
use rand::Rng;

use tikv_jemallocator::Jemalloc;
#[global_allocator]
static GLOBAL: Jemalloc = Jemalloc;

// Generates a random alphanumeric string of length `n`.
fn rand_string(n: usize) -> String {
    rand::rng()
        .sample_iter(rand::distr::Alphanumeric)
        .map(|c| c as char)
        .take(n)
        .collect::<String>()
}

use rpds::HashTrieMapSync as PersistantHashMap;

// Latest published snapshot of the map, shared with all reader threads.
static GLOBAL_MAP: LazyLock<ArcSwap<PersistantHashMap<String, i64>>> =
    LazyLock::new(|| ArcSwap::from_pointee(PersistantHashMap::default()));

fn main() {
    std::thread::spawn(|| {});

    // The writer's private working copy of the map.
    let mut my_map: PersistantHashMap<String, i64> = PersistantHashMap::default();
    let mut cnt = 0;
    let mut last_cnt = 0;
    let mut last_instant = Instant::now();

    // Reader threads: look up random keys in the latest snapshot.
    for i in 1..=2 {
        std::thread::spawn(move || {
            let mut cnt = 0;
            let mut last_cnt = 0;
            let mut last_instant = Instant::now();
            let mut result = 0;

            loop {
                let key = rand_string(3);
                let load = GLOBAL_MAP.load();
                let value = load.get(&key);
                result += value.copied().unwrap_or(0);
                cnt += 1;
                // Check the clock only every 1024 lookups; report about once per second.
                if cnt % 1024 == 0 {
                    let current_instant = Instant::now();
                    if current_instant - last_instant > Duration::from_secs(1) {
                        println!("get {i} {}   --- {result}", cnt - last_cnt);
                        last_instant = current_instant;
                        last_cnt = cnt;
                    }
                }
            }
        });
    }

    // Writer loop: insert random entries into the working copy.
    loop {
        let key = rand_string(3);
        let value = rand::rng().random_range(100..1000);
        my_map.insert_mut(key, value);
        cnt += 1;
        if cnt % 1024 == 0 {
            // Publish a clone of the working copy every 1024 inserts.
            GLOBAL_MAP.store(Arc::new(my_map.clone()));
            let current_instant = Instant::now();
            if current_instant - last_instant > Duration::from_secs(1) {
                println!("insert {}", cnt - last_cnt);
                last_instant = current_instant;
                last_cnt = cnt;
            }
        }
    }
}
```

Readers now scale up easily without degrading overall performance, but the writer's performance is not great. Even without any reader threads, the write rate is ~200_000 inserts per second on my machine, but when I remove the GLOBAL_MAP.store line it rises above 2_000_000, so there is a 10x difference.
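One plausible contributor to this gap, stated as an assumption rather than a measured fact about rpds: if insert_mut copies any node that is still shared with a published clone (copy-on-write, in the spirit of Arc::make_mut), then every publish forces the following inserts to copy nodes instead of mutating them in place. The std-only sketch below illustrates that effect on a single node. `Node` and `push_cow` are hypothetical names made up for this illustration and are not part of rpds.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for one node of a persistent map (not an rpds type).
#[derive(Clone)]
struct Node {
    values: Vec<i64>,
}

/// Pushes `value` into the node copy-on-write and reports whether the
/// node itself had to be copied to do so.
fn push_cow(node: &mut Arc<Node>, value: i64) -> bool {
    let before = Arc::as_ptr(node);
    // `Arc::make_mut` mutates in place when this is the only reference,
    // and clones the node into a fresh allocation when it is shared.
    Arc::make_mut(node).values.push(value);
    before != Arc::as_ptr(node)
}

fn main() {
    let mut node = Arc::new(Node { values: vec![1, 2, 3] });

    // No snapshot is alive, so the node is mutated in place.
    let copied = push_cow(&mut node, 4);
    println!("copied without a snapshot: {copied}");

    // A live snapshot shares the node, so the next write must copy it first.
    let snapshot = Arc::clone(&node);
    let copied = push_cow(&mut node, 5);
    println!("copied with a live snapshot: {copied}");

    // The snapshot still sees the contents from before the second push.
    println!("snapshot len = {}, writer len = {}", snapshot.values.len(), node.values.len());
}
```

If the assumption holds for the map above, the clone itself is cheap, but each insert that follows a publish pays for copying the shared nodes on its path, and the replaced nodes are freed once the old snapshot is dropped.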

Is it possible to improve the writer rate? Maybe by offloading some of the work to the reader threads? Reader rate is ~1_500_000 and I don't need it to be that fast.

HKalbasi · Mar 08 '25 23:03