dashmap icon indicating copy to clipboard operation
dashmap copied to clipboard

`capacity` behavior does not align with its documentation

Open zacknewman opened this issue 1 year ago • 4 comments

Either there is a bug in code, or the documentation needs to be changed for capacity. The documentation states (emphasis added):

Returns how many key-value pairs the map can store without reallocating.

The following code illustrates a reallocation occurs before inserting more than capacity key-value pairs.

use dashmap::DashMap;
fn main() {
    let map = DashMap::with_capacity(192);
    let cap = map.capacity();
    for i in 0..cap {
        map.insert(i, ());
    }
    assert_eq!(cap, map.len());
    assert_ne!(cap, map.capacity());
}
[zack@laptop src]$ uname -a
Linux laptop 6.10.10-arch1-1 #1 SMP PREEMPT_DYNAMIC Thu, 12 Sep 2024 17:21:02 +0000 x86_64 GNU/Linux
[zack@laptop src]$ cargo -V
cargo 1.81.0 (2dbb1af80 2024-08-20)

Unsurprisingly, the same problem exists for DashSet.

zacknewman avatar Sep 30 '24 22:09 zacknewman

I can sporadically trigger a reallocation in as few as 25 inserts (i.e., after ≈ 13% of the originally reported capacity—which is 192 on my machine—is added).

zacknewman avatar Sep 30 '24 22:09 zacknewman

Is it possible to calculate a lower bound like HashMap::capacity?

Returns the number of elements the map can hold without reallocating.

This number is a lower bound; the HashMap<K, V> might be able to hold more, but is guaranteed to be able to hold at least this many.

zacknewman avatar Sep 30 '24 22:09 zacknewman

iirc many hashmaps have a fill-factor for reallocating, which is basically the fraction of slots to be occupied before reallocating (unlike Vec, which only reallocates once its underlying allocation is full). I would image capacity here means the theoretically maximum number of non-colliding members whereas you would expect it to be fill_factor * num_entry_slots?

I am not well versed with dashmap, so this is really just a thought on why this might be the case.

cactusdualcore avatar Feb 03 '25 12:02 cactusdualcore

I'm aware of what fill-factor is. The point of this issue is that the documentation is not correct, and the ideal fix would be to change the implementation such that it returns a lower bound on the number of items that can be inserted without causing a reallocation à la HashMap. Without this, there is no way one can ensure a DashMap does not grow past a certain threshold that would cause a re-allocation forcing one to use something like a HashMap instead.

zacknewman avatar Feb 03 '25 16:02 zacknewman