dense_hash_map icon indicating copy to clipboard operation
dense_hash_map copied to clipboard

Benchmarks vs google dens hash map and flat_hash_map

Open aminya opened this issue 5 years ago • 7 comments

Could you add some benchmarks for the library

https://github.com/skarupke/flat_hash_map

aminya avatar Jan 08 '21 05:01 aminya

a good hash map, it's quite same as one of my emhash8(https://github.com/ktprime/emhash/blob/master/hash_table8.hpp) design I have added it into my bench (https://github.com/ktprime/emhash/blob/master/bench/ebench.cpp).

ktprime avatar Feb 13 '22 05:02 ktprime

if index moves from node to bucket, your hashmap will be more efficient(memory and performance)

ktprime avatar Jun 23 '22 10:06 ktprime

Hi @ktprime would you mind expanding a bit more on your idea of moving the index?

Jiwan avatar Jun 23 '22 18:06 Jiwan

I use the following bucket vector as index(metadata)

    struct Bucket
    {
        size_type next;   //next collision index in current bucket vector. 
        size_type index; //index in node vector
    };

.....
Bucket bucket_[];
NodeType node[]; //std::pair<key,value> NodeType;

your current implemention is simple(use integer index/vector to replace chained map's pointer/list) and also efficient, but it's slow for finding miss because of serach/comparation in node vector may produces more cache miss than in bucket vector(small dataset).

ktprime avatar Jun 24 '22 00:06 ktprime

huangyuanbing1@ubuntu:~/map_benchmark/build ``` ./bench_jiwan_dense_hash_map__robin_hood_hash IterateIntegers

"jg::dense_hash_map"; "robin_hood::hash"; "IterateIntegers"; "01"; "iterate while adding"; 20833333325000; 0.63688; 2.08984
"jg::dense_hash_map"; "robin_hood::hash"; "IterateIntegers"; "02"; "iterate while removing"; 62498750000000; 0.636369; 2.08984

huangyuanbing1@ubuntu:~/map_benchmark/build
./bench_ktprime_hash_table8__robin_hood_hash IterateIntegers

"emhash8::HashMap"; "robin_hood::hash"; "IterateIntegers"; "01"; "iterate while adding"; 20833333325000; 0.566379; 1.56641
"emhash8::HashMap"; "robin_hood::hash"; "IterateIntegers"; "02"; "iterate while removing"; 62498750000000; 0.558062; 1.56641

at present the two hashmap(dense hash map) is the fastest iteration implemention compared with other's flat hash map in martinus's benchmark code.

ktprime avatar Jun 24 '22 01:06 ktprime

Right. I am getting what you mean now. By moving the next index into the bucket's structure, you are removing some unnecessary gaps in between the nodes.

I believe that a colleague of mine tried that strategy on our dense_hash_map but couldn't see any significant improvements and we abandoned that idea. But it seems that you have reproducible results which are really interesting. Thanks for pointing it out 👍

I will think of the trade-offs of doing it this way (if there is any of course) and could attempt to improve that hash_map.

Jiwan avatar Jun 24 '22 20:06 Jiwan