Benchmarks vs google dense hash map and flat_hash_map
Could you add some benchmarks for the library?
https://github.com/skarupke/flat_hash_map
This is a good hash map; its design is quite similar to my emhash8 (https://github.com/ktprime/emhash/blob/master/hash_table8.hpp). I have added it to my benchmark (https://github.com/ktprime/emhash/blob/master/bench/ebench.cpp).
If the index moves from the node to the bucket, your hash map will be more efficient (in both memory and performance).
Hi @ktprime, would you mind expanding a bit more on your idea of moving the index?
I use the following bucket vector as the index (metadata):
```cpp
struct Bucket
{
    size_type next;  // next collision index in the bucket vector
    size_type index; // index into the node vector
};
.....
Bucket   bucket_[]; // metadata array, searched on lookup
NodeType node[];    // node array; NodeType is std::pair<Key, Value>
```
Your current implementation is simple (it uses integer indexes into a vector in place of a chained map's pointers/list) and also efficient, but it is slow on lookup misses, because searching and comparing in the node vector may produce more cache misses than searching in the bucket vector, whose entries are small metadata.
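Here is a minimal sketch of a lookup over this layout (the names, the power-of-two mask, and the `INACTIVE` sentinel are only my illustration, not the real emhash8 code, which also handles chain displacement):

```cpp
#include <cstdint>
#include <utility>
#include <vector>

using size_type = uint32_t;
static constexpr size_type INACTIVE = size_type(-1); // empty slot / end of chain

template <typename Key, typename Value, typename Hash>
struct SketchMap {
    struct Bucket {
        size_type next;   // next collision index in the bucket vector
        size_type index;  // index into the node vector
    };

    std::vector<Bucket> bucket_;              // metadata, sized to a power of two
    std::vector<std::pair<Key, Value>> node_; // densely packed key/value pairs
    Hash hasher_;

    size_type find_index(const Key& key) const {
        const size_type slot = hasher_(key) & (bucket_.size() - 1);
        if (bucket_[slot].index == INACTIVE)
            return INACTIVE; // fast miss: only the small bucket array was read
        // Walk the collision chain. The links live in bucket_, so every hop
        // stays inside the compact metadata; node_ is touched only for the
        // final key comparisons.
        for (size_type b = slot; b != INACTIVE; b = bucket_[b].next) {
            const size_type i = bucket_[b].index;
            if (node_[i].first == key)
                return i;
        }
        return INACTIVE;
    }
};
```

On a miss at an empty head slot the lookup returns without reading node memory at all, and even when a chain must be chased, every hop loads an 8-byte `Bucket` instead of a full node.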
```
huangyuanbing1@ubuntu:~/map_benchmark/build$ ./bench_jiwan_dense_hash_map__robin_hood_hash IterateIntegers
"jg::dense_hash_map"; "robin_hood::hash"; "IterateIntegers"; "01"; "iterate while adding"; 20833333325000; 0.63688; 2.08984
"jg::dense_hash_map"; "robin_hood::hash"; "IterateIntegers"; "02"; "iterate while removing"; 62498750000000; 0.636369; 2.08984

huangyuanbing1@ubuntu:~/map_benchmark/build$ ./bench_ktprime_hash_table8__robin_hood_hash IterateIntegers
"emhash8::HashMap"; "robin_hood::hash"; "IterateIntegers"; "01"; "iterate while adding"; 20833333325000; 0.566379; 1.56641
"emhash8::HashMap"; "robin_hood::hash"; "IterateIntegers"; "02"; "iterate while removing"; 62498750000000; 0.558062; 1.56641
```
At present these two hash maps (both dense hash maps) have the fastest iteration of all the flat hash map implementations in martinus's benchmark code.
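That iteration speed falls out of the layout: with the links kept in the metadata, the node array is a gap-free `std::vector` of pairs, so iterating degenerates into a linear scan. A sketch, reusing `node_` from above and assuming an integer value type:

```cpp
// No empty slots or tombstones to skip: every cache line loaded during
// iteration is full of useful key/value data.
uint64_t sum = 0;
for (const auto& [key, value] : node_)
    sum += value;
```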
Right, I get what you mean now. By moving the next index into the bucket structure, you remove some unnecessary gaps between the nodes.
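To make the difference concrete for myself, the two layouts would look roughly like this (hypothetical names, purely illustrative):

```cpp
#include <cstdint>
#include <utility>

using size_type = uint32_t;

// Roughly the current layout: the chain link is interleaved with the
// data, so the node array has a metadata gap between every pair.
template <typename Key, typename Value>
struct NodeWithLink {
    size_type next;             // collision link stored inside the node
    std::pair<Key, Value> pair; // the actual element
};

// The proposed layout: the link moves into the small bucket metadata,
// and the node array becomes a tightly packed sequence of pairs.
struct Bucket {
    size_type next;  // collision link, now outside the node array
    size_type index; // position of the element in the packed node vector
};
```

With the second layout, iteration never strides over chain links, which matches the numbers you posted.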
I believe that a colleague of mine tried that strategy on our dense_hash_map but couldn't see any significant improvements, so we abandoned the idea. But it seems that you have reproducible results, which is really interesting. Thanks for pointing it out 👍
I will think through the trade-offs of doing it this way (if there are any, of course) and may attempt to improve the hash map.