Some notes for the big update

Open martinus opened this issue 3 years ago • 4 comments
  • don't split up benchmark results by hash
  • Maybe split up into open address hashing and chained hashing, or node based
    • Split up into 2 categories: open address hashing (boost maps, std::unordered_map), and all of them.
  • Having a filter for the results would be nice
  • Disable zoom? Or at least make the view wider
  • Add one summary page with the geomean of all find & insert benchmarks (except the ctor benchmarks)
  • Add a conclusion page:
    • Use a reasonable hash that spreads entropy from the upper bits to the lower bits. The identity hash of std::hash or boost::hash was and is a bad idea. The hash doesn't need to withstand randomness tests; mumx seems to be good enough.
    • Use a pool allocator (boost or PoolAllocator). It's faster and uses much less RAM.
  • Create a sortable table: X axis benchmark, y axis map & hash. With one entry that's the GEOMEAN.
    • See https://cpu.userbenchmark.com/
    • See https://betterprogramming.pub/sort-and-filter-dynamic-data-in-table-with-javascript-e7a1d2025e3c
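The geomean mentioned for the summary page and the table could be computed roughly as follows. This is only a minimal sketch: the function name `geomean` and the choice to return 0.0 for an empty input are assumptions made for illustration, not code from map_benchmark.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Geometric mean of benchmark timings (hypothetical helper).
// It averages the logarithms instead of multiplying the values directly,
// which avoids overflow when many timings are combined.
// Assumes every value is strictly positive.
double geomean(const std::vector<double>& values) {
    if (values.empty()) {
        return 0.0;  // convention chosen for this sketch only
    }
    double log_sum = 0.0;
    for (double v : values) {
        log_sum += std::log(v);
    }
    return std::exp(log_sum / static_cast<double>(values.size()));
}
```

Unlike the arithmetic mean, the geometric mean is not dominated by the single slowest benchmark, which is why it is a common choice for a one-number summary across benchmarks of very different durations.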

martinus avatar Jun 17 '22 18:06 martinus

Most test cases use only a single hash map. Please add a new benchmark like this one, which uses many small hash maps:

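template<class hash_type>
void multi_small_ife(const std::string& hash_name, const std::vector<keyType>& vList)
{
#if KEY_INT
    size_t sum = 0; // accumulates results so the work cannot be optimized away
    // Roughly one small map per 1003 keys, and never fewer than 10 maps.
    const auto hash_size = vList.size() / 1003 + 10;
    const auto ts1 = getus(); // start time, taken before the maps are created

    // std::vector owns the maps, so no manual delete[] is needed.
    std::vector<hash_type> mh(hash_size);

    // Insert: distribute the keys across the small maps.
    for (const auto& v : vList) {
        auto hash_id = ((uint32_t)v) % hash_size;
        sum += mh[hash_id].emplace(v, 0).second;
    }

    // Find: look every key up in the map it was inserted into.
    for (const auto& v : vList) {
        auto hash_id = ((uint32_t)v) % hash_size;
        sum += mh[hash_id].count(v);
    }

    // Erase: even keys are erased; odd keys try v + 1, which is normally a miss.
    for (const auto& v : vList) {
        auto hash_id = ((uint32_t)v) % hash_size;
        sum += mh[hash_id].erase(v + v % 2);
    }

    // Reporting of sum and of the time elapsed since ts1 is not part of this snippet.
#endif
}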

ktprime avatar Jun 18 '22 01:06 ktprime

@ktprime I don't see what that benchmark adds that is not already covered by the other benchmarks?

martinus avatar Jun 20 '22 09:06 martinus

The code is copied from my benchmark: https://github.com/ktprime/emhash/blob/master/bench/ebench.cpp

ktprime avatar Jun 20 '22 10:06 ktprime

Looking at the benchmark code, I see that in the integer case the key and value always have the same type (int -> int, size_t -> size_t, uint64_t -> uint64_t). Could you add or modify some cases to use <key, value> pairs with different types, such as <uint64_t, int32_t>?
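
To illustrate the request, a benchmark body that takes the key and value types as separate template parameters might look like the sketch below. The helper name `insert_and_find` is hypothetical, and std::unordered_map is only a stand-in for whichever map is being measured; none of this is taken from map_benchmark.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical helper: inserts every key with a default value, then looks
// each key up again and returns the number of successful lookups.
// Key and Value are independent, so <uint64_t, int32_t> works as well as
// <uint64_t, uint64_t>.
template <class Key, class Value>
std::size_t insert_and_find(const std::vector<Key>& keys) {
    std::unordered_map<Key, Value> map;
    for (const auto& k : keys) {
        map.emplace(k, Value{});
    }
    std::size_t hits = 0;
    for (const auto& k : keys) {
        hits += map.count(k);
    }
    return hits;
}
```

With this shape, a mixed-type case is a different instantiation of the same benchmark body, for example `insert_and_find<uint64_t, int32_t>(keys)`.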

ktprime avatar Jul 05 '22 06:07 ktprime