Is there a concurrent_hash_set container ?
Hi,
I see there is a concurrent_hash_map container but no concurrent_hash_set container in TBB. Is it possible to have concurrent_hash_set container ?
The concurrent_unordered_set seems to have poor performance in my application in the clear() operation.
Thanks,
We do not have concurrent_hash_set in oneTBB but you still can use concurrent_hash_map and just ignore the value (use something like bool to save some memory).
@kboyarinov what do you think is it possible to extend concurrent_hash_map to support the set behavior?
@pavelkumbrasev @fdiedler
I think in theory it is possible to add a hash set container on top of concurrent_hash_map engine. Could you please explain your use case for such case as well as the performance issues with concurrent_unordered_set::clear to proper understand the motivation for such an extension?
Thanks in advance
@kboyarinov @pavelkumbrasev
Let's take this simple example (I omitted the includes for readability) :
struct Foo
{
std::bitset<512> field1;
int field2;
struct Equality {
inline bool operator()(const Foo& a, const Foo& b) const
{
return (a.field1 == b.field1 && a.field2 == b.field2);
};
};
struct Hash {
inline size_t operator()(const Foo& n) const
{
return (n.field2 ^ std::hash<std::bitset<512>>()(n.field1));
};
};
};
// Example from https://www.intel.com/content/www/us/en/docs/onetbb/developer-guide-api-reference/2021-12/concurrent-hash-map.html
struct MyHashCompare {
static size_t hash( const Foo& n ) {
return (n.field2 ^ std::hash<std::bitset<512>>()(n.field1));
}
static bool equal( const Foo& a, const Foo& b ) {
return (a.field1 == b.field1 && a.field2 == b.field2);
}
};
int main()
{
using duration_t = std::chrono::duration<int, std::milli>;
std::chrono::time_point<std::chrono::high_resolution_clock> start, end;
//tbb::concurrent_unordered_set<Foo, Foo::Hash, Foo::Equality> unorderedSet;
tbb::concurrent_hash_map<Foo, Foo, MyHashCompare> unorderedSet;
start = std::chrono::high_resolution_clock::now();
for (size_t i=0; i<10000000; ++i)
{
Foo tmp;
tmp.field1[i % 512] = 1;
tmp.field2 = i;
unorderedSet.insert({tmp, tmp}); // for concurrent_unordered_set : unorderedSet.insert(tmp);
}
end = std::chrono::high_resolution_clock::now();
std::cout << "Size = " << unorderedSet.size() << " add in " << std::chrono::duration_cast<duration_t>(end - start).count() << " ms " << std::endl;
start = std::chrono::high_resolution_clock::now();
unorderedSet.clear();
end = std::chrono::high_resolution_clock::now();
std::cout << "Time to clear = " << std::chrono::duration_cast<duration_t>(end - start).count() << " ms";
return 0;
}
The output on my computer (Intel(R) Core(TM) i7-9750H CPU / Windows 10) :
Size = 10000000 add in 4661 ms Time to clear = 1942 ms
If I use a tbb::concurrent_unordered_set instead of the tbb::concurrent_hash_map
Size = 10000000 add in 6649 ms Time to clear = 2967 ms
The Hash map clear function is really faster (note that adding elements is also faster with Hash map). But I don't need the {Key -> Value} and that is why it would benice to have a tbb;;concurrent_hash_set with better performance than tbb::concurrent_unordered_set.
Thanks,