oneTBB icon indicating copy to clipboard operation
oneTBB copied to clipboard

Is there a concurrent_hash_set container ?

Open fdiedler opened this issue 1 year ago • 3 comments

Hi,

I see there is a concurrent_hash_map container but no concurrent_hash_set container in TBB. Is it possible to have concurrent_hash_set container ?

The concurrent_unordered_set seems to have poor performance in my application in the clear() operation.

Thanks,

fdiedler avatar Dec 12 '24 15:12 fdiedler

We do not have concurrent_hash_set in oneTBB but you still can use concurrent_hash_map and just ignore the value (use something like bool to save some memory). @kboyarinov what do you think is it possible to extend concurrent_hash_map to support the set behavior?

pavelkumbrasev avatar Jan 02 '25 10:01 pavelkumbrasev

@pavelkumbrasev @fdiedler I think in theory it is possible to add a hash set container on top of concurrent_hash_map engine. Could you please explain your use case for such case as well as the performance issues with concurrent_unordered_set::clear to proper understand the motivation for such an extension? Thanks in advance

kboyarinov avatar Jan 06 '25 12:01 kboyarinov

@kboyarinov @pavelkumbrasev

Let's take this simple example (I omitted the includes for readability) :

struct Foo
{
    std::bitset<512> field1;
    int field2;

    struct Equality {
        inline bool operator()(const Foo& a, const Foo& b) const
        {
            return (a.field1 == b.field1 && a.field2 == b.field2);
        };
    };

    struct Hash {
        inline size_t operator()(const Foo& n) const
        {
            return (n.field2 ^ std::hash<std::bitset<512>>()(n.field1));
        };
    };
};

// Example from https://www.intel.com/content/www/us/en/docs/onetbb/developer-guide-api-reference/2021-12/concurrent-hash-map.html
struct MyHashCompare {
    static size_t hash( const Foo& n ) {
        return (n.field2 ^ std::hash<std::bitset<512>>()(n.field1));
    }
    static bool equal( const Foo& a, const Foo& b ) {
        return (a.field1 == b.field1 && a.field2 == b.field2);
    }
};

int main()
{
    using duration_t = std::chrono::duration<int, std::milli>;
    std::chrono::time_point<std::chrono::high_resolution_clock> start, end;
    //tbb::concurrent_unordered_set<Foo, Foo::Hash, Foo::Equality> unorderedSet;
    tbb::concurrent_hash_map<Foo, Foo, MyHashCompare> unorderedSet;

    start = std::chrono::high_resolution_clock::now();
    for (size_t i=0; i<10000000; ++i)
    {
        Foo tmp;
        tmp.field1[i % 512] = 1;
        tmp.field2 = i;
        unorderedSet.insert({tmp, tmp}); // for concurrent_unordered_set : unorderedSet.insert(tmp);
    }
    end = std::chrono::high_resolution_clock::now();

    std::cout << "Size = " << unorderedSet.size() << " add in " << std::chrono::duration_cast<duration_t>(end - start).count() << " ms " << std::endl;

    start = std::chrono::high_resolution_clock::now();
    unorderedSet.clear();
    end = std::chrono::high_resolution_clock::now();

    std::cout << "Time to clear = " << std::chrono::duration_cast<duration_t>(end - start).count() << " ms";

    return 0;
}

The output on my computer (Intel(R) Core(TM) i7-9750H CPU / Windows 10) :

Size = 10000000 add in 4661 ms Time to clear = 1942 ms

If I use a tbb::concurrent_unordered_set instead of the tbb::concurrent_hash_map

Size = 10000000 add in 6649 ms Time to clear = 2967 ms

The Hash map clear function is really faster (note that adding elements is also faster with Hash map). But I don't need the {Key -> Value} and that is why it would benice to have a tbb;;concurrent_hash_set with better performance than tbb::concurrent_unordered_set.

Thanks,

fdiedler avatar Jan 08 '25 08:01 fdiedler