cuCollections
cuCollections copied to clipboard
[FEA] Hash map design that supports reduce by key
Is your feature request related to a problem? Please describe.
I would like to be able to implement a hash-based reduce-by-key operation. A thrust::reduce_by_key
operation requires first doing a sort_by_key
. This means 2 passes through the input data and 2n
memory overhead. A hash based implementation requires only a single pass and n/0.7
memory overhead (assuming a 70% load factor).
Trying to implement a reduce-by-key with today's static_map
implementation is effectively impossible.
There's two ways you could implement a hash-based rbk (assume the reduction is sum
).
The first assumes that insert
returns an iterator to the slot where the inserted occurred, or the slot that contains an equivalent key:
template <typename Map, typename Key, typename Value>
hash_rbk(Map m, Key key_begin, Key key_end, Value value_begin){
auto found = map.insert(key_begin + tid, value_begin + tid); // assume `insert` returns an iterator to the slot where the insert occurred
// key already exists, accumulate with the existing value
if(found.second){
auto& found_pair = (found.first)
found_pair.second.fetch_add(value_begin + tid);
}
}
The second assumes you can do concurrent insert and find:
template <typename Map, typename Key, typename Value>
hash_rbk(Map m, Key key_begin, Key key_end, Value value_begin){
auto found = map.find(key_begin + tid); // look if the key exists
// key already exists, accumulate with the existing value
if(found != map.end()){
found.second.fetch_add(value_begin + tid);
} else { // key doesn't exist
auto found = map.insert(key_begin + tid, value_begin + tid);
if(not found.second){ // someone else already did the insert, accumulate with their value
found.first.second.fetch_add(value_begin+tid);
}
}
}
However, both of these are impossible because:
- The
insert
function cannot return an iterator (due to the potential for a race condition caused by the back-to-back CAS algorithm), and instead returns abool
.- This results from the fact that we use
pair<atomic<K>, atomic<V>>
as the slot storage type and use back to back CAS. If we instead usedatomic<pair<K,V>>
we could return an iterator
- This results from the fact that we use
- You cannot do concurrent insert/find operations.
- This results from the fact that we use
pair<atomic<K>, atomic<V>>
as the slot storage type and use back to back CAS. If we instead usedatomic<pair<K,V>>
we could support concurrent insert and find.
- This results from the fact that we use
Describe the solution you'd like
Explore a hash map implementation that allows implementing a reduce by key.
One option is to use a atomic<pair<K,V>>
implementation, but that has it's own drawbacks. See https://docs.google.com/presentation/d/1_UJlQoDc985sj03grMroB2zcCbiiC7-ua_1Eqm4qrRU/edit?usp=sharing