[FEA] Refactor of open address data structures

Open jrhemstad opened this issue 2 years ago • 8 comments

Is your feature request related to a problem? Please describe.

There is a significant amount of redundancy among the static_map/static_multimap/static_reduction_map classes. This is a large maintenance overhead and means optimizations made to one data structure do not translate to the others.

Furthermore, there are several configuration options we'd like to enable, like using AoS vs. SoA, scalar vs. CG operations, etc.

I'd also like to enable adding static_set and static_multiset classes that could share the same backend.

Describe the solution you'd like

List of things I'd like to address:

  • Use atomic_ref when possible (4B/8B key/value types) #183
  • Fall back on atomic when necessary (<4B, >8B key/value types)
    • For <4B, we should probably just widen them ourselves and still use atomic_ref instead of atomic.
  • Eliminate redundancy among static_map/static_reduction_map/static_multimap
  • Enable AoS vs. SoA layout (#103)
  • Enable statically sized device views
    • We should use a pattern like std::span with std::dynamic_extent to support both dynamically and statically sized capacities.
  • Enable adding static_set and static_multiset
  • Support the various insert schemes: packed/back2back/cas+dependent write
  • Switch between scalar/CG operations
  • Stream support everywhere (https://github.com/NVIDIA/cuCollections/issues/65)
  • Consistent use of bitwise_equal
  • Asynchronous size computation (#102, #237)
  • Rehashing (https://github.com/NVIDIA/cuCollections/issues/21)
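
To make the std::span/std::dynamic_extent item concrete, here is a minimal host-side sketch of how a capacity could be either a compile-time constant or a runtime value behind one interface. The names `extent`, `dynamic_extent`, and `slot_view` are hypothetical and only illustrate the pattern; they are not an existing cuCollections API.

```cpp
#include <cstddef>

// Hypothetical sentinel, mirroring the role of std::dynamic_extent.
constexpr std::size_t dynamic_extent = static_cast<std::size_t>(-1);

// Static case: the capacity is the template argument, nothing is stored.
template <std::size_t N = dynamic_extent>
struct extent {
  constexpr extent() = default;
  // Accepts (and ignores) a runtime size so both cases construct the same way.
  constexpr explicit extent(std::size_t) {}
  constexpr std::size_t value() const { return N; }
};

// Dynamic case: the capacity is carried as a runtime member.
template <>
struct extent<dynamic_extent> {
  constexpr explicit extent(std::size_t n) : n_(n) {}
  constexpr std::size_t value() const { return n_; }

 private:
  std::size_t n_;
};

// Hypothetical non-owning view over a slot array. Code using capacity()
// is identical whether N is static or dynamic.
template <typename Slot, std::size_t N = dynamic_extent>
class slot_view {
 public:
  constexpr slot_view(Slot* slots, extent<N> capacity)
    : slots_(slots), capacity_(capacity) {}
  constexpr std::size_t capacity() const { return capacity_.value(); }
  constexpr Slot* data() const { return slots_; }

 private:
  Slot* slots_;
  extent<N> capacity_;
};
```

With a static extent, `capacity()` is a compile-time constant, so operations such as the modulo in a probing sequence can be optimized by the compiler. A `slot_view<int>` built from `extent<>(n)` keeps the same interface for sizes known only at runtime.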

My current thinking is to create an open_address_impl class that provides an abstraction for a logical array of "slots" and exposes operations on those slots. All the core logic and switching for things like AoS/SoA, atomic_ref/atomic can/should be implemented in this common impl class.
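
As a rough illustration of that idea, the sketch below puts the AoS/SoA choice behind a storage policy, while a single `open_address_impl` owns the probing logic. It is a single-threaded host-side sketch under stated assumptions: `aos_storage`, `soa_storage`, the `key(i)`/`value(i)` accessors, and linear probing are all hypothetical choices, and the atomic_ref/atomic handling a real device implementation needs is deliberately left out.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Array-of-structs layout: key and value sit next to each other in one slot.
template <typename Key, typename Value>
class aos_storage {
 public:
  aos_storage(std::size_t capacity, Key empty_key)
    : slots_(capacity, std::pair<Key, Value>(empty_key, Value{})) {}
  std::size_t capacity() const { return slots_.size(); }
  Key& key(std::size_t i) { return slots_[i].first; }
  Value& value(std::size_t i) { return slots_[i].second; }

 private:
  std::vector<std::pair<Key, Value>> slots_;
};

// Struct-of-arrays layout: keys and values live in separate arrays.
template <typename Key, typename Value>
class soa_storage {
 public:
  soa_storage(std::size_t capacity, Key empty_key)
    : keys_(capacity, empty_key), values_(capacity) {}
  std::size_t capacity() const { return keys_.size(); }
  Key& key(std::size_t i) { return keys_[i]; }
  Value& value(std::size_t i) { return values_[i]; }

 private:
  std::vector<Key> keys_;
  std::vector<Value> values_;
};

// Common implementation: sees only a logical array of slots.
template <typename Key, typename Value, typename Storage>
class open_address_impl {
 public:
  open_address_impl(std::size_t capacity, Key empty_key)
    : storage_(capacity, empty_key), empty_key_(empty_key) {}

  // Linear-probing insert. Returns false if the key already exists
  // or no empty slot is found.
  bool insert(Key const& k, Value const& v) {
    std::size_t const n = storage_.capacity();
    if (n == 0) { return false; }
    std::size_t idx = std::hash<Key>{}(k) % n;
    for (std::size_t probes = 0; probes < n; ++probes) {
      if (storage_.key(idx) == empty_key_) {
        storage_.key(idx)   = k;
        storage_.value(idx) = v;
        return true;
      }
      if (storage_.key(idx) == k) { return false; }
      idx = (idx + 1) % n;
    }
    return false;
  }

  // Returns a pointer to the stored value, or nullptr if the key is absent.
  Value* find(Key const& k) {
    std::size_t const n = storage_.capacity();
    if (n == 0) { return nullptr; }
    std::size_t idx = std::hash<Key>{}(k) % n;
    for (std::size_t probes = 0; probes < n; ++probes) {
      if (storage_.key(idx) == k) { return &storage_.value(idx); }
      if (storage_.key(idx) == empty_key_) { return nullptr; }
      idx = (idx + 1) % n;
    }
    return nullptr;
  }

 private:
  Storage storage_;
  Key empty_key_;
};

// The layout becomes a one-line configuration choice.
template <typename K, typename V>
using aos_map = open_address_impl<K, V, aos_storage<K, V>>;
template <typename K, typename V>
using soa_map = open_address_impl<K, V, soa_storage<K, V>>;
```

Because the probing code is written once against `key(i)`/`value(i)`, an optimization made there applies to every layout, which is the maintenance benefit described above.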

jrhemstad • Oct 04 '21 15:10