[ENHANCEMENT]: Update build-time check since the trunk CCCL allows 1B/2B `atomic_ref`
Is your feature request related to a problem? Please describe.
https://godbolt.org/z/a78x4qrha
Originally posted by @sleeepyjack https://github.com/NVIDIA/cuCollections/pull/549#discussion_r1684974753
With trunk CCCL, `cuda::atomic_ref` now works with 1-byte and 2-byte types, so the static asserts in the cuco data structures need to be updated accordingly, e.g.:
https://github.com/NVIDIA/cuCollections/blob/a7f87ac838320fe5b9fc9a90847d3f652ccec201/include/cuco/static_map_ref.cuh#L77-L78
https://github.com/NVIDIA/cuCollections/blob/e7b5a389b823bebe465cefe63ad7b0e95f7fb450/include/cuco/detail/open_addressing/open_addressing_ref_impl.cuh#L1239
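As a rough illustration of the relaxed check, here is a minimal host-side C++17 sketch. The trait `is_atomic_ref_compatible_v` and the struct `example_ref` are hypothetical names, not part of cuco or CCCL, and the accepted sizes (1, 2, 4 and 8 bytes) are an assumption based on the description above rather than a copy of the existing asserts.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical trait (not part of cuco): true when T is trivially copyable
// and has a size that trunk CCCL's cuda::atomic_ref is assumed to accept.
template <typename T>
inline constexpr bool is_atomic_ref_compatible_v =
  std::is_trivially_copyable_v<T> &&
  (sizeof(T) == 1 || sizeof(T) == 2 || sizeof(T) == 4 || sizeof(T) == 8);

// Hypothetical container ref showing where the build-time check would live.
template <typename Key>
struct example_ref {
  static_assert(is_atomic_ref_compatible_v<Key>,
                "Key must be trivially copyable and 1, 2, 4 or 8 bytes wide");
};

// 1-byte and 2-byte keys now pass the check.
template struct example_ref<std::uint8_t>;
template struct example_ref<std::uint16_t>;
```

Keeping the size rule in one trait means a future change in what `atomic_ref` accepts only needs to be reflected in a single place.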
Describe the solution you'd like
Update the build-time checks and the unit tests to cover 1-byte and 2-byte types.
Having the same build-time checks all over the library isn't ideal. Maybe this is an opportunity to refactor the whole insertion logic into a separate struct, `slot_inserter`, that internally decides which insertion strategy to use and also checks whether the input types can be handled at all. Let me draft something in Godbolt.
Something like this: https://godbolt.org/z/nvGGYaPWa
This would also solve #547
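For readers who cannot open the Godbolt link, the general shape of the idea might look like the host-side C++17 sketch below. It is not the linked draft: the strategy names, the `slot_type` layout and the size thresholds are all assumptions chosen for illustration, and the sketch only covers type validation and strategy selection, not the atomic operations themselves.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical strategy names; not necessarily the ones cuco uses.
enum class insert_strategy {
  packed_cas,        // whole slot fits in one atomic word: a single CAS
  back_to_back_cas,  // one CAS for the key, a second CAS for the value
  cas_then_write     // CAS on the key, then a plain store of the value
};

// Hypothetical struct that centralizes the build-time checks and picks
// an insertion strategy from the key and value types.
template <typename Key, typename Value>
struct slot_inserter {
  // Assumed slot layout, used only to measure the packed size.
  struct slot_type {
    Key first;
    Value second;
  };

  // The checks live here once instead of in every container ref.
  static_assert(std::is_trivially_copyable_v<Key> && std::is_trivially_copyable_v<Value>,
                "Key and Value must be trivially copyable");
  static_assert(sizeof(Key) <= 8, "Key must fit in a single atomic operation");

  // Simplified selection rule (an assumption): prefer the cheapest
  // strategy that the sizes allow.
  static constexpr insert_strategy strategy =
    sizeof(slot_type) <= 8 ? insert_strategy::packed_cas
    : sizeof(Value) <= 8   ? insert_strategy::back_to_back_cas
                           : insert_strategy::cas_then_write;
};
```

With this shape, a container ref would query `slot_inserter<Key, Value>::strategy` instead of repeating its own size checks, and unsupported types would fail in one place with one message.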