Daniel Jünger
I totally get your point about avoiding writing a kernel that is already 90% existent in cuco. It's really about API flexibility: - Our host bulk APIs aren't very flexible,...
Correction: We use the `memcpy_async` approach only in multimap. For `static_map`/`static_set` we store the output directly: https://github.com/NVIDIA/cuCollections/blob/359f5ae67e93b69a8df35ebd1d12f746aac8916e/include/cuco/detail/static_map/kernels.cuh#L121 So using a fancy iterator should work in this scenario.
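To illustrate why a direct store makes fancy iterators viable, here is a minimal host-side sketch. It assumes the kernel writes each result via `*(out + i) = value`; `transforming_output` and `store_results` are hypothetical names for illustration, not cuco or Thrust APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fancy output iterator: applies `f` to each value before storing it.
template <typename T, typename F>
struct transforming_output {
  T* dst;
  F f;

  // Proxy returned by operator*; assignment runs the transform, then stores.
  struct proxy {
    T* slot;
    const F* f;
    template <typename V>
    void operator=(const V& v) const { *slot = (*f)(v); }
  };

  proxy operator*() const { return proxy{dst, &f}; }
  transforming_output operator+(std::ptrdiff_t n) const {
    return transforming_output{dst + n, f};
  }
};

// Stand-in for a kernel body that stores each result straight through the iterator.
// (Assumption: the real kernel uses the same `*(out + idx) = value` pattern.)
template <typename OutputIt>
void store_results(const std::vector<bool>& found, OutputIt out) {
  for (std::size_t i = 0; i < found.size(); ++i) {
    *(out + static_cast<std::ptrdiff_t>(i)) = found[i];
  }
}

// Small self-check: bools are converted to ints on the way out.
inline void check_store_results() {
  std::vector<int> dst(3, -1);
  auto to_int = [](bool b) { return b ? 1 : 0; };
  transforming_output<int, decltype(to_int)> out{dst.data(), to_int};
  store_results({true, false, true}, out);
  assert(dst[0] == 1 && dst[1] == 0 && dst[2] == 1);
}
```

A `memcpy_async`-style copy would need raw, contiguous destination memory, which is why this only works where the store goes through the iterator's `operator*`.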
Yes, the error you are seeing is due to us struggling with getting CTAD right 🙃 I wonder if this is something we can fix with a deduction guide?
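For reference, a minimal sketch of what a deduction guide looks like in general. `toy_set` is a hypothetical stand-in, not cuco's actual class, and whether a guide resolves the error discussed here is an open question.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <type_traits>
#include <vector>

// Hypothetical container: `Key` does not appear in the constructor signature,
// so implicit CTAD cannot deduce it.
template <typename Key>
class toy_set {
 public:
  template <typename InputIt>
  toy_set(InputIt first, InputIt last) : keys_(first, last) {}

  std::size_t size() const { return keys_.size(); }

 private:
  std::vector<Key> keys_;
};

// User-defined deduction guide: derive `Key` from the iterator's value type.
template <typename InputIt>
toy_set(InputIt, InputIt)
  -> toy_set<typename std::iterator_traits<InputIt>::value_type>;

// Small self-check: no template arguments are spelled out at the call site.
inline void check_toy_set() {
  std::vector<int> v{1, 2, 3};
  toy_set s(v.begin(), v.end());
  static_assert(std::is_same_v<decltype(s), toy_set<int>>);
  assert(s.size() == 3);
}
```

Without the guide, `toy_set s(v.begin(), v.end());` fails to compile because nothing ties `InputIt` to `Key`.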
Marking this as a draft since there are several details I'd like to discuss in the reviews. Also, some docs are still missing.
> Why is it a factory instead of just a constructor? Factory functions are largely defunct as of C++17 and the introduction of CTAD. I fiddled around with this and...
Superseded by #515
Yeah, that's true. Maybe we could push a development tag `0.1.0` to `dev` and use that as an intermediate?
Ok, then we can put this issue in the backlog for now.
Reopening as this is part of NVIDIA/libcudacxx#110 and the associated milestone
What's the state of this PR? This could be used to automatically catch bugs such as #510.