Yunsong Wang
> The performance in this case should be comparable to having a global memory hash table implicitly cached in L1.

This assumption is invalid. Closing the issue.
Using `cuda::std::span` may lose the benefit of the static extent.
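To illustrate what "losing the static extent" can cost, here is a minimal host-side sketch. `mini_span`, `dynamic_extent`, and `sum` are hypothetical stand-ins written for this comment; they are not the `cuda::std::span` implementation, only the same idea in miniature.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a span type; NOT the cuda::std::span implementation.
constexpr std::size_t dynamic_extent = static_cast<std::size_t>(-1);

// Static extent: the length lives in the type, so only a pointer is stored
// and size() is a compile-time constant.
template <typename T, std::size_t Extent = dynamic_extent>
struct mini_span {
  T* data;
  constexpr std::size_t size() const { return Extent; }
};

// Dynamic extent: the length has to be carried at runtime next to the pointer.
template <typename T>
struct mini_span<T, dynamic_extent> {
  T* data;
  std::size_t count;
  constexpr std::size_t size() const { return count; }
};

// With a static extent the loop bound is known at compile time (easy to unroll);
// with a dynamic extent it is a runtime load.
template <std::size_t Extent>
int sum(mini_span<int const, Extent> window)
{
  assert(window.data != nullptr);
  int total = 0;
  for (std::size_t i = 0; i < window.size(); ++i) { total += window.data[i]; }
  return total;
}

static_assert(sizeof(mini_span<int const, 4>) == sizeof(int const*),
              "static extent stores only the pointer");
static_assert(sizeof(mini_span<int const>) > sizeof(int const*),
              "dynamic extent also stores the size");
```

Both variants compute the same result; the difference is only in what the compiler knows up front and how much state each object carries.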
@sleeepyjack Is this still desired?
Notes from previous discussions:

- Related blog: https://www.foonathan.net/2020/03/iterator-sentinel/
- Iterator sentinel with custom range

```cpp
for (auto slot_content : custom_range(Probe(key))) {
  ...
  if (match(slot_content, key)) return true;
}
return false; // reach...
```
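A rough host-only sketch of the iterator/sentinel idea, assuming linear probing over a flat `int` slot array with `-1` as the empty marker. `probe_iterator`, `probe_sentinel`, `probe_range`, and `contains` are hypothetical names for illustration, not existing cuco types. It relies on C++17 range-based `for`, which allows `begin()` and `end()` to return different types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int empty_slot = -1;  // assumed sentinel value for an unused slot

struct probe_sentinel {};  // empty tag type: "probing is finished"

// Walks the slots linearly, wrapping around, for at most `capacity` steps.
class probe_iterator {
 public:
  probe_iterator(int const* slots, std::size_t capacity, std::size_t start)
    : slots_{slots}, capacity_{capacity}, index_{start % capacity}, visited_{0}
  {
    assert(capacity > 0);
  }

  int operator*() const { return slots_[index_]; }

  probe_iterator& operator++()
  {
    index_ = (index_ + 1) % capacity_;
    ++visited_;
    return *this;
  }

  // The end condition lives in the comparison against the sentinel.
  bool operator!=(probe_sentinel) const { return visited_ < capacity_; }

 private:
  int const* slots_;
  std::size_t capacity_;
  std::size_t index_;
  std::size_t visited_;
};

struct probe_range {
  int const* slots;
  std::size_t capacity;
  std::size_t start;

  probe_iterator begin() const { return {slots, capacity, start}; }
  probe_sentinel end() const { return {}; }
};

bool contains(std::vector<int> const& slots, int key)
{
  if (slots.empty()) { return false; }
  auto const start = static_cast<std::size_t>(key) % slots.size();
  for (int slot_content : probe_range{slots.data(), slots.size(), start}) {
    if (slot_content == key) { return true; }
    if (slot_content == empty_slot) { return false; }  // key cannot be further along
  }
  return false;  // every slot was probed without a match
}
```

The sentinel keeps the "have we wrapped around?" bookkeeping out of the loop body, so the query code only has to express the match and empty-slot checks.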
Separating the probing scheme from storage:

- The probing scheme provides integer indices
- Storage provides a pointer to the storage
- The probing scheme should be a callable object

https://godbolt.org/z/n8rrxer47
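The split described above could look roughly like the following host-only sketch. All names (`linear_probing`, `array_storage`, `insert`, `contains`) and the `-1` empty marker are assumptions made for illustration; this is not the code behind the Godbolt link.

```cpp
#include <cassert>
#include <cstddef>

constexpr int empty_slot = -1;  // assumed marker for an unused slot

// Probing scheme: a callable that only produces integer slot indices.
struct linear_probing {
  std::size_t capacity;

  std::size_t operator()(int key, std::size_t attempt) const
  {
    assert(capacity > 0);
    return (static_cast<std::size_t>(key) + attempt) % capacity;
  }
};

// Storage: only owns the slots and hands out a pointer to them.
template <std::size_t N>
struct array_storage {
  int slots[N];

  int* data() { return slots; }
  int const* data() const { return slots; }
  static constexpr std::size_t capacity() { return N; }
};

// The container logic combines the two without knowing their internals.
template <typename Storage, typename ProbingScheme>
bool insert(Storage& storage, ProbingScheme const& probe, int key)
{
  int* slots = storage.data();
  for (std::size_t attempt = 0; attempt < storage.capacity(); ++attempt) {
    auto const idx = probe(key, attempt);
    if (slots[idx] == key) { return false; }  // already present
    if (slots[idx] == empty_slot) {
      slots[idx] = key;
      return true;
    }
  }
  return false;  // table is full
}

template <typename Storage, typename ProbingScheme>
bool contains(Storage const& storage, ProbingScheme const& probe, int key)
{
  int const* slots = storage.data();
  for (std::size_t attempt = 0; attempt < storage.capacity(); ++attempt) {
    auto const idx = probe(key, attempt);
    if (slots[idx] == key) { return true; }
    if (slots[idx] == empty_slot) { return false; }
  }
  return false;
}
```

Because the probing scheme is just a callable returning indices, swapping in a different scheme (for example double hashing) would not touch the storage or the insert/query code.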
CG, range, and probing iterators: https://godbolt.org/z/a1hqvErfv :fire: :fire: :fire:
> Yes, the error you are seeing is due to us struggling with getting CTAD right 🙃 I wonder if this is something we can fix with a deduction guide?...
https://godbolt.org/z/4Y4WYfxcK @esoha-nvidia actually, your example would deduce just fine even with the current implementation. Are there any other examples you would like to see working?
Update: now that #346 has been merged, plain integers can be deduced as well. https://godbolt.org/z/Wsbr17fh5
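For readers unfamiliar with the mechanism, here is a small sketch of how a deduction guide can let a plain integer stand in for an extent type during CTAD. `extent`, `table`, and `check_deduction` are hypothetical and heavily simplified; this is not the cuco API nor the change made in #346, only the general C++17 technique.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

constexpr std::size_t dynamic_extent = static_cast<std::size_t>(-1);

// Hypothetical extent type: N is either a compile-time size or dynamic_extent.
template <std::size_t N = dynamic_extent>
struct extent {
  constexpr extent(std::size_t v = N) : value{v} {}
  std::size_t value;
};

// Hypothetical container whose capacity type is a template parameter.
template <typename Key, typename Extent>
class table {
 public:
  table(Extent capacity, Key empty_key) : capacity_{capacity}, empty_key_{empty_key} {}

  std::size_t capacity() const { return capacity_.value; }
  Key empty_key() const { return empty_key_; }

 private:
  Extent capacity_;
  Key empty_key_;
};

// Deduction guide: a plain integer capacity selects the dynamic extent type
// instead of deducing Extent as the integer type itself.
template <typename Integer,
          typename Key,
          typename = std::enable_if_t<std::is_integral_v<Integer>>>
table(Integer, Key) -> table<Key, extent<>>;

inline void check_deduction()
{
  table from_int(1000, -1);  // Extent deduced via the guide above
  static_assert(std::is_same_v<decltype(from_int), table<int, extent<>>>, "");
  assert(from_int.capacity() == 1000);

  table from_extent(extent<64>{}, -1);  // guide is SFINAE'd out, implicit deduction applies
  static_assert(std::is_same_v<decltype(from_extent), table<int, extent<64>>>, "");
  assert(from_extent.capacity() == 64);
}
```

Without the guide, `table(1000, -1)` would deduce `Extent` as `int`, and the class would then fail to compile where it expects an extent-like type.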
The test failure may be related to a recent change from https://github.com/NVIDIA/cuCollections/pull/394.