Michael Schellenberger Costa
> I tried removing the tuning patches in our branch-25.06 build locally, which uses CCCL 2.7.0, and it appears these patches are still desired. First, removing the `dispatch_reduce.cuh` patch...
@Artem-B am I remembering correctly that you were seeing some issues with those tests as well?
/ok to test 960fcb4
/ok to test a34947e
/ok to test 8b19091
/ok to test 43bbd9b
@frederick-vs-ja Thanks a lot for the reduced bug report. I reworked the tests so that we do not trigger the bug. I will wait until Monday, then merge.
> would it ever make sense to pass an async memory resource to `vector`? or should we have a separate `async_vector` for that? I believe it should be a separate...
@Artem-B So it seems SM_90a is not supported, which makes CI fail: > clang++: warning: CUDA version is newer than the latest partially supported version 12.1 [-Wunknown-cuda-version] > clang++: error:...
Also looks like we need to revisit our PTX detection scheme for clang: > ptxas /tmp/ptx-sm_90-4a12ff.s, line 116; error : Feature 'tensormap.cp_fenceproxy' requires PTX ISA .version 8.3 or later >...