Leo Fang
Capturing what I shared with Ralf offline, since it was not captured above. I don't think the CTK/non-CTK distinction is super important. Our public API names, such as `find_nvidia_header_directory`, have...
During the meeting today, @kkraus14 suggested that we insert a conversion to `StridedMemoryView` at the end of the loop in `ParamHolder` so that it is...
@ayermolo is this something you can help with? 🙂
Yes. Our eventual goal (https://github.com/NVIDIA/numba-cuda/issues/128) is to replace everything that numba-cuda uses internally to wrap CUDA functionality with cuda-core, so once `Program` is ready for consuming and compiling NVVM IR...
Whenever you get some free cycles I guess? 🙂
Example: https://github.com/NVIDIA/cuda-python/blob/027ba105fc33ac148c6d7343a9f71801e9f5ff72/cuda_core/cuda/core/experimental/_launcher.pyx#L20-L35
Sebastian's comment here is relevant: https://github.com/NVIDIA/cuda-python/pull/1390#discussion_r2626380632
> `numba-cuda` needs to [be able to find `libcudadevrt.a`](https://github.com/NVIDIA/numba-cuda/blob/e86aeab63d0aaa468abc0529f06b63550c42298a/numba_cuda/numba/cuda/codegen.py#L261-L264) to launch kernels involving cooperative groups

@brandon-b-miller could you confirm if this is true? AFAIK cooperative launch does not require libcudadevrt....
That's why I would like us to test it locally before determining the priority, because we have the same `grid.sync()` test in the CI and I am pretty sure libcudadevrt...
It seems this is needed when `-rdc=true` is passed. Does numba-cuda always pass it? I forgot.