Leo Fang comments

Results: 1175 comments of Leo Fang

[ENH]: Clean up `SUPPORTED_...` variables between load libs and find header dirs

Capturing what I shared with Ralf offline, since it was not captured above. I don't think the CTK/non-CTK distinction is super important. Our public API names, such as `find_nvidia_header_directory`, have...

Support converting arbitrary objects to `StridedMemoryView` in `cuda.core.launch()`

During the meeting today, @kkraus14 suggested that we insert a conversion to `StridedMemoryView` at the end of the loop in `ParamHolder` so that it is...

Test `numba_debug` in the libNVVM path too?

@ayermolo is this something you can help with? 🙂

Test `numba_debug` in the libNVVM path too?

Yes. Our eventual goal (https://github.com/NVIDIA/numba-cuda/issues/128) is to replace everything that numba-cuda uses internally to wrap CUDA functionality with cuda-core, so once `Program` is ready for consuming and compiling NVVM IR...

Test `numba_debug` in the libNVVM path too?

Whenever you get some free cycles I guess? 🙂

Address known thread-safety issues in `cuda.core`

Example: https://github.com/NVIDIA/cuda-python/blob/027ba105fc33ac148c6d7343a9f71801e9f5ff72/cuda_core/cuda/core/experimental/_launcher.pyx#L20-L35

Address known thread-safety issues in `cuda.core`

Sebastian's comment here is relevant: https://github.com/NVIDIA/cuda-python/pull/1390#discussion_r2626380632

[FEA]: Support for finding `libcudadevrt.a` through pathfinder

> `numba-cuda` needs to [be able to find `libcudadevrt.a`](https://github.com/NVIDIA/numba-cuda/blob/e86aeab63d0aaa468abc0529f06b63550c42298a/numba_cuda/numba/cuda/codegen.py#L261-L264) to launch kernels involving cooperative groups

@brandon-b-miller could you confirm if this is true? AFAIK cooperative launch does not require libcudadevrt....

[FEA]: Support for finding `libcudadevrt.a` through pathfinder

That's why I would like us to test it locally before determining the priority, because we have the same `grid.sync()` test in the CI and I am pretty sure libcudadevrt...

[FEA]: Support for finding `libcudadevrt.a` through pathfinder

It seems this is needed when `-rdc=true` is passed. Does numba-cuda always do that? I forgot.