cuda-python
CUDA Python: Performance meets Productivity
We should also rename `PARALLEL_LEVEL` to, e.g., `CUDA_PYTHON_PARALLEL_LEVEL` https://github.com/NVIDIA/cuda-python/blob/e1e332564c48db556212d59262a149b1a63285e8/setup.py#L31
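One way such a rename could be rolled out without breaking existing build scripts is sketched below. This is only an illustration, not the project's actual `setup.py` code: the helper name `get_parallel_level`, the default of `0`, and the deprecation fallback are all assumptions; only the two variable names come from the text above.

```python
import os
import warnings

NEW_NAME = "CUDA_PYTHON_PARALLEL_LEVEL"  # name proposed above
OLD_NAME = "PARALLEL_LEVEL"              # current, generic name


def get_parallel_level(env=None, default=0):
    """Hypothetical helper: read the build parallelism level.

    Prefers the new variable; falls back to the old one with a
    DeprecationWarning. Empty or unset values yield `default`.
    """
    env = os.environ if env is None else env
    if env.get(NEW_NAME):
        return int(env[NEW_NAME])
    if env.get(OLD_NAME):
        warnings.warn(
            f"{OLD_NAME} is deprecated; use {NEW_NAME} instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return int(env[OLD_NAME])
    return default
```

Passing the environment as a mapping keeps the helper easy to test; a build script would simply call `get_parallel_level()` to read `os.environ`.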
> I'd also like to use this opportunity to purge the doc artifacts from the `main`/release branches, following the common practice. In short:
> - The `main`/release branches should only...
```shell
cuda/bindings/_lib/ccudart/ccudart.cpp: In function 'cudaError_t __pyx_f_4cuda_8bindings_4_lib_7ccudart_7ccudart__cudaGraphExecGetFlags(cudaGraphExec_t, long long unsigned int*)':
cuda/bindings/_lib/ccudart/ccudart.cpp:52386:108: warning: invalid conversion from 'long long unsigned int*' to 'cuuint64_t*' {aka 'long unsigned int*'} [-fpermissive]
52386 | __pyx_t_3 =...
```
Currently, CUDA Python has a Cython-based reimplementation of the CUDA runtime on top of the CUDA driver APIs. This is a significant maintenance effort, requiring a lot of engineering...
OS/Archs:
```[tasklist]
### Tasks
- [x] linux-64, 1 GPU
- [ ] linux-aarch64, 1 GPU
- [ ] win-64, 1 GPU
- [ ] linux-64, 2 GPUs
- [ ]...
```
xref: #70

This is a highly experimental feature, currently for preview purposes only and not for production use. The current focus is on a correct, robust, and future-proof design of...
**The third iteration of this PR description is:**

Closes #453 (expected, but currently not tested)

This PR has these main aspects:
* All dynamic library loading (except libcuda) is moved...