Leo Fang

Results: 278 issues by Leo Fang

It was an extremely rare (and pleasant!) case that we had a smooth transition to CUDA 13 (#9286, #9289), with the bundled CCCL version happening to support **THREE** CUDA major...

cat:enhancement
prio:high

Currently CuPy supports CUB and cuTENSOR as backends. Originally they were used only to speed up reduction routines like sum and prod, but they were later extended to several other things. I...

st:needs-discussion

Follow-up of #8442. In #8442 we left room like [this](https://github.com/cupy/cupy/blob/b73ba5352850a84d1a86bcdad127619fee892fbc/cupy/_core/core.pyx#L397) and [this](https://github.com/cupy/cupy/blob/b73ba5352850a84d1a86bcdad127619fee892fbc/cupy/cuda/memory.pyx#L953-L956) to also support using managed memory for UMP. However, there are a few minor kinks we need to...

cat:enhancement
prio:medium

### Describe your issue

The offending lines are calls like this:

```c++
(void)__Pyx_PyObject_FastCallDict;
(void)__Pyx_VectorcallBuilder_AddArgStr;
(void) __Pyx_PyObject_GetMethod;
(void) __Pyx_PyObject_CallOneArg;
(void) __Pyx_PyObject_CallNoArg;
```

`cuda.core` is an official CUDA Python project: https://nvidia.github.io/cuda-python/cuda-core/latest/index.html. It offers a Pythonic, self-contained, lightweight, and official interface over the CUDA programming model. For new Python projects, we encourage them to...

This task should be simple:

- Allow users to opt in and internally pass `cudaGraphInstantiateFlagDeviceLaunch` to `cudaGraphInstantiate()`
- Add code samples to showcase how this can be done

The majority of work...
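The opt-in described above could look roughly like the following. This is a minimal sketch, not the `cuda.core` API: the names `InstantiateFlag`, `GraphInstantiateOptions`, and `to_flags` are hypothetical, and the numeric value of `cudaGraphInstantiateFlagDeviceLaunch` is hard-coded here only to keep the sketch GPU-free; a real implementation would take it from the CUDA bindings.

```python
from dataclasses import dataclass
from enum import IntFlag


class InstantiateFlag(IntFlag):
    """Hypothetical stand-in for the CUDA runtime instantiate flags.

    Only the flag discussed here is mirrored. The value is assumed to match
    cudaGraphInstantiateFlagDeviceLaunch in the CUDA runtime headers; real
    code should read it from the bindings instead of hard-coding it.
    """

    DEVICE_LAUNCH = 4


@dataclass(frozen=True)
class GraphInstantiateOptions:
    """Hypothetical user-facing options; device launch is off by default."""

    device_launch: bool = False

    def to_flags(self) -> int:
        # Start from "no flags" and OR in each feature the user opted into.
        flags = InstantiateFlag(0)
        if self.device_launch:
            flags |= InstantiateFlag.DEVICE_LAUNCH
        # The resulting integer is what would be passed as the `flags`
        # argument of cudaGraphInstantiate().
        return int(flags)
```

Keeping the option a plain boolean means users never handle raw flag bits, and existing callers are unaffected because the default produces a flag value of zero.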

P0
feature
cuda.core

### Tasks for cuda-core and cuda-bindings patch release

- [ ] File an internal nvbug to communicate test plan & release schedule with QA
- [x] Ensure all pending PRs...

P0
cuda.core

- Explain the build system change/expectation
- Explain a typical development cycle (pure Python -> Cython -> C++)
- Explain the steps for cythonization

documentation
P1
cuda.core

### Is this a duplicate?

- [x] I confirmed there appear to be no [duplicate issues](https://github.com/NVIDIA/cuda-python/issues) for this bug and that I agree to the [Code of Conduct](CODE_OF_CONDUCT.md)

### Type...

bug
P0
cuda.core

The code at https://github.com/NVIDIA/cuda-python/blob/da7eb1f5a97aa21d8f78098e13e7c4edad013530/cuda_core/tests/test_program.py#L58-L132 fills a real need (e.g., https://github.com/NVIDIA/numba-cuda/pull/681). We should expose it in `cuda.bindings.utils`, similar to the PTX version helpers.

P1
feature
cuda.bindings