Leo Fang

Results 278 issues of Leo Fang

The fun part would be: How to keep a generic Python object alive? https://docs.nvidia.com/cuda/cuda-programming-guide/04-special-topics/cuda-graphs.html#cuda-user-objects

triage
feature
cuda.core
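
One possible answer to the question above, as a minimal sketch rather than what cuda.core does: take a strong reference when the object is handed over, and drop it from a destructor callback shaped like a CUDA host function (`void (*)(void*)`), which is the kind of callback `cuUserObjectCreate` accepts. The names `retain` and `release_pyobj` are hypothetical, the sketch assumes CPython (where `id()` is the object's address), and the actual registration with a CUDA user object is not shown.

```python
import ctypes

# Py_IncRef / Py_DecRef are CPython C-API functions; ctypes.pythonapi
# calls them with the GIL held.
_incref = ctypes.pythonapi.Py_IncRef
_incref.argtypes = [ctypes.c_void_p]
_incref.restype = None
_decref = ctypes.pythonapi.Py_DecRef
_decref.argtypes = [ctypes.c_void_p]
_decref.restype = None

# Same shape as a CUDA host function: void (*)(void* userData).
DESTROY_CB = ctypes.CFUNCTYPE(None, ctypes.c_void_p)


def retain(obj):
    """Take one strong reference and return the object's address.

    Hypothetical helper: the returned integer is what would be passed
    as the opaque pointer of a user object.
    """
    addr = id(obj)  # CPython-specific: id() is the PyObject* address
    _incref(addr)
    return addr


@DESTROY_CB
def release_pyobj(user_data):
    # Drop the reference taken in retain(); the object is freed here
    # if this was the last reference.
    _decref(user_data)
```

The callback object `release_pyobj` must itself stay alive for as long as native code may call it, otherwise ctypes frees the underlying trampoline.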

For example, we released `cuda-bindings` and `cuda-python` 13.1.0 yesterday, but we did not add `13.1.0-notes.rst` to https://github.com/NVIDIA/cuda-python/tree/main/cuda_python/docs/source/release.

bug
triage
P0
CI/CD

Currently this is low priority because there is no such thing as "libtile", only `tileiras`, which is an executable. We prefer in-process compilation through compiler libraries over subprocess calls to...

P1
feature
cuda.core
blocked

Capturing feedback provided by @xiakun-lu offline. The NCCL team noticed that `uv sync` complains that `nccl4py[cu12]` and `nccl4py[cu13]` are incompatible (`uv venv && uv pip install -e .` works out of...

support
triage
cuda.bindings
cuda.core

Instead of relying on stream capturing, which is considered an implementation detail (one that we could let users opt in to or out of in the future), our graph builder APIs were...

enhancement
triage
cuda.core

Follow-up to https://github.com/NVIDIA/cuda-python/pull/1216. We currently test the NVRTC path, but `cuda.core.Program` also covers libNVVM (and nvJitLink!), and we should get them tested too.

triage
test
cuda.core

- many cuda.core operations do not actually need an active CUDA context (aka a device that is set to current)
- some just need CUDA to be initialized, meaning `cuInit(0)`...

documentation
triage
cuda.core

- cuda.core does not allow any side calls to `cudaDeviceReset` or the like that tear down the primary contexts
- avoid multiple frees of the same buffer in the child process...

documentation
triage
cuda.core