Leo Fang
tl;dr: For the Python 3.13 free-threading build (`cp313t`), the per-thread default stream is enabled and used by default. Users need to set `CUDA_PYTHON_CUDA_PER_THREAD_DEFAULT_STREAM=0` to explicitly opt out and restore the...
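A minimal sketch of the opt-out from Python, for readers who prefer not to export the variable in the shell. It assumes (this is not confirmed by the text above) that the variable is read when the CUDA bindings are first imported, so it is set before any such import. The helper `is_free_threaded_build` is a hypothetical name introduced here for illustration only.

```python
import os
import sysconfig


def is_free_threaded_build() -> bool:
    """Return True when running on a free-threading (no-GIL) CPython build."""
    # Py_GIL_DISABLED is 1 on free-threading builds and 0 or None otherwise.
    return bool(sysconfig.get_config_var("Py_GIL_DISABLED"))


# Assumption: the setting is picked up at import time of the CUDA bindings,
# so this assignment has to run before any `cuda` import in the process.
os.environ["CUDA_PYTHON_CUDA_PER_THREAD_DEFAULT_STREAM"] = "0"

print("free-threading build:", is_free_threaded_build())
```

Setting the variable in the launching shell instead avoids any dependence on import order.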
It seems hatchling (according to @rwgk in https://github.com/NVIDIA/cccl/pull/3201#issue-2751245974) and pip (#476) are confused by the conflict between our intention for `cuda` to be a namespace package and the fact that `cuda/__init__.py` still exists today.
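To illustrate the distinction at play, here is a small self-contained sketch of how Python itself tells a namespace package from a regular one: a directory without `__init__.py` yields a module spec whose `origin` is `None` (on Python 3.7+), while the presence of `__init__.py` makes it a regular package. The package names `demo_ns_pkg` and `demo_regular_pkg` and the helper `is_namespace_package` are made up for this example. It says nothing about how hatchling or pip perform their own detection.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path


def is_namespace_package(name: str) -> bool:
    """Return True if `name` resolves to a namespace package (no __init__.py)."""
    spec = importlib.util.find_spec(name)
    return (
        spec is not None
        and spec.origin is None  # namespace packages have no origin file
        and spec.submodule_search_locations is not None
    )


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Namespace-style layout: a directory with modules but no __init__.py.
    (root / "demo_ns_pkg").mkdir()
    (root / "demo_ns_pkg" / "core.py").write_text("")
    # Regular package layout: the directory contains __init__.py.
    (root / "demo_regular_pkg").mkdir()
    (root / "demo_regular_pkg" / "__init__.py").write_text("")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        ns_result = is_namespace_package("demo_ns_pkg")
        regular_result = is_namespace_package("demo_regular_pkg")
    finally:
        sys.path.remove(tmp)

print(ns_result, regular_result)
```

Running the same check against an installed `cuda` shows which of the two layouts the interpreter currently sees.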
During the investigation of #454 and related doc rendering issues, it has come to my attention that the type checking detection logic does not work well with Sphinx (e.g. https://github.com/sphinx-doc/sphinx/issues/13137, https://github.com/sphinx-doc/sphinx/issues/11225)....
I think Sphinx is able to figure out the types and keep them in the document by inspecting the typing info, which going forward should be our source of truth...
Prior art: https://docs.cupy.dev/en/stable/reference/cuda.html#texture-and-surface-memory
Currently this is blocked by MSVC not being pre-installed on the VM image that we use (#457). Based on past experience (#267), it takes way too long to just install...
Not all PRs would require full build/test pipelines (as we currently do today). For example, if a PR only touches code in `cuda.core`, then 1. we don't need to rebuild...
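As a rough sketch of the first step such a scheme would need, the snippet below classifies a PR's changed files by component. The directory names in `COMPONENT_DIRS`, the component label `cuda.bindings`, and the example file paths are assumptions made for illustration. The sketch only reports which components are touched and does not encode any policy about what to rebuild or retest.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from top-level repository directory to component name.
COMPONENT_DIRS = {
    "cuda_core": "cuda.core",
    "cuda_bindings": "cuda.bindings",
}


def touched_components(changed_files):
    """Return the set of components touched by the given changed file paths."""
    touched = set()
    for path in changed_files:
        parts = PurePosixPath(path).parts
        top_level = parts[0] if parts else ""
        # Anything outside the known component directories is grouped as "other".
        touched.add(COMPONENT_DIRS.get(top_level, "other"))
    return touched


# Example with made-up file paths: a PR that only edits cuda.core sources.
core_only = touched_components(["cuda_core/cuda/core/example.py"])
mixed = touched_components(["cuda_core/setup.py", "README.md"])
print(core_only, mixed)
```

A CI job could feed this the output of `git diff --name-only` against the base branch and use the result to decide which pipelines to trigger.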
Things to note:
- this is an experimental build that we reserve the right to stop offering at any time (just like the CPython cp313t build is!)
- the change in...
We need to continue providing the GIL build and additionally a free-threading (no-GIL) build.