Leo Fang
@ichergui how did it go?
@ichergui could you try:
- build `cuda-python` 12.4.0 with CTK 12.4.0
- run the generated wheel with CTK 12.2.0
Preferably, `trap` is not used in device code: https://github.com/NVIDIA/cccl/issues/939#issuecomment-1802542072
It'd be nice to also get this page updated: https://docs.conda.io/projects/conda-build/en/latest/resources/package-spec.html
@prutschman-iv feel free to file a draft PR so that one of us can look at it and help you verify the issues :)
cc: @essoca for vis
To me Tyler's experience is quite common for code porting from CPU to GPU, way before Array API was a thing (e.g. CuPy offers a tutorial for writing [CPU/GPU agnostic...
If it's not added, and there is no plan to add it, what's wrong with "not planned"?
> > Does the proposed inspection API provide enough value to downstream libraries to justify inclusion? Notably, this proposal does not seek to standardize any keywords or stride-specific APIs [...]...
If we want to say "having `__dlpack__` and `__dlpack_device__` implemented is *the* proxy to check if strides exist," I am fine with it too, but we should document this. But...
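For illustration only, the "proxy" check mentioned above might look like the sketch below. The helper name `supports_dlpack` and the stand-in class `FakeStridedArray` are hypothetical, not part of any standard or library; the only assumption taken from the discussion is that an object exposing both `__dlpack__` and `__dlpack_device__` is treated as a strided array.

```python
def supports_dlpack(obj) -> bool:
    """Return True if obj exposes both DLPack protocol methods.

    Hypothetical helper: it only checks that the two methods exist and are
    callable; it does not export the object or inspect actual strides.
    """
    return callable(getattr(obj, "__dlpack__", None)) and callable(
        getattr(obj, "__dlpack_device__", None)
    )


class FakeStridedArray:
    """Hypothetical stand-in for an array type that implements DLPack."""

    def __dlpack__(self, stream=None):
        # A real library would return a DLPack capsule here.
        raise NotImplementedError("illustration only")

    def __dlpack_device__(self):
        # (device_type, device_id); 1 is kDLCPU in the DLPack device enum.
        return (1, 0)


print(supports_dlpack(FakeStridedArray()))
print(supports_dlpack([1, 2, 3]))
```

A plain Python list fails the check because it defines neither method, which is the behavior the proxy relies on; whether that is a sufficient signal for strides is exactly what would need to be documented.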