
Use NVTX.jl to give streams a name

Open simonbyrne opened this issue 2 years ago • 13 comments

We could add CUDA as a WeakDep?

simonbyrne avatar Jan 19 '23 16:01 simonbyrne

See this commented out block: https://github.com/JuliaGPU/CUDA.jl/blob/2ae53a46658f10b0be32ae61fb4e145782ed9208/lib/cudadrv/state.jl#L368-L373

Unfortunately these are not included in our JLL binaries.

simonbyrne avatar Mar 09 '23 21:03 simonbyrne

@maleadt do you know if nvtxNameCuStreamA is available in any of the CUDA packages? If not, we may need to build the NVTX library differently.

simonbyrne avatar Mar 09 '23 23:03 simonbyrne

Ok, these are exposed in the new JLLs. @maleadt do you have a preference if NVTX.jl should have CUDA.jl as a weak dep, or vice versa?

For an interface, I was thinking of adding a function along the following lines:

NVTX.name_resource(stream::CuStream, name::AbstractString)
NVTX.name_resource(context::CuContext, name::AbstractString)
NVTX.name_resource(device::CuDevice, name::AbstractString)

simonbyrne avatar Mar 13 '23 18:03 simonbyrne

Adding NVTX.jl as a weak dep to CUDA.jl seems like the more logical option to me, as it's CUDA.jl that needs to name streams when they get created. Or how would you implement the inverse?

maleadt avatar Mar 13 '23 19:03 maleadt

I think you would need users to manually name their own streams?

simonbyrne avatar Mar 13 '23 19:03 simonbyrne

> Adding NVTX.jl as a weak dep to CUDA.jl seems like the more logical option to me, as it's CUDA.jl that needs to name streams when they get created.

Alternatively: you could just add NVTX_jll as a dependency to CUDA, and directly ccall it there, similar to what is currently commented out: https://github.com/JuliaGPU/CUDA.jl/blob/2ae53a46658f10b0be32ae61fb4e145782ed9208/lib/cudadrv/state.jl#L368-L373

We could also then define the above functions here (with CUDA as a weak dep) for users who want to give more descriptive names to their streams?

simonbyrne avatar Mar 13 '23 19:03 simonbyrne

> I think you would need users to manually name their own streams?

Why? We can have CUDA at least provide a somewhat better name on stream construction, and the user can always improve afterwards.

> Alternatively: you could just add NVTX_jll as a dependency to CUDA, and directly ccall it there, similar to what is currently commented out: https://github.com/JuliaGPU/CUDA.jl/blob/2ae53a46658f10b0be32ae61fb4e145782ed9208/lib/cudadrv/state.jl#L368-L373

Why can't we do that through a weakdep? If the user isn't using NVTX.jl, I assume he wouldn't be interested in user-friendly names for CUDA.jl's streams, no?

maleadt avatar Mar 15 '23 11:03 maleadt

Can you call weakdeps from inside functions in the package? Or how would it work?

simonbyrne avatar Mar 15 '23 19:03 simonbyrne

Add name_stream(x) = nothing to CUDA.jl and have the weakdep provide a more specific name_stream(x::CuStream) = NVTX....? Or do something dynamic a la https://github.com/JuliaGPU/GPUCompiler.jl/blob/661dcef51a93f4ce76cb9446d7d8670cd97ea8ca/src/reflection.jl#L65-L69. @KristofferC may know what the best option is here (summary: we want CUDA.jl to call functionality from NVTX.jl when the package is available, using weakdeps so that we can express compatibility constraints).

maleadt avatar Mar 16 '23 08:03 maleadt

> (summary: we want CUDA.jl to call functionality from NVTX.jl when the package is available, using weakdeps so that we can express compatibility constraints).

If by "available" you mean that the package is loaded, you have two options:

  1. Overload some method in CUDA.jl in the extension. For example, have a default definition stream_name(x) = "default name", and in the extension have stream_name(x::CuStream) = NVTX.cool_stream_name(), and always call it with a CuStream.
  2. Do a dynamic check if the extension is loaded using Base.get_extension.

Extensions work best if the user provides something (like a type) in the weak dependency to dispatch on but that doesn't seem to be the case here.
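
As a rough illustration of the two options above, the sketch below uses plain modules to stand in for the packages, so it runs without CUDA.jl or NVTX.jl installed. The names HostPkg, HostPkgNVTXExt, stream_name, extension_loaded, and the stand-in CuStream struct are all hypothetical. In a real setup the second module would presumably live under ext/ and be declared in the [extensions] section of Project.toml.

```julia
# Stand-in for the parent package (the role CUDA.jl would play).
module HostPkg
    # Stand-in for the real stream type; the handle is just an integer here.
    struct CuStream
        handle::Int
    end

    # Option 1: generic fallback, used when no extension adds a better method.
    stream_name(x) = "default name"

    # Option 2: dynamic check. Base.get_extension (Julia 1.9+) returns the
    # extension module if it is loaded and `nothing` otherwise. It is only
    # defined here, not called, because these stand-in modules are not
    # real packages.
    function extension_loaded()
        return Base.get_extension(@__MODULE__, :HostPkgNVTXExt) !== nothing
    end
end

# Stand-in for the extension module, which would only be loaded when the
# weak dependency is present.
module HostPkgNVTXExt
    import ..HostPkg

    # More specific method: dispatch prefers it for CuStream arguments.
    HostPkg.stream_name(s::HostPkg.CuStream) = "stream $(s.handle)"
end

s = HostPkg.CuStream(7)
HostPkg.stream_name(s)        # picks the extension's method
HostPkg.stream_name(:other)   # still falls back to the generic method
```

With option 1 the parent package never has to ask whether the extension exists: it always calls stream_name, and dispatch selects the better method once it has been defined.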

KristofferC avatar Mar 16 '23 14:03 KristofferC

Since CUDA.jl now depends on NVTX.jl, how about the following:

  • add a function stub (name_resource?) to NVTX.jl
  • add the methods in CUDA.jl
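
A minimal sketch of how that split might look, using plain modules as stand-ins so it runs without either package. NVTXSketch, CUDASketch, NAMES, new_stream, and the integer handle are hypothetical. A real method would presumably forward to the NVTX C entry point (nvtxNameCuStreamA, mentioned earlier in the thread) rather than record names in a dictionary.

```julia
# Stand-in for NVTX.jl: owns the generic function but defines no methods.
module NVTXSketch
    # Function stub: declares the name so other packages can add methods.
    function name_resource end
end

# Stand-in for CUDA.jl: owns the resource types and adds the methods.
module CUDASketch
    import ..NVTXSketch

    # Stand-in for the real stream type; the handle is just an integer here.
    struct CuStream
        handle::Int
    end

    # Assumption: names are recorded in a Dict instead of being passed to
    # the NVTX C library, so the sketch has no binary dependency.
    const NAMES = Dict{Int,String}()

    function NVTXSketch.name_resource(stream::CuStream, name::AbstractString)
        NAMES[stream.handle] = String(name)
        return nothing
    end

    # Give each stream a basic name at construction; users can rename it later.
    function new_stream(handle::Int)
        stream = CuStream(handle)
        NVTXSketch.name_resource(stream, "stream $handle")
        return stream
    end
end

s = CUDASketch.new_stream(1)
NVTXSketch.name_resource(s, "halo exchange")  # user-supplied, more descriptive name
```

Because the methods are defined in the package that owns CuStream, no type piracy is involved and no package extension is needed.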

simonbyrne avatar Aug 25 '23 16:08 simonbyrne

Wouldn't it be simpler to add a high-level API for nvtxNameCuStream to NVTX.jl and have CUDA.jl call that on stream construction?

maleadt avatar Aug 25 '23 18:08 maleadt

CuStream is defined in CUDA.jl. We could do it via package extensions, but since CUDA.jl loads NVTX.jl, it seems easier to do it here.

simonbyrne avatar Aug 25 '23 20:08 simonbyrne