Better names for CUDA kernels
Is your feature request related to a problem? Please describe.
It's difficult to match CUDA kernel names in profiles with locations in the code.
Describe the solution you'd like
You can pass the name keyword to an @cuda call (https://cuda.juliagpu.org/stable/api/compiler/#CUDA.cufunction).
Ideally the kernel name would give the broadcast expression and the file:lineno where it occurred, but that probably isn't possible (Broadcasted objects don't capture this information), so the next best thing might be a simple summary of the broadcast tree.
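To make "a simple summary of the broadcast tree" concrete, here is a rough sketch in plain Julia (no GPU needed). The helper name broadcast_summary and the output format are made up for illustration and are not part of CUDA.jl or Base; the sketch only relies on the f and args fields of Base.Broadcast.Broadcasted.

```julia
using Base.Broadcast: Broadcasted, broadcasted

# Hypothetical helper (not an existing API): walk a lazy broadcast tree
# and render it as "f(arg, arg, ...)".
broadcast_summary(bc::Broadcasted) =
    string(bc.f, "(", join(map(broadcast_summary, bc.args), ", "), ")")

# Array leaves: container name, element type and dimensionality only.
broadcast_summary(x::AbstractArray) =
    string(nameof(typeof(x)), "{", eltype(x), ",", ndims(x), "}")

# Any other leaf (scalars, etc.): just the type.
broadcast_summary(x) = string(typeof(x))

a = rand(Float32, 4)
b = rand(Float32, 4)

# Lazy equivalent of `sin.(a) .+ 2f0 .* b`, built without materializing it.
bc = broadcasted(+, broadcasted(sin, a), broadcasted(*, 2f0, b))

summary_str = broadcast_summary(bc)
println(summary_str)
```

Whether a string like this could be used directly as a kernel name is a separate question; it would presumably have to be sanitized first, and it still carries no source location.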
@vchuravy any ideas or suggestions?
The idea behind the mangling is explicitly that the NVIDIA tools can give you better results by demangling the name, so that the arguments are readable. cc: @maleadt