CuArrays.CURAND.curand missing methods
There seem to be a few missing methods for `curand`. The zero-argument method returns a 0-dimensional array, but should it? And the range method is not defined at all:
```julia
julia> x = CuArrays.CURAND.curand()
0-dimensional CuArray{Float32,0}:
0.15776986

julia> x = CuArrays.CURAND.curand(-1:2:1)
ERROR: MethodError: no method matching CuArray{Float32,N} where N(::UndefInitializer, ::StepRange{Int64,Int64})
Closest candidates are:
  CuArray{Float32,N} where N(::UndefInitializer, ::Tuple{Vararg{Int64,N}}) where {T, N} at C:\Users\user\.julia\dev\CuArrays\src\array.jl:45
  CuArray{Float32,N} where N(::UndefInitializer, ::Integer...) where T at C:\Users\user\.julia\dev\CuArrays\src\array.jl:46
  CuArray{Float32,N} where N(::Function, ::Any...) where T at C:\Users\user\.julia\dev\CuArrays\src\array.jl:56
Stacktrace:
 [1] #curand#22(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Type{Float32}, ::StepRange{Int64,Int64}, ::Vararg{Any,N} where N) at C:\Users\user\.julia\dev\CuArrays\src\rand\highlevel.jl:81
 [2] curand(::Type{Float32}, ::StepRange{Int64,Int64}) at C:\Users\user\.julia\dev\CuArrays\src\rand\highlevel.jl:81
 [3] #curand#26(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::StepRange{Int64,Int64}) at C:\Users\user\.julia\dev\CuArrays\src\rand\highlevel.jl:87
 [4] curand(::StepRange{Int64,Int64}) at C:\Users\user\.julia\dev\CuArrays\src\rand\highlevel.jl:87
 [5] top-level scope at none:0
```
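For comparison, the corresponding Base methods work fine on the CPU, which is the API shape a caller would expect here:

```julia
using Random

x = rand()        # scalar Float64 in [0, 1)
y = rand(-1:2:1)  # a single draw from the range, i.e. -1 or 1
println(x, " ", y)
```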
I am sure there are more, but these are the ones that came up in the context of https://github.com/luca-aki/TraceEstimation.jl, where a user-supplied rand function is expected to have the above methods defined.
CC: @luca-aki
We aim to be identical to Base (see e.g. these tests: https://github.com/JuliaGPU/GPUArrays.jl/blob/71a666f27707558d0af6cceb88409d6234e8e74a/src/testsuite/construction.jl#L64-L91), so these should indeed be fixed.
Upon closer inspection, I'm not sure about this. Scalars are inherently GPU-"incompatible", so I don't think it makes much sense to support that API. CuArrays is all about arrays; I don't think we should implement a GPU version of a non-array API, especially one that wouldn't work efficiently on GPUs at all.
Ranges seem more sensible, but they are tricky since we don't have any of the `Sampler` machinery.
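A minimal sketch of what range support could look like without the `Sampler` machinery, by mapping uniform draws onto range indices. The function name and approach are my own illustration, not existing CuArrays code; the same broadcast should also run over a `CuArray` of uniform draws:

```julia
# Illustrative only: sample n elements from a StepRange by drawing
# uniform Float32 values and mapping them to range indices.
function rand_range(r::StepRange, n::Integer)
    u = rand(Float32, n)  # on the GPU this would be a curand(Float32, n) call
    idx = min.(floor.(Int, u .* length(r)) .+ 1, length(r))
    return first(r) .+ (idx .- 1) .* step(r)
end

rand_range(-1:2:1, 4)  # 4-element Vector{Int64} with values -1 or 1
```

Note this gives a slightly biased sample only in the (measure-zero) case `u == 1.0`, which `rand` excludes, so each range element is drawn uniformly.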
Another one that doesn't seem to work:
```julia
julia> X = CuArrays.rand(Bool, 5)
[ Info: Building the CUDAnative run-time library for your sm_70 device, this might take a while...
ERROR: InvalidIRError: compiling #68(CuArrays.CuKernelState, CUDAnative.CuDeviceArray{NTuple{4,UInt32},1,CUDAnative.AS.Global}, CUDAnative.CuDeviceArray{Bool,1,CUDAnative.AS.Global}) resulted in invalid LLVM IR
Reason: unsupported dynamic function invocation (call to floattype)
Stacktrace:
 [1] gpu_rand at /home/tim/Julia/pkg/GPUArrays/src/random.jl:47
 [2] #68 at /home/tim/Julia/pkg/GPUArrays/src/random.jl:81
Stacktrace:
 [1] check_ir(::CUDAnative.CompilerJob, ::LLVM.Module) at /home/tim/Julia/pkg/CUDAnative/src/compiler/validation.jl:114
 [2] macro expansion at /home/tim/Julia/pkg/CUDAnative/src/compiler/driver.jl:188 [inlined]
 [3] macro expansion at /home/tim/Julia/depot/packages/TimerOutputs/7zSea/src/TimerOutput.jl:216 [inlined]
 [4] codegen(::Symbol, ::CUDAnative.CompilerJob; libraries::Bool, dynamic_parallelism::Bool, optimize::Bool, strip::Bool, strict::Bool) at /home/tim/Julia/pkg/CUDAnative/src/compiler/driver.jl:186
 [5] compile(::Symbol, ::CUDAnative.CompilerJob; libraries::Bool, dynamic_parallelism::Bool, optimize::Bool, strip::Bool, strict::Bool) at /home/tim/Julia/pkg/CUDAnative/src/compiler/driver.jl:47
 [6] #compile#134 at /home/tim/Julia/pkg/CUDAnative/src/compiler/driver.jl:36 [inlined]
 [7] macro expansion at /home/tim/Julia/pkg/CUDAnative/src/execution.jl:389 [inlined]
 [8] cufunction(::GPUArrays.var"#68#69"{Bool}, ::Type{Tuple{CuArrays.CuKernelState,CUDAnative.CuDeviceArray{NTuple{4,UInt32},1,CUDAnative.AS.Global},CUDAnative.CuDeviceArray{Bool,1,CUDAnative.AS.Global}}}; name::Nothing, kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /home/tim/Julia/pkg/CUDAnative/src/execution.jl:357
 [9] cufunction(::Function, ::Type) at /home/tim/Julia/pkg/CUDAnative/src/execution.jl:357
 [10] macro expansion at /home/tim/Julia/pkg/CUDAnative/src/execution.jl:174 [inlined]
 [11] macro expansion at ./gcutils.jl:105 [inlined]
 [12] macro expansion at /home/tim/Julia/pkg/CUDAnative/src/execution.jl:171 [inlined]
 [13] _gpu_call(::CuArrays.CuArrayBackend, ::Function, ::CuArray{Bool,1}, ::Tuple{CuArray{NTuple{4,UInt32},1},CuArray{Bool,1}}, ::Tuple{Tuple{Int64},Tuple{Int64}}) at /home/tim/Julia/pkg/CuArrays/src/gpuarray_interface.jl:60
 [14] gpu_call at /home/tim/Julia/pkg/GPUArrays/src/abstract_gpu_interface.jl:151 [inlined]
 [15] gpu_call(::Function, ::CuArray{Bool,1}, ::Tuple{CuArray{NTuple{4,UInt32},1},CuArray{Bool,1}}) at /home/tim/Julia/pkg/GPUArrays/src/abstract_gpu_interface.jl:128
 [16] rand!(::GPUArrays.RNG, ::CuArray{Bool,1}) at /home/tim/Julia/pkg/GPUArrays/src/random.jl:78
 [17] rand!(::CuArray{Bool,1}) at /home/tim/Julia/pkg/CuArrays/src/rand/random.jl:176
 [18] rand(::Type, ::Int64) at /home/tim/Julia/pkg/CuArrays/src/rand/random.jl:191
 [19] top-level scope at REPL[3]:1
```
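Until `floattype` handles `Bool`, one possible workaround is to threshold uniform floats. Shown here with Base `rand` for brevity; the same broadcast over a `CuArray` of `Float32` should yield a `CuArray{Bool}` (an untested suggestion, not an official API):

```julia
# Draw uniform Float32 values and threshold at 0.5 to get Bools.
# On the GPU this would be e.g. CuArrays.rand(Float32, 5) .< 0.5f0.
bits = rand(Float32, 5) .< 0.5f0
```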
Just tried this again; it still seems to be an issue:
```julia
julia> CUDA.rand(Bool, 10)
ERROR: StackOverflowError:
Stacktrace:
  [1] macro expansion
    @ ~/.julia/dev/CUDA/lib/cudadrv/libcuda.jl:84 [inlined]
  [2] macro expansion
    @ ~/.julia/dev/CUDA/lib/cudadrv/error.jl:97 [inlined]
  [3] cuDevicePrimaryCtxRetain
    @ ~/.julia/dev/CUDA/lib/utils/call.jl:26 [inlined]
  [4] CuContext
    @ ~/.julia/dev/CUDA/lib/cudadrv/context.jl:57 [inlined]
  [5] context(dev::CuDevice)
    @ CUDA ~/.julia/dev/CUDA/lib/cudadrv/state.jl:222
  [6] CUDA.TaskLocalState(dev::CuDevice) (repeats 2 times)
    @ CUDA ~/.julia/dev/CUDA/lib/cudadrv/state.jl:50
  [7] task_local_state!()
    @ CUDA ~/.julia/dev/CUDA/lib/cudadrv/state.jl:73
  [8] prepare_cuda_state()
    @ CUDA ~/.julia/dev/CUDA/lib/cudadrv/state.jl:88
  [9] initialize_context()
    @ CUDA ~/.julia/dev/CUDA/lib/cudadrv/error.jl:80
 [10] macro expansion
    @ ~/.julia/dev/CUDA/lib/cudadrv/libcuda.jl:344 [inlined]
 [11] macro expansion
    @ ~/.julia/dev/CUDA/lib/cudadrv/error.jl:97 [inlined]
 [12] cuMemGetInfo_v2(free::Base.RefValue{UInt64}, total::Base.RefValue{UInt64})
    @ CUDA ~/.julia/dev/CUDA/lib/utils/call.jl:26
 [13] info()
    @ CUDA.Mem ~/.julia/dev/CUDA/lib/cudadrv/memory.jl:754
 [14] CUDA.MemoryInfo()
    @ CUDA ~/.julia/dev/CUDA/src/pool.jl:145
 [15] OutOfGPUMemoryError(sz::Int64) (repeats 2 times)
    @ CUDA ~/.julia/dev/CUDA/src/pool.jl:198
 [16] throw_api_error(res::CUDA.cudaError_enum)
    @ CUDA ~/.julia/dev/CUDA/lib/cudadrv/error.jl:89
 [17] macro expansion
    @ ~/.julia/dev/CUDA/lib/cudadrv/error.jl:101 [inlined]
```