CLArrays.jl

Error executing simple code: `OpenCL Error: OpenCL.Context error: ??`

Open davidbp opened this issue 7 years ago • 32 comments

Hello,

I installed the package without any errors and ran the following test:

using CLArrays
sizes = [100,500,1000]
for s in sizes
    srand(123)
    X = rand(Float32,s,s)
    Xcl = CLArray(X)
    aux1 = Xcl * Xcl
    println("result of the sum: ", sum(aux1))
end

which results in:

result of the sum: 249076.69
result of the sum: 3.ERROR (unhandled task failure): OpenCL Error: OpenCL.Context error: ??
Stacktrace:
 [1] raise_context_error(::String, ::String) at /Users/davidbuchaca1/.julia/v0.6/OpenCL/src/context.jl:109
 [2] macro expansion at /Users/davidbuchaca1/.julia/v0.6/OpenCL/src/context.jl:148 [inlined]
 [3] (::OpenCL.cl.##43#44)() at ./task.jl:335
1199882e7
result of the sum: 2.5038744e8

My versioninfo() is:

Julia Version 0.6.2
Commit d386e40c17 (2017-12-13 18:08 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin14.5.0)
  CPU: Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Prescott)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, broadwell)

davidbp, Apr 12 '18 08:04

I can't reproduce this :( What device are you using? You can use this to find out:

julia> CLArray(rand(10)) |> GPUArrays.device

SimonDanisch, Apr 12 '18 09:04

I just added println("Device:", GPUArrays.device(CLArray(rand(10))), "\n") at the top of the script, and I get:

Device:OpenCL.Device(AMD Radeon Pro 580 Compute Engine on Apple @0x0000000001021c00)

result of the sum: 249076.69
result of the sum: 3.ERROR (unhandled task failure): OpenCL Error: OpenCL.Context error: ????????
Stacktrace:
 [1] raise_context_error(::String, ::String) at /Users/davidbuchaca1/.julia/v0.6/OpenCL/src/context.jl:109
 [2] macro expansion at /Users/davidbuchaca1/.julia/v0.6/OpenCL/src/context.jl:148 [inlined]
 [3] (::OpenCL.cl.##43#44)() at ./task.jl:335
1199882e7
result of the sum: 2.5038744e8

About the device:

julia> CLArrays.init(devs[1])
OpenCL context with:
CL version: OpenCL 1.2 
Device: CL AMD Radeon Pro 580 Compute Engine
            threads: 256
             blocks: (256, 256, 256)
      global_memory: 8589.934592 mb
 free_global_memory: NaN mb
       local_memory: 0.032768 mb

davidbp, Apr 12 '18 09:04

By the way, I just tested the code from the README in CLArrays.

using CLArrays

for dev in CLArrays.devices()
    CLArrays.init(dev)
    x = zeros(CLArray{Float32}, 5, 5) # create a CLArray on device `dev`
end

# you can also filter with is_gpu, is_cpu
gpu_devices = CLArrays.devices(is_gpu)

This is what I get:

ERROR: LoadError: CLError(code=-52, CL_INVALID_KERNEL_ARGS)ERROR (unhandled task failure): OpenCL Error: OpenCL.Context error: ??^??
Stacktrace:
 [1] raise_context_error(::String, ::String) at /Users/davidbuchaca1/.julia/v0.6/OpenCL/src/context.jl:109
 [2] macro expansion at /Users/davidbuchaca1/.julia/v0.6/OpenCL/src/context.jl:148 [inlined]
 [3] (::OpenCL.cl.##43#44)() at ./task.jl:335
ERROR (unhandled task failure): OpenCL Error: OpenCL.Context error: 

Stacktrace:
 [1] raise_context_error(::String, ::String) at /Users/davidbuchaca1/.julia/v0.6/OpenCL/src/context.jl:109
 [2] macro expansion at /Users/davidbuchaca1/.julia/v0.6/OpenCL/src/context.jl:148 [inlined]
 [3] (::OpenCL.cl.##43#44)() at ./task.jl:335

Stacktrace:
 [1] macro expansion at /Users/davidbuchaca1/.julia/v0.6/OpenCL/src/macros.jl:6 [inlined]
 [2] (::CLArrays.CLFunction{GPUArrays.#const_kernel2,Tuple{CLArrays.KernelState,CLArrays.CLArray{Float32,2},Float32,UInt32},Tuple{(2, :ptr)}})(::Tuple{CLArrays.KernelState,CLArrays.CLArray{Float32,2},Float32,UInt32}, ::Tuple{Int64}, ::Tuple{Int64}, ::OpenCL.cl.CmdQueue) at /Users/davidbuchaca1/.julia/v0.6/CLArrays/src/compilation.jl:288
 [3] (::CLArrays.CLFunction{GPUArrays.#const_kernel2,Tuple{CLArrays.KernelState,CLArrays.CLArray{Float32,2},Float32,UInt32},Tuple{(2, :ptr)}})(::Tuple{CLArrays.KernelState,CLArrays.CLArray{Float32,2},Float32,UInt32}, ::Tuple{Int64}, ::Tuple{Int64}) at /Users/davidbuchaca1/.julia/v0.6/CLArrays/src/compilation.jl:272
 [4] _gpu_call(::Function, ::CLArrays.CLArray{Float32,2}, ::Tuple{CLArrays.CLArray{Float32,2},Float32,UInt32}, ::Tuple{Tuple{Int64},Tuple{Int64}}) at /Users/davidbuchaca1/.julia/v0.6/CLArrays/src/compilation.jl:18
 [5] gpu_call(::Function, ::CLArrays.CLArray{Float32,2}, ::Tuple{CLArrays.CLArray{Float32,2},Float32,UInt32}, ::Int64) at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/abstract_gpu_interface.jl:151
 [6] fill!(::CLArrays.CLArray{Float32,2}, ::Float32) at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/construction.jl:14
 [7] macro expansion at /Users/davidbuchaca1/Documents/git_stuff/learn_julia/GPU_compute/CLArrays/example_from_clarrays_readme.jl:5 [inlined]
 [8] anonymous at ./<missing>:?
 [9] include_from_node1(::String) at /Applications/Julia-0.6.app/Contents/Resources/julia/lib/julia/sys.dylib:?
 [10] include(::String) at /Applications/Julia-0.6.app/Contents/Resources/julia/lib/julia/sys.dylib:?
 [11] process_options(::Base.JLOptions) at /Applications/Julia-0.6.app/Contents/Resources/julia/lib/julia/sys.dylib:?
 [12] _start() at /Applications/Julia-0.6.app/Contents/Resources/julia/lib/julia/sys.dylib:?
while loading /Users/davidbuchaca1/Documents/git_stuff/learn_julia/GPU_compute/CLArrays/example_from_clarrays_readme.jl, in expression starting on line 3

davidbp, Apr 12 '18 09:04

That's hopefully for the CPU OpenCL implementation, isn't it?

SimonDanisch, Apr 12 '18 09:04

Yes, it is the OpenCL CPU device. This tiny test shows that there is no problem on the GPU. Nevertheless, it seems that the problem is the syntax for creating an array of zeros with type CLArray{Float32}.

julia> CLArrays.init(CLArrays.devices()[2])
OpenCL context with:
CL version: OpenCL 1.2 
Device: CL Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz
            threads: 1024
             blocks: (1024, 1, 1)
      global_memory: 68719.476736 mb
 free_global_memory: NaN mb
       local_memory: 0.032768 mb


julia> x = zeros(CLArray{Float32}, 5, 5)
ERROR: CLError(code=-52, CL_INVALID_KERNEL_ARGS)
Stacktrace:
 [1] macro expansion at /Users/davidbuchaca1/.julia/v0.6/OpenCL/src/macros.jl:6 [inlined]
 [2] (::CLArrays.CLFunction{GPUArrays.#const_kernel2,Tuple{CLArrays.KernelState,CLArrays.CLArray{Float32,2},Float32,UInt32},Tuple{(2, :ptr)}})(::Tuple{CLArrays.KernelState,CLArrays.CLArray{Float32,2},Float32,UInt32}, ::Tuple{Int64}, ::Tuple{Int64}, ::OpenCL.cl.CmdQueue) at /Users/davidbuchaca1/.julia/v0.6/CLArrays/src/compilation.jl:288
 [3] (::CLArrays.CLFunction{GPUArrays.#const_kernel2,Tuple{CLArrays.KernelState,CLArrays.CLArray{Float32,2},Float32,UInt32},Tuple{(2, :ptr)}})(::Tuple{CLArrays.KernelState,CLArrays.CLArray{Float32,2},Float32,UInt32}, ::Tuple{Int64}, ::Tuple{Int64}) at /Users/davidbuchaca1/.julia/v0.6/CLArrays/src/compilation.jl:272
 [4] _gpu_call(::Function, ::CLArrays.CLArray{Float32,2}, ::Tuple{CLArrays.CLArray{Float32,2},Float32,UInt32}, ::Tuple{Tuple{Int64},Tuple{Int64}}) at /Users/davidbuchaca1/.julia/v0.6/CLArrays/src/compilation.jl:18
 [5] gpu_call(::Function, ::CLArrays.CLArray{Float32,2}, ::Tuple{CLArrays.CLArray{Float32,2},Float32,UInt32}, ::Int64) at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/abstract_gpu_interface.jl:151
 [6] fill!(::CLArrays.CLArray{Float32,2}, ::Float32) at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/construction.jl:14
 [7] zeros(::Type{CLArrays.CLArray{Float32,N} where N}, ::Tuple{Int64,Int64}) at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/construction.jl:18
 [8] zeros(::Type{T} where T, ::Int64, ::Int64) at ./array.jl:265

julia> CLArrays.init(CLArrays.devices()[1])
OpenCL context with:
CL version: OpenCL 1.2 
Device: CL AMD Radeon Pro 580 Compute Engine
            threads: 256
             blocks: (256, 256, 256)
      global_memory: 8589.934592 mb
 free_global_memory: NaN mb
       local_memory: 0.032768 mb


julia> x = zeros(CLArray{Float32}, 5, 5)
GPU: 5×5 Array{Float32,2}:
 0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0

Maybe you could update the README with this code, which does not break:

julia> using CLArrays
julia> for dev in CLArrays.devices()
           CLArrays.init(dev)
           println("Current dev:", dev)
           x = CLArray(zeros(Float32, 5, 5)) # create a CLArray on device `dev`
       end
Current dev:OpenCL.Device(AMD Radeon Pro 580 Compute Engine on Apple @0x0000000001021c00)
Current dev:OpenCL.Device(Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz on Apple @0x00000000ffffffff)

julia> # you can also filter with is_gpu, is_cpu
       gpu_devices = CLArrays.devices(is_gpu)
1-element Array{OpenCL.cl.Device,1}:
 OpenCL.Device(AMD Radeon Pro 580 Compute Engine on Apple @0x0000000001021c00)

It seems that zeros(CLArray{Float32}, 5, 5) is the problem. If you replace it with CLArray(zeros(Float32, 5, 5)), it works fine.

davidbp, Apr 12 '18 09:04

This nevertheless does not give us any hint about what is going on in the code at the top of the issue. Maybe it has something to do with memory management.

Is there any way to check how much memory CLArrays has allocated on the GPU at a given time?

This would be useful for freeing memory manually.

davidbp, Apr 12 '18 09:04

The CPU drivers are pretty buggy, which is why I've given up on supporting them for now... maybe I should add that to the README ;) I don't think this is a memory issue... it's very likely just a driver difference... Apple + AMD is exotic enough.

SimonDanisch, Apr 12 '18 09:04
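
As a rough sanity check on the memory question, the footprint of the matrices in the script at the top of the issue can be estimated directly. This is only a sketch: the helper name `matrix_bytes` is made up for illustration, the 8589934592-byte figure is the `global_memory` value printed by `CLArrays.init` earlier in the thread, and the estimate counts only the input and the product, not any temporary buffers the driver or the matrix-multiplication backend may allocate.

```julia
# Hypothetical helper (not part of CLArrays): bytes needed for a dense s×s matrix.
matrix_bytes(s, T) = s * s * sizeof(T)

# global_memory reported by CLArrays.init for the Radeon Pro 580
# (8589.934592 mb in the log, i.e. 8589934592 bytes).
global_memory_bytes = 8589934592

for s in [100, 500, 1000]
    # Xcl and aux1 = Xcl * Xcl are the two device arrays visible in the script.
    needed = 2 * matrix_bytes(s, Float32)
    println("s = ", s, ": ", needed / 1e6, " MB, fraction of global memory: ",
            needed / global_memory_bytes)
end
```

Even for s = 1000 the two arrays take about 8 MB, well under 1 % of the reported global memory, which is consistent with the suggestion that the failure is not simply the device running out of memory. It does not rule out other memory-related causes.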

Yeah, just state that the package is designed for GPUs (and maybe don't even list the CPU in CLArrays.devices, since it is not meant to be used with CLArrays anyway).

About the driver... I have no idea. I only know Apple is currently using OpenCL 1.2. CLArrays uses OpenCL.jl, so if that package is based on OpenCL 2.0, there might be driver problems.

davidbp, Apr 12 '18 09:04

If I try it with sizes = [100,110,120], it works:

Device:OpenCL.Device(AMD Radeon Pro 580 Compute Engine on Apple @0x0000000001021c00)

result of the sum: 249076.69
result of the sum: 330530.9
result of the sum: 430043.97

Once I go to bigger sizes it doesn't work anymore.

davidbp, Apr 12 '18 10:04
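
One way to judge whether the printed sums are plausible, even when the error message is interleaved with them, is to compare them against the expected value for uniform random input. This is a sketch under the assumption that `rand(Float32, s, s)` draws independent entries uniformly from [0, 1): each of the s^3 products in `sum(X * X)` then has mean 1/4, except the s terms where an entry is multiplied by itself, which have mean 1/3. The helper name `expected_sum` is made up for illustration, and reading the interleaved output for s = 500 as 3.1199882e7 is an interpretation of the log, not something the thread states.

```julia
# Hypothetical helper: expected value of sum(X * X) for an s×s matrix X
# with independent entries uniform on [0, 1).
expected_sum(s) = (s^3 - s) / 4 + s / 3

# Sums reported in this thread, keyed by matrix size.
# The value for s = 500 is pieced together from the interleaved output
# ("3." ... "1199882e7"), which is an assumption about how to read the log.
reported = [(100, 249076.69), (110, 330530.9), (120, 430043.97),
            (500, 3.1199882e7), (1000, 2.5038744e8)]

for (s, value) in reported
    println("s = ", s, ": reported ", value, ", expected about ", expected_sum(s),
            ", ratio ", value / expected_sum(s))
end
```

All five ratios are within about 1 % of 1, which suggests that the multiplications themselves produced sensible results even in the runs where the context error was printed. It does not explain the error itself.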

Weird... can you run the test suite, so we have a baseline of what's working in general?

SimonDanisch, Apr 12 '18 13:04

Here is the output of Pkg.test() (it does not pass):

julia> Pkg.test("CLArrays")
INFO: Testing CLArrays
Test Summary:                                                                         | Pass  TotalERROR (unhandled task failure): OpenCL Error: OpenCL.Context error: 
Stacktrace:
 [1] raise_context_error(::String, ::String) at /Users/davidbuchaca1/.julia/v0.6/OpenCL/src/context.jl:109
 [2] macro expansion at /Users/davidbuchaca1/.julia/v0.6/OpenCL/src/context.jl:148 [inlined]
 [3] (::OpenCL.cl.##43#44)() at ./task.jl:335

Device: OpenCL.Device(AMD Radeon Pro 580 Compute Engine on Apple @0x0000000001021c00) |  745    745
mapidx: Error During Test
  Got an exception of type OpenCL.cl.CLError outside of a @testERROR (unhandled task failure): OpenCL Error: OpenCL.Context error: 
Stacktrace:
 [1] raise_context_error(::String, ::String) at /Users/davidbuchaca1/.julia/v0.6/OpenCL/src/context.jl:109
 [2] macro expansion at /Users/davidbuchaca1/.julia/v0.6/OpenCL/src/context.jl:148 [inlined]
 [3] (::OpenCL.cl.##43#44)() at ./task.jl:335

  CLError(code=-52, CL_INVALID_KERNEL_ARGS)
  Stacktrace:
   [1] macro expansion at /Users/davidbuchaca1/.julia/v0.6/OpenCL/src/macros.jl:6 [inlined]
   [2] (::CLArrays.CLFunction{GPUArrays.#mapidx_kernel,Tuple{CLArrays.KernelState,GPUArrays.TestSuite.##97#106,CLArrays.CLArray{Complex{Float32},1},Tuple{CLArrays.CLArray{Complex{Float32},1},UInt32,UInt32}},Tuple{(3, :ptr),(4, 1, :ptr)}})(::Tuple{CLArrays.KernelState,GPUArrays.TestSuite.##97#106,CLArrays.CLArray{Complex{Float32},1},Tuple{CLArrays.CLArray{Complex{Float32},1},UInt32,UInt32}}, ::Tuple{Int64}, ::Tuple{Int64}, ::OpenCL.cl.CmdQueue) at /Users/davidbuchaca1/.julia/v0.6/CLArrays/src/compilation.jl:288
   [3] (::CLArrays.CLFunction{GPUArrays.#mapidx_kernel,Tuple{CLArrays.KernelState,GPUArrays.TestSuite.##97#106,CLArrays.CLArray{Complex{Float32},1},Tuple{CLArrays.CLArray{Complex{Float32},1},UInt32,UInt32}},Tuple{(3, :ptr),(4, 1, :ptr)}})(::Tuple{CLArrays.KernelState,GPUArrays.TestSuite.##97#106,CLArrays.CLArray{Complex{Float32},1},Tuple{CLArrays.CLArray{Complex{Float32},1},UInt32,UInt32}}, ::Tuple{Int64}, ::Tuple{Int64}) at /Users/davidbuchaca1/.julia/v0.6/CLArrays/src/compilation.jl:272
   [4] _gpu_call(::Function, ::CLArrays.CLArray{Complex{Float32},1}, ::Tuple{GPUArrays.TestSuite.##97#106,CLArrays.CLArray{Complex{Float32},1},Tuple{CLArrays.CLArray{Complex{Float32},1},UInt32,UInt32}}, ::Tuple{Tuple{Int64},Tuple{Int64}}) at /Users/davidbuchaca1/.julia/v0.6/CLArrays/src/compilation.jl:18
   [5] gpu_call(::Function, ::CLArrays.CLArray{Complex{Float32},1}, ::Tuple{GPUArrays.TestSuite.##97#106,CLArrays.CLArray{Complex{Float32},1},Tuple{CLArrays.CLArray{Complex{Float32},1},UInt32,UInt32}}, ::Int64) at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/abstract_gpu_interface.jl:151
   [6] mapidx(::Function, ::CLArrays.CLArray{Complex{Float32},1}, ::Tuple{CLArrays.CLArray{Complex{Float32},1},UInt32,UInt32}) at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/broadcast.jl:194
   [7] macro expansion at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/testsuite/base.jl:41 [inlined]
   [8] macro expansion at ./test.jl:860 [inlined]
   [9] macro expansion at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/testsuite/base.jl:35 [inlined]
   [10] macro expansion at ./test.jl:860 [inlined]
   [11] run_base(::Type{T} where T) at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/testsuite/base.jl:34
   [12] run_tests(::Type{T} where T) at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/testsuite/testsuite.jl:56
   [13] macro expansion at /Users/davidbuchaca1/.julia/v0.6/CLArrays/test/runtests.jl:8 [inlined]
   [14] macro expansion at ./test.jl:860 [inlined]
   [15] macro expansion at /Users/davidbuchaca1/.julia/v0.6/CLArrays/test/runtests.jl:6 [inlined]
   [16] anonymous at ./<missing>:?
   [17] include_from_node1(::String) at /Applications/Julia-0.6.app/Contents/Resources/julia/lib/julia/sys.dylib:?
   [18] include(::String) at /Applications/Julia-0.6.app/Contents/Resources/julia/lib/julia/sys.dylib:?
   [19] process_options(::Base.JLOptions) at /Applications/Julia-0.6.app/Contents/Resources/julia/lib/julia/sys.dylib:?
   [20] _start() at /Applications/Julia-0.6.app/Contents/Resources/julia/lib/julia/sys.dylib:?
copy!: Error During Test
  Got an exception of type OpenCL.cl.CLError outside of a @test
  CLError(code=-52, CL_INVALID_KERNEL_ARGS)
  Stacktrace:
   [1] macro expansion at /Users/davidbuchaca1/.julia/v0.6/OpenCL/src/macros.jl:6 [inlined]
   [2] (::CLArrays.CLFunction{GPUArrays.#copy_kernel!,Tuple{CLArrays.KernelState,CLArrays.CLArray{Float32,2},Tuple{UInt32,UInt32},CLArrays.CLArray{Float32,2},Tuple{UInt32,UInt32},Tuple{UInt32,UInt32},Tuple{UInt32,UInt32},Tuple{UInt32,UInt32},UInt32},Tuple{(2, :ptr),(4, :ptr)}})(::Tuple{CLArrays.KernelState,CLArrays.CLArray{Float32,2},Tuple{UInt32,UInt32},CLArrays.CLArray{Float32,2},Tuple{UInt32,UInt32},Tuple{UInt32,UInt32},Tuple{UInt32,UInt32},Tuple{UInt32,UInt32},UInt32}, ::Tuple{Int64}, ::Tuple{Int64}, ::OpenCL.cl.CmdQueue) at /Users/davidbuchaca1/.julia/v0.6/CLArrays/src/compilation.jl:288
   [3] (::CLArrays.CLFunction{GPUArrays.#copy_kernel!,Tuple{CLArrays.KernelState,CLArrays.CLArray{Float32,2},Tuple{UInt32,UInt32},CLArrays.CLArray{Float32,2},Tuple{UInt32,UInt32},Tuple{UInt32,UInt32},Tuple{UInt32,UInt32},Tuple{UInt32,UInt32},UInt32},Tuple{(2, :ptr),(4, :ptr)}})(::Tuple{CLArrays.KernelState,CLArrays.CLArray{Float32,2},Tuple{UInt32,UInt32},CLArrays.CLArray{Float32,2},Tuple{UInt32,UInt32},Tuple{UInt32,UInt32},Tuple{UInt32,UInt32},Tuple{UInt32,UInt32},UInt32}, ::Tuple{Int64}, ::Tuple{Int64}) at /Users/davidbuchaca1/.julia/v0.6/CLArrays/src/compilation.jl:272
   [4] _gpu_call(::Function, ::CLArrays.CLArray{Float32,2}, ::Tuple{CLArrays.CLArray{Float32,2},Tuple{UInt32,UInt32},CLArrays.CLArray{Float32,2},Tuple{UInt32,UInt32},Tuple{UInt32,UInt32},Tuple{UInt32,UInt32},Tuple{UInt32,UInt32},UInt32}, ::Tuple{Tuple{Int64},Tuple{Int64}}) at /Users/davidbuchaca1/.julia/v0.6/CLArrays/src/compilation.jl:18
   [5] gpu_call(::Function, ::CLArrays.CLArray{Float32,2}, ::Tuple{CLArrays.CLArray{Float32,2},Tuple{UInt32,UInt32},CLArrays.CLArray{Float32,2},Tuple{UInt32,UInt32},Tuple{UInt32,UInt32},Tuple{UInt32,UInt32},Tuple{UInt32,UInt32},UInt32}, ::Int64) at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/abstract_gpu_interface.jl:151
   [6] copy!(::CLArrays.CLArray{Float32,2}, ::CartesianRange{CartesianIndex{2}}, ::CLArrays.CLArray{Float32,2}, ::CartesianRange{CartesianIndex{2}}) at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/abstractarray.jl:149
   [7] macro expansion at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/testsuite/base.jl:63 [inlined]
   [8] macro expansion at ./test.jl:860 [inlined]
   [9] macro expansion at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/testsuite/base.jl:55 [inlined]
   [10] macro expansion at ./test.jl:860 [inlined]
   [11] run_base(::Type{T} where T) at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/testsuite/base.jl:34
   [12] run_tests(::Type{T} where T) at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/testsuite/testsuite.jl:56
   [13] macro expansion at /Users/davidbuchaca1/.julia/v0.6/CLArrays/test/runtests.jl:8 [inlined]
   [14] macro expansion at ./test.jl:860 [inlined]
   [15] macro expansion at /Users/davidbuchaca1/.julia/v0.6/CLArrays/test/runtests.jl:6 [inlined]
   [16] anonymous at ./<missing>:?
   [17] include_from_node1(::String) at /Applications/Julia-0.6.app/Contents/Resources/julia/lib/julia/sys.dylib:?
   [18] include(::String) at /Applications/Julia-0.6.app/Contents/Resources/julia/lib/julia/sys.dylib:?
   [19] process_options(::Base.JLOptions) at /Applications/Julia-0.6.app/Contents/Resources/julia/lib/julia/sys.dylib:?
   [20] _start() at /Applications/Julia-0.6.app/Contents/Resources/julia/lib/julia/sys.dylib:?
cartesian iteration: Error During Test
  Got an exception of type OpenCL.cl.CLError outside of a @test
  CLError(code=-52, CL_INVALID_KERNEL_ARGS)
  Stacktrace:
   [1] macro expansion at /Users/davidbuchaca1/.julia/v0.6/OpenCL/src/macros.jl:6 [inlined]
   [2] (::CLArrays.CLFunction{GPUArrays.#const_kernel2,Tuple{CLArrays.KernelState,CLArrays.CLArray{Float32,2},Float32,UInt32},Tuple{(2, :ptr)}})(::Tuple{CLArrays.KernelState,CLArrays.CLArray{Float32,2},Float32,UInt32}, ::Tuple{Int64}, ::Tuple{Int64}, ::OpenCL.cl.CmdQueue) at /Users/davidbuchaca1/.julia/v0.6/CLArrays/src/compilation.jl:288
   [3] (::CLArrays.CLFunction{GPUArrays.#const_kernel2,Tuple{CLArrays.KernelState,CLArrays.CLArray{Float32,2},Float32,UInt32},Tuple{(2, :ptr)}})(::Tuple{CLArrays.KernelState,CLArrays.CLArray{Float32,2},Float32,UInt32}, ::Tuple{Int64}, ::Tuple{Int64}) at /Users/davidbuchaca1/.julia/v0.6/CLArrays/src/compilation.jl:272
   [4] _gpu_call(::Function, ::CLArrays.CLArray{Float32,2}, ::Tuple{CLArrays.CLArray{Float32,2},Float32,UInt32}, ::Tuple{Tuple{Int64},Tuple{Int64}}) at /Users/davidbuchaca1/.julia/v0.6/CLArrays/src/compilation.jl:18
   [5] gpu_call(::Function, ::CLArrays.CLArray{Float32,2}, ::Tuple{CLArrays.CLArray{Float32,2},Float32,UInt32}, ::Int64) at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/abstract_gpu_interface.jl:151
   [6] fill!(::CLArrays.CLArray{Float32,2}, ::Float32) at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/construction.jl:14
   [7] zeros(::CLArrays.CLArray{Float32,2}) at ./array.jl:261
   [8] macro expansion at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/testsuite/base.jl:118 [inlined]
   [9] macro expansion at ./test.jl:860 [inlined]
   [10] macro expansion at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/testsuite/base.jl:115 [inlined]
   [11] macro expansion at ./test.jl:860 [inlined]
   [12] run_base(::Type{T} where T) at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/testsuite/base.jl:34
   [13] run_tests(::Type{T} where T) at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/testsuite/testsuite.jl:56
   [14] macro expansion at /Users/davidbuchaca1/.julia/v0.6/CLArrays/test/runtests.jl:8 [inlined]
   [15] macro expansion at ./test.jl:860 [inlined]
   [16] macro expansion at /Users/davidbuchaca1/.julia/v0.6/CLArrays/test/runtests.jl:6 [inlined]
   [17] anonymous at ./<missing>:?
   [18] include_from_node1(::String) at /Applications/Julia-0.6.app/Contents/Resources/julia/lib/julia/sys.dylib:?
   [19] include(::String) at /Applications/Julia-0.6.app/Contents/Resources/julia/lib/julia/sys.dylib:?
   [20] process_options(::Base.JLOptions) at /Applications/Julia-0.6.app/Contents/Resources/julia/lib/julia/sys.dylib:?
   [21] _start() at /Applications/Julia-0.6.app/Contents/Resources/julia/lib/julia/sys.dylib:?
map: Error During Test
  Got an exception of type OpenCL.cl.CLError outside of a @test
  CLError(code=-52, CL_INVALID_KERNEL_ARGS)
  Stacktrace:
   [1] macro expansion at /Users/davidbuchaca1/.julia/v0.6/OpenCL/src/macros.jl:6 [inlined]
   [2] (::CLArrays.CLFunction{GPUArrays.#broadcast_kernel!,Tuple{CLArrays.KernelState,Base.#+,CLArrays.CLArray{Float32,1},Tuple{UInt32},Tuple{GPUArrays.BInfo{Array,1},GPUArrays.BInfo{Array,1}},Tuple{CLArrays.CLArray{Float32,1},CLArrays.CLArray{Float32,1}}},Tuple{(3, :ptr),(6, 1, :ptr),(6, 2, :ptr)}})(::Tuple{CLArrays.KernelState,Base.#+,CLArrays.CLArray{Float32,1},Tuple{UInt32},Tuple{GPUArrays.BInfo{Array,1},GPUArrays.BInfo{Array,1}},Tuple{CLArrays.CLArray{Float32,1},CLArrays.CLArray{Float32,1}}}, ::Tuple{Int64}, ::Tuple{Int64}, ::OpenCL.cl.CmdQueue) at /Users/davidbuchaca1/.julia/v0.6/CLArrays/src/compilation.jl:288
   [3] (::CLArrays.CLFunction{GPUArrays.#broadcast_kernel!,Tuple{CLArrays.KernelState,Base.#+,CLArrays.CLArray{Float32,1},Tuple{UInt32},Tuple{GPUArrays.BInfo{Array,1},GPUArrays.BInfo{Array,1}},Tuple{CLArrays.CLArray{Float32,1},CLArrays.CLArray{Float32,1}}},Tuple{(3, :ptr),(6, 1, :ptr),(6, 2, :ptr)}})(::Tuple{CLArrays.KernelState,Base.#+,CLArrays.CLArray{Float32,1},Tuple{UInt32},Tuple{GPUArrays.BInfo{Array,1},GPUArrays.BInfo{Array,1}},Tuple{CLArrays.CLArray{Float32,1},CLArrays.CLArray{Float32,1}}}, ::Tuple{Int64}, ::Tuple{Int64}) at /Users/davidbuchaca1/.julia/v0.6/CLArrays/src/compilation.jl:272
   [4] _gpu_call(::Function, ::CLArrays.CLArray{Float32,1}, ::Tuple{Base.#+,CLArrays.CLArray{Float32,1},Tuple{UInt32},Tuple{GPUArrays.BInfo{Array,1},GPUArrays.BInfo{Array,1}},Tuple{CLArrays.CLArray{Float32,1},CLArrays.CLArray{Float32,1}}}, ::Tuple{Tuple{Int64},Tuple{Int64}}) at /Users/davidbuchaca1/.julia/v0.6/CLArrays/src/compilation.jl:18
   [5] gpu_call(::Function, ::CLArrays.CLArray{Float32,1}, ::Tuple{Base.#+,CLArrays.CLArray{Float32,1},Tuple{UInt32},Tuple{GPUArrays.BInfo{Array,1},GPUArrays.BInfo{Array,1}},Tuple{CLArrays.CLArray{Float32,1},CLArrays.CLArray{Float32,1}}}, ::Int64) at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/abstract_gpu_interface.jl:151
   [6] _broadcast!(::Function, ::CLArrays.CLArray{Float32,1}, ::Tuple{Tuple{Bool},Tuple{Bool}}, ::Tuple{Tuple{Int64},Tuple{Int64}}, ::CLArrays.CLArray{Float32,1}, ::Tuple{CLArrays.CLArray{Float32,1}}, ::Type{Val{0}}, ::CartesianRange{CartesianIndex{1}}) at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/broadcast.jl:89
   [7] broadcast_t(::Function, ::Type{Float32}, ::Tuple{Base.OneTo{Int64}}, ::CartesianRange{CartesianIndex{1}}, ::CLArrays.CLArray{Float32,1}, ::CLArrays.CLArray{Float32,1}) at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/broadcast.jl:58
   [8] broadcast_c at ./broadcast.jl:316 [inlined]
   [9] broadcast at ./broadcast.jl:455 [inlined]
   [10] map(::Function, ::CLArrays.CLArray{Float32,1}, ::CLArrays.CLArray{Float32,1}) at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/base.jl:12
   [11] (::GPUArrays.TestSuite.##99#108)(::CLArrays.CLArray{Float32,1}, ::CLArrays.CLArray{Float32,1}) at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/testsuite/base.jl:132
   [12] against_base(::Function, ::Type{T} where T, ::Tuple{Int64}, ::Vararg{Tuple{Int64},N} where N) at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/testsuite/testsuite.jl:28
   [13] macro expansion at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/testsuite/base.jl:132 [inlined]
   [14] macro expansion at ./test.jl:860 [inlined]
   [15] macro expansion at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/testsuite/base.jl:131 [inlined]
   [16] macro expansion at ./test.jl:860 [inlined]
   [17] run_base(::Type{T} where T) at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/testsuite/base.jl:34
   [18] run_tests(::Type{T} where T) at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/testsuite/testsuite.jl:56
   [19] macro expansion at /Users/davidbuchaca1/.julia/v0.6/CLArrays/test/runtests.jl:8 [inlined]
   [20] macro expansion at ./test.jl:860 [inlined]
   [21] macro expansion at /Users/davidbuchaca1/.julia/v0.6/CLArrays/test/runtests.jl:6 [inlined]
   [22] anonymous at ./<missing>:?
   [23] include_from_node1(::String) at /Applications/Julia-0.6.app/Contents/Resources/julia/lib/julia/sys.dylib:?
   [24] include(::String) at /Applications/Julia-0.6.app/Contents/Resources/julia/lib/julia/sys.dylib:?
   [25] process_options(::Base.JLOptions) at /Applications/Julia-0.6.app/Contents/Resources/julia/lib/julia/sys.dylib:?
   [26] _start() at /Applications/Julia-0.6.app/Contents/Resources/julia/lib/julia/sys.dylib:?

signal (11): Segmentation fault: 11
while loading /Users/davidbuchaca1/.julia/v0.6/CLArrays/test/runtests.jl, in expression starting on line 4
uv_async_send at /Users/osx/buildbot/slave/package_osx64/build/deps/srccache/libuv-d8ab1c6a33e77bf155facb54215dd8798e13825d/src/unix/async.c:62
ctx_notify_err at /Users/davidbuchaca1/.julia/v0.6/OpenCL/src/context.jl:100
unknown function (ip: 0x11b478257)
gclRegisterBlockKernelMap at /System/Library/Frameworks/OpenCL.framework/Versions/A/OpenCL (unknown line)
gclRegisterBlockKernelMap at /System/Library/Frameworks/OpenCL.framework/Versions/A/OpenCL (unknown line)
clBuildProgram at /System/Library/Frameworks/OpenCL.framework/Versions/A/OpenCL (unknown line)
#build!#113 at /Users/davidbuchaca1/.julia/v0.6/OpenCL/src/program.jl:90
unknown function (ip: 0x1265298e1)
#build! at ./<missing>:0
unknown function (ip: 0x1265291a6)
#19 at /Users/davidbuchaca1/.julia/v0.6/CLArrays/src/compilation.jl:257
get! at ./dict.jl:449
unknown function (ip: 0x126cbb4ad)
Type at /Users/davidbuchaca1/.julia/v0.6/CLArrays/src/compilation.jl:249
unknown function (ip: 0x126cbb04a)
_gpu_call at /Users/davidbuchaca1/.julia/v0.6/CLArrays/src/compilation.jl:15
unknown function (ip: 0x126cbac82)
gpu_call at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/abstract_gpu_interface.jl:151
unknown function (ip: 0x126cbaa65)
repmat at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/base.jl:106
#102 at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/testsuite/base.jl:138
unknown function (ip: 0x126cba622)
jl_apply at /Users/osx/buildbot/slave/package_osx64/build/src/./julia.h:1424 [inlined]
jl_f__apply at /Users/osx/buildbot/slave/package_osx64/build/src/builtins.c:426
against_base at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/testsuite/testsuite.jl:28
macro expansion at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/testsuite/base.jl:138 [inlined]
macro expansion at ./test.jl:860 [inlined]
macro expansion at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/testsuite/base.jl:137 [inlined]
macro expansion at ./test.jl:860 [inlined]
run_base at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/testsuite/base.jl:34
unknown function (ip: 0x1265718c2)
run_tests at /Users/davidbuchaca1/.julia/v0.6/GPUArrays/src/testsuite/testsuite.jl:56
unknown function (ip: 0x11b47d272)
macro expansion at /Users/davidbuchaca1/.julia/v0.6/CLArrays/test/runtests.jl:8 [inlined]
macro expansion at ./test.jl:860 [inlined]
macro expansion at /Users/davidbuchaca1/.julia/v0.6/CLArrays/test/runtests.jl:6 [inlined]
anonymous at ./<missing> (unknown line)
jl_call_fptr_internal at /Users/osx/buildbot/slave/package_osx64/build/src/./julia_internal.h:339 [inlined]
jl_call_method_internal at /Users/osx/buildbot/slave/package_osx64/build/src/./julia_internal.h:358 [inlined]
jl_toplevel_eval_flex at /Users/osx/buildbot/slave/package_osx64/build/src/toplevel.c:589
jl_parse_eval_all at /Users/osx/buildbot/slave/package_osx64/build/src/ast.c:873
jl_load at /Users/osx/buildbot/slave/package_osx64/build/src/toplevel.c:616 [inlined]
jl_load_ at /Users/osx/buildbot/slave/package_osx64/build/src/toplevel.c:623
include_from_node1 at /Applications/Julia-0.6.app/Contents/Resources/julia/lib/julia/sys.dylib (unknown line)
jlcall_include_from_node1_18721 at /Applications/Julia-0.6.app/Contents/Resources/julia/lib/julia/sys.dylib (unknown line)
include at /Applications/Julia-0.6.app/Contents/Resources/julia/lib/julia/sys.dylib (unknown line)
jlcall_include_1022 at /Applications/Julia-0.6.app/Contents/Resources/julia/lib/julia/sys.dylib (unknown line)
process_options at /Applications/Julia-0.6.app/Contents/Resources/julia/lib/julia/sys.dylib (unknown line)
_start at /Applications/Julia-0.6.app/Contents/Resources/julia/lib/julia/sys.dylib (unknown line)
jlcall__start_18960 at /Applications/Julia-0.6.app/Contents/Resources/julia/lib/julia/sys.dylib (unknown line)
true_main at /Applications/Julia-0.6.app/Contents/Resources/julia/bin/julia (unknown line)
main at /Applications/Julia-0.6.app/Contents/Resources/julia/bin/julia (unknown line)
Allocations: 105730610 (Pool: 105718792; Big: 11818); GC: 240
=====================================================================[ ERROR: CLArrays ]======================================================================

failed process: Process(`/Applications/Julia-0.6.app/Contents/Resources/julia/bin/julia -Cnative -J/Applications/Julia-0.6.app/Contents/Resources/julia/lib/julia/sys.dylib --compile=yes --depwarn=yes --check-bounds=yes --code-coverage=none --color=yes --compilecache=yes /Users/davidbuchaca1/.julia/v0.6/CLArrays/test/runtests.jl`, ProcessSignaled(11)) [0]

==============================================================================================================================================================
ERROR: CLArrays had test errors

davidbp, Apr 12 '18 19:04

I have also tested OpenCL, and its tests do not pass either:

julia> Pkg.test("OpenCL")
INFO: Testing OpenCL
Test Summary: | Pass  Total
layout        |    2      2
Test Summary:   | Pass  Total
OpenCL.Platform |   13     13
Test Summary:  | Pass  Total
OpenCL.Context |   41     41
Test Summary: | Pass  Total
OpenCL.Device |   81     81
WARNING: Platform Apple does not seem to suport out of order queues: 
CLError(code=-30, CL_INVALID_VALUE)
Test Summary:   | Pass  ERROR (unhandled task failure): OpenCL Error: OpenCL.Context error: 

Stacktrace:
 [1] raise_context_error(::String, ::String) at /Users/davidbuchaca1/.julia/v0.6/OpenCL/src/context.jl:109
 [2] macro expansion at /Users/davidbuchaca1/.julia/v0.6/OpenCL/src/context.jl:148 [inlined]
 [3] (::OpenCL.cl.##43#44)() at ./task.jl:335
Total
OpenCL.CmdQueue |   41     41
Test Summary: | Pass  Total
OpenCL.Minver |   15     15
Test Callback
Test Callback
Test Summary: | Pass  Total
OpenCL.Event  |   32     32
OpenCL.Program binaries: Test Failed
  Expression: prg2[:binaries] == binaries
   Evaluated: Dict{OpenCL.cl.Device,Array{UInt8,N} where N}() == Dict{OpenCL.cl.Device,Array{UInt8,N} where N}(Pair{OpenCL.cl.Device,Array{UInt8,N} where N}(OpenCL.Device(Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz on Apple @0x00000000ffffffff), UInt8[0x62, 0x70, 0x6c, 0x69, 0x73, 0x74, 0x30, 0x30, 0xd4, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x5f, 0x10, 0x0f, 0x63, 0x6c, 0x42, 0x69, 0x6e, 0x61, 0x72, 0x79, 0x56, 0x65, 0x72, 0x73, 0x69, 0x6f, 0x6e, 0x5c, 0x63, 0x6c, 0x42, 0x69, 0x6e, 0x61, 0x72, 0x79, 0x44, 0x61, 0x74, 0x61, 0x5f, 0x10, 0x11, 0x63, 0x6c, 0x50, 0x6c, 0x61, 0x74, 0x66, 0x6f, 0x72, 0x6d, 0x56, 0x65, 0x72, 0x73, 0x69, 0x6f, 0x6e, 0x5e, 0x63, 0x6c, 0x42, 0x69, 0x6e, 0x61, 0x72, 0x79, 0x44, 0x72, 0x69, 0x76, 0x65, 0x72, 0x11, 0x01, 0x02, 0x4f, 0x11, 0x10, 0x6c, 0xcf, 0xfa, 0xed, 0xfe, 0x07, 0x00, 0x00, 0x01, 0x03, 0x00, 0x00, 0x00, 0x08, 0x00, 0x00, 0x00, 0x06, 0x00, 0x00, 0x00, 0x30, 0x02, 0x00, 0x00, 0x85, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x19, 0x00, 0x00, 0x00, 0x38, 0x01, 0x00, 0x00, 0x5f, 0x5f, 0x54, 0x45, 0x58, 0x54, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x10, 0x00, ... many 0x00... 0x00, 0x44, 0x00, 0x53, 0x00, 0x56, 0x10, 0xc6, 0x10, 0xea, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x02, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x09, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x10, 0xee]))
Stacktrace:
 [1] macro expansion at /Users/davidbuchaca1/.julia/v0.6/OpenCL/test/test_program.jl:86 [inlined]
 [2] macro expansion at ./test.jl:860 [inlined]
 [3] macro expansion at /Users/davidbuchaca1/.julia/v0.6/OpenCL/test/test_program.jl:75 [inlined]
 [4] macro expansion at ./test.jl:860 [inlined]
 [5] anonymous at ./<missing>:?
OpenCL.Program binaries: Test Failed
  Expression: prg2[:binaries] == binaries
   Evaluated: Dict{OpenCL.cl.Device,Array{UInt8,N} where N}() == Dict{OpenCL.cl.Device,Array{UInt8,N} where N}(Pair{OpenCL.cl.Device,Array{UInt8,N} where N}(OpenCL.Device(AMD Radeon Pro 580 Compute Engine on Apple @0x0000000001021c00), UInt8[0x62, 0x70, 0x6c, 0x69, 0x73, 0x74, 0x30, 0x30, 0xd4, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x5f, 0x10, 0x0f, 0x63, 0x6c, 0x42, 0x69, 0x6e, 0x61, 0x72, 0x79, 0x56, 0x65, 0x72, 0x73, 0x69, 0x6f, 0x6e, 0x5c, 0x63, 0x6c, 0x42, 0x69, 0x6e, 0x61, 0x72, 0x79, 0x44, 0x61, 0x74, 0x61, 0x5f, 0x10, 0x11, 0x63, 0x6c, 0x50, 0x6c, 0x61, 0x74, 0x66, 0x6f, 0x72, 0x6d, 0x56, 0x65, 0x72, 0x73, 0x69, 0x6f, 0x6e, 0x5e, 0x63, 0x6c, 0x42, 0x69, 0x6e, 0x61, 0x72, 0x79, 0x44, 0x72, 0x69, 0x76, 0x65, 0x72, 0x10, 0x00, 0x4f, 0x11, 0x35, 0x5c, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,  .... may 0x00 ,.... 0xf6]))
Stacktrace:
 [1] macro expansion at /Users/davidbuchaca1/.julia/v0.6/OpenCL/test/test_program.jl:86 [inlined]
 [2] macro expansion at ./test.jl:860 [inlined]
 [3] macro expansion at /Users/davidbuchaca1/.julia/v0.6/OpenCL/test/test_program.jl:75 [inlined]
 [4] macro expansion at ./test.jl:860 [inlined]
 [5] anonymous at ./<missing>:?
Test Summary:                       | Pass  Fail  Total
OpenCL.Program                      |   42     2     44
  OpenCL.Program source constructor |    2            2
  OpenCL.Program info               |   16           16
  OpenCL.Program build              |    8            8
  OpenCL.Program source code        |    2            2
  OpenCL.Program binaries           |   14     2     16
ERROR: LoadError: LoadError: Some tests did not pass: 42 passed, 2 failed, 0 errored, 0 broken.
while loading /Users/davidbuchaca1/.julia/v0.6/OpenCL/test/test_program.jl, in expression starting on line 1
while loading /Users/davidbuchaca1/.julia/v0.6/OpenCL/test/runtests.jl, in expression starting on line 30
======================================================================[ ERROR: OpenCL ]=======================================================================

failed process: Process(`/Applications/Julia-0.6.app/Contents/Resources/julia/bin/julia -Cnative -J/Applications/Julia-0.6.app/Contents/Resources/julia/lib/julia/sys.dylib --compile=yes --depwarn=yes --check-bounds=yes --code-coverage=none --color=yes --compilecache=yes /Users/davidbuchaca1/.julia/v0.6/OpenCL/test/runtests.jl`, ProcessExited(1)) [1]

==============================================================================================================================================================

WARNING: unknown IssueReporter commit 5f686e74, metadata may be ahead of package cache
INFO: No packages to install, update or remove
ERROR: OpenCL had test errors

davidbp avatar Apr 12 '18 21:04 davidbp

@SimonDanisch any idea what is going on? Which operating system might be more suitable for OpenCL.jl and CLArrays? I am thinking of installing Ubuntu/elementary OS on my Mac as a dual-boot option until OSX support catches up with the rest.

davidbp avatar Apr 16 '18 10:04 davidbp

This might actually be a problem with AMD instead. The test failures of OpenCL.jl don't look that grave. Can you try running only the GPU tests?



using CLArrays, CLArrays.Shorthands
using GPUArrays.TestSuite, Base.Test

for dev in CLArrays.devices(is_gpu)
    CLArrays.init(dev)
    @testset "Device: $dev" begin

        TestSuite.run_tests(CLArray)

        @testset "muladd & abs" begin
            a = rand(Float32, 32) - 0.5f0
            A = CLArray(a)
            x = abs.(A)
            @test Array(x) == abs.(a)
            y = muladd.(A, 2f0, x)
            @test Array(y) == muladd(a, 2f0, abs.(a))
            ###########
            # issue #20
            against_base(a-> abs.(a), CLArray{Float32}, (10, 10))

            #### bools in kernel:
        end
        @testset "bools" begin
            A, B = rand(Bool, 10), rand(Bool, 10)
            Ag, Bg = CLArray(A), CLArray(B)
            res = A .& B
            resg = Ag .& Bg
            @test res == Array(resg)
            # this version needs to have a fix in GPUArrays, since it uses T.(array)
            # in copy to convert to array type, but that actually convert Array{Bool} to BitArray
            # against_base((a, b)-> a .& b, CLArray{Bool}, (10,), (10,))
        end

        @testset "Shorthand Test" begin
            GPUArrays.allowslow(true)
            @test collect(cl([1,2])) == [1,2]
            @test collect(cl([1 2;3 4])) == [1 2;3 4]
            @test cl([1,2,3]) == CLArray([1,2,3])
        end
    end
end

SimonDanisch avatar Apr 16 '18 14:04 SimonDanisch

In other words, switching OS might not help you that much. I had huge problems with GPU support on Ubuntu, especially with AMD. Windows is the only rock-solid platform for GPUs... But ironically, a lot of frameworks, including our CUDAnative.jl, don't support Windows that well (on 0.6, CUDAnative needs a Julia source build; 0.7 actually works fine). The good news: this is likely just a small mix-up in the driver and we should be able to figure it out. We just need to isolate what's actually going wrong!

SimonDanisch avatar Apr 16 '18 14:04 SimonDanisch

Thank you for your help!

The output of the code you posted is this:

Test Summary:                                                                         | Pass  TotalERROR (unhandled task failure): OpenCL Error: OpenCL.Context error: 
Stacktrace:
 [1] raise_context_error(::String, ::String) at /Users/davidbuchaca1/.julia/v0.6/OpenCL/src/context.jl:109
 [2] macro expansion at /Users/davidbuchaca1/.julia/v0.6/OpenCL/src/context.jl:148 [inlined]
 [3] (::OpenCL.cl.##43#44)() at ./task.jl:335

Device: OpenCL.Device(AMD Radeon Pro 580 Compute Engine on Apple @0x0000000001021c00) |  745    745

davidbp avatar Apr 16 '18 17:04 davidbp

"745 745": I guess those are the tests that pass? If so, that means most of the tests do actually pass :)

I should really try to fix this OpenCL task error to finally get the real error message. I've tried that a couple of times, but since I was never able to reproduce it, I never fixed it 100%.

Maybe it's really just a problem with the launch parameters being too big. Memory shouldn't be a problem, since you seem to have 8 GB of VRAM.

SimonDanisch avatar Apr 16 '18 19:04 SimonDanisch

Is there any way to see the available memory at a given time? I can confirm that the GPU has 8 GB of RAM. Let me know if you want me to do more tests. I really want CLArrays to be pushed further :) and have more users.

davidbp avatar Apr 16 '18 19:04 davidbp

Thank you! Can you check that Pkg.status("OpenCL") is at least v0.6.1? I think that was my last attempt at fixing that task error.

I really don't think it's a memory problem! I tried to find a way to generically query the free memory, but for OpenCL it seems to be available only via direct calls to the vendor's driver, so it's kind of complicated :( That's why this printout says NaN:

julia> CLArrays.init(CLArrays.devices()[1])
OpenCL context with:
CL version: OpenCL 1.2 
Device: CL AMD Radeon Pro 580 Compute Engine
            threads: 256
             blocks: (256, 256, 256)
      global_memory: 8589.934592 mb
 free_global_memory: NaN mb
       local_memory: 0.032768 mb

With NVIDIA GPUs I just call into the CUDA driver, which at least has a nice API for this, and it will show you the free memory.

SimonDanisch avatar Apr 16 '18 19:04 SimonDanisch

It must be something else: I have 0.7.0.

 Pkg.status("OpenCL")
WARNING: unknown IssueReporter commit 5f686e74, metadata may be ahead of package cache
 - OpenCL                        0.7.0

As for the AMD drivers, we can always open an issue in one of the open-source repos at https://github.com/RadeonOpenCompute to get some help.

davidbp avatar Apr 16 '18 20:04 davidbp

Can you try this on the branch: Pkg.checkout("OpenCL", "sd/taskerror")

using CLArrays
CLArrays.init(CLArrays.devices(CLArrays.is_gpu)[1])
A = CLArray(rand(Float32, 1000, 1000))
sum(A)

SimonDanisch avatar Apr 16 '18 20:04 SimonDanisch

It seems Pkg.checkout("OpenCL", "sd/taskerror") is not working:

julia> Pkg.checkout("OpenCL", "sd/taskerror")
INFO: Checking out OpenCL sd/taskerror...
ERROR: GitError(Code:ERROR, Class:Merge, There is no tracking information for the current branch.)
Stacktrace:
 [1] (::Base.LibGit2.##117#125{Base.LibGit2.GitRepo})(::Base.LibGit2.GitReference) at ./libgit2/libgit2.jl:709
 [2] with(::Base.LibGit2.##117#125{Base.LibGit2.GitRepo}, ::Base.LibGit2.GitReference) at ./libgit2/types.jl:608
 [3] #merge!#109(::String, ::String, ::Bool, ::Base.LibGit2.MergeOptions, ::Base.LibGit2.CheckoutOptions, ::Function, ::Base.LibGit2.GitRepo) at ./libgit2/libgit2.jl:706
 [4] (::Base.#kw##merge!)(::Array{Any,1}, ::Base.#merge!, ::Base.LibGit2.GitRepo) at ./<missing>:0
 [5] (::Base.Pkg.Entry.##16#18{String,String,Bool,Bool})(::Base.LibGit2.GitRepo) at ./pkg/entry.jl:230
 [6] transact(::Base.Pkg.Entry.##16#18{String,String,Bool,Bool}, ::Base.LibGit2.GitRepo) at ./libgit2/libgit2.jl:882
 [7] with(::Base.Pkg.Entry.##15#17{String,String,Bool,Bool}, ::Base.LibGit2.GitRepo) at ./libgit2/types.jl:608
 [8] checkout(::String, ::String, ::Bool, ::Bool) at ./pkg/entry.jl:226
 [9] (::Base.Pkg.Dir.##4#7{Array{Any,1},Base.Pkg.Entry.#checkout,Tuple{String,String,Bool,Bool}})() at ./pkg/dir.jl:36
 [10] cd(::Base.Pkg.Dir.##4#7{Array{Any,1},Base.Pkg.Entry.#checkout,Tuple{String,String,Bool,Bool}}, ::String) at ./file.jl:70
 [11] #cd#1(::Array{Any,1}, ::Function, ::Function, ::String, ::Vararg{Any,N} where N) at ./pkg/dir.jl:36
 [12] #checkout#1(::Bool, ::Bool, ::Function, ::String, ::String) at ./pkg/pkg.jl:188
 [13] checkout(::String, ::String) at ./pkg/pkg.jl:188

davidbp avatar Apr 17 '18 06:04 davidbp

Can you use git to check it out manually?

SimonDanisch avatar Apr 17 '18 13:04 SimonDanisch

I went to the OpenCL folder, ran git fetch, and then Pkg.checkout("OpenCL", "sd/taskerror") worked:

iMac-de-David:OpenCL davidbuchaca1$ git fetch
remote: Counting objects: 4, done.
remote: Total 4 (delta 3), reused 3 (delta 3), pack-reused 1
Unpacking objects: 100% (4/4), done.
From https://github.com/JuliaGPU/OpenCL.jl
 * [new branch]      sd/taskerror -> origin/sd/taskerror

Nevertheless, the sum gave me this:

julia> sum(A)
OpenCL Error: | OpenCL Build Warning : Compiler build log:
<program source>:38:39: warning: no previous prototype for function 'x7DeviceArray_float_1___global1float1218_1'
DeviceArray_float_1___global1float121 x7DeviceArray_float_1___global1float1218_1(uint size, __global float *  ptr)
                                      ^
<program source>:47:39: warning: no previous prototype for function 'reconstruct_2'
DeviceArray_float_1___global1float121 reconstruct_2(DeviceArray_float_1_HostPtr_float x, __global float *  ptr)
                                      ^
<program source>:63:6: warning: no previous prototype for function 'linear_index_3'
uint linear_index_3(KernelState state)
     ^
<program source>:95:13: warning: no previous prototype for function 'argtail_4'
EmptyTuple_ argtail_4(uint x, EmptyTuple_ rest)
            ^
<program source>:102:13: warning: no previous prototype for function 'tail_5'
EmptyTuple_ tail_5(uint x)
            ^
<program source>:111:6: warning: no previous prototype for function '_sub2ind_6'
uint _sub2ind_6(EmptyTuple_ x, uint L, uint ind)
     ^
<program source>:118:6: warning: no previous prototype for function '_sub2ind_7'
uint _sub2ind_7(uint inds, uint L, uint ind, uint i, EmptyTuple_ I)
     ^
<program source>:129:6: warning: no previous prototype for function 'gpu_sub2ind_8'
uint gpu_sub2ind_8(uint dims, uint I)
     ^
<program source>:141:6: warning: no previous prototype for function 'size_9'
uint size_9(DeviceArray_float_1___global1float121 x)
     ^
<program source>:154:6: warning: no previous prototype for function 'map_10'
uint map_10(Type5UInt326 f, uint t)
     ^
<program source>:161:6: warning: no previous prototype for function 'broadcast_10'
uint broadcast_10(Type5UInt326 f, uint t, EmptyTuple_ ts)
     ^
<program source>:170:6: warning: no previous prototype for function 'setindex9_11'
void setindex9_11(DeviceArray_float_1___global1float121 x, float val, uint i)
     ^

 |
500247OpenCL Error: | OpenCL Build Warning : Compiler build log:
<program source>:56:39: warning: no previous prototype for function 'x7DeviceArray_float_2___global1float1218_13'
DeviceArray_float_2___global1float121 x7DeviceArray_float_2___global1float1218_13(uint2 size, __global float *  ptr)
                                      ^
<program source>:65:39: warning: no previous prototype for function 'reconstruct_14'
DeviceArray_float_2___global1float121 reconstruct_14(DeviceArray_float_2_HostPtr_float x, __global float *  ptr)
                                      ^
<program source>:79:39: warning: no previous prototype for function 'x7DeviceArray_float_1___global1float1218_1'
DeviceArray_float_1___global1float121 x7DeviceArray_float_1___global1float1218_1(uint size, __global float *  ptr)
                                      ^
<program source>:86:39: warning: no previous prototype for function 'reconstruct_2'
DeviceArray_float_1___global1float121 reconstruct_2(DeviceArray_float_1_HostPtr_float x, __global float *  ptr)
                                      ^
<program source>:97:38: warning: no previous prototype for function 'x7DeviceArray_float_1___local1float1218_15'
DeviceArray_float_1___local1float121 x7DeviceArray_float_1___local1float1218_15(long size, __local float *  ptr)
                                     ^
<program source>:104:38: warning: no previous prototype for function 'x7GPUArrays3AbstractDeviceArray8_16'
DeviceArray_float_1___local1float121 x7GPUArrays3AbstractDeviceArray8_16(__local float *  ptr, long shape)
                                     ^
<program source>:120:6: warning: no previous prototype for function 'linear_index_3'
uint linear_index_3(KernelState state)
     ^
<program source>:133:6: warning: no previous prototype for function 'prod_17'
uint prod_17(uint2 x)
     ^
<program source>:145:7: warning: no previous prototype for function 'size_18'
uint2 size_18(DeviceArray_float_2___global1float121 x)
      ^
<program source>:152:6: warning: no previous prototype for function 'length_18'
uint length_18(DeviceArray_float_2___global1float121 t)
     ^
<program source>:162:7: warning: no previous prototype for function 'identity_19'
float identity_19(float x)
      ^
<program source>:172:7: warning: no previous prototype for function 'getindex_20'
float getindex_20(DeviceArray_float_2___global1float121 x, uint ilin)
      ^
<program source>:182:6: warning: no previous prototype for function 'global_size_3'
uint global_size_3(KernelState state)
     ^
<program source>:192:6: warning: no previous prototype for function 'threadidx_x_3'
uint threadidx_x_3(KernelState x4unused4)
     ^
<program source>:220:13: warning: no previous prototype for function 'argtail_4'
EmptyTuple_ argtail_4(uint x, EmptyTuple_ rest)
            ^
<program source>:227:13: warning: no previous prototype for function 'tail_5'
EmptyTuple_ tail_5(uint x)
            ^
<program source>:236:6: warning: no previous prototype for function '_sub2ind_6'
uint _sub2ind_6(EmptyTuple_ x, uint L, uint ind)
     ^
<program source>:243:6: warning: no previous prototype for function '_sub2ind_7'
uint _sub2ind_7(uint inds, uint L, uint ind, uint i, EmptyTuple_ I)
     ^
<program source>:254:6: warning: no previous prototype for function 'gpu_sub2ind_8'
uint gpu_sub2ind_8(uint dims, uint I)
     ^
<program source>:263:6: warning: no previous prototype for function 'size_21'
uint size_21(DeviceArray_float_1___local1float121 x)
     ^
<program source>:276:6: warning: no previous prototype for function 'map_10'
uint map_10(Type5UInt326 f, uint t)
     ^
<program source>:283:6: warning: no previous prototype for function 'broadcast_10'
uint broadcast_10(Type5UInt326 f, uint t, EmptyTuple_ ts)
     ^
<program source>:292:6: warning: no previous prototype for function 'setindex9_22'
void setindex9_22(DeviceArray_float_1___local1float121 x, float val, uint i)
     ^
<program source>:307:6: warning: no previous prototype for function 'synchronize_threads_3'
void synchronize_threads_3(KernelState x4unused4)
     ^
<program source>:317:6: warning: no previous prototype for function 'blockdim_x_3'
uint blockdim_x_3(KernelState x4unused4)
     ^
<program source>:324:7: warning: no previous prototype for function 'getindex_23'
float getindex_23(DeviceArray_float_1___local1float121 x, uint i)
      ^
<program source>:333:6: warning: no previous prototype for function 'size_9'
uint size_9(DeviceArray_float_1___global1float121 x)
     ^
<program source>:340:6: warning: no previous prototype for function 'setindex9_11'
void setindex9_11(DeviceArray_float_1___global1float121 x, float val, uint i)
     ^
<program source>:355:6: warning: no previous prototype for function 'blockidx_x_3'
uint blockidx_x_3(KernelState x4unused4)
     ^

 |
.38f0

davidbp avatar Apr 17 '18 19:04 davidbp

So does it actually work now? I can only see warnings...

SimonDanisch avatar Apr 17 '18 19:04 SimonDanisch

After the warnings, calling the function again works:

julia> A_cpu = Array(A)
julia> sum(A_cpu)
499799.3f0
julia> sum(A)
499799.28f0

davidbp avatar Apr 17 '18 20:04 davidbp

This does not make Pkg.test("CLArrays") work properly (for the CPU). It prints a long list of output and finishes with:

Test Summary:                                                                                | Pass  Error  Total
Device: OpenCL.Device(Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz on Apple @0x00000000ffffffff) |  371    163    534
  parallel execution interface                                                               |    7             7
  base functionality                                                                         |    9      5     14
    mapidx                                                                                   |           1      1
    copy!                                                                                    |           1      1
    vcat + hcat                                                                              |    4             4
    reinterpret                                                                              |    2             2
    ntuple test                                                                              |    2             2
    cartesian iteration                                                                      |           1      1
    Custom kernel from Julia function                                                        |    1             1
    map                                                                                      |           1      1
    repmat                                                                                   |           1      1
  BLAS                                                                                       |    2      1      3
    matmul                                                                                   |           1      1
    scale! Complex{Float32}                                                                  |    1             1
    scale! Float32                                                                           |    1             1
  broadcast                                                                                  |   12     19     31
    broadcast Float32                                                                        |    2      3      5
      RefValue                                                                               |           1      1
      Tuple                                                                                  |           1      1
    broadcast Float64                                                                        |    2      3      5
      RefValue                                                                               |           1      1
      Tuple                                                                                  |           1      1
    broadcast Int32                                                                          |    2      3      5
      RefValue                                                                               |           1      1
      Tuple                                                                                  |           1      1
    broadcast Int64                                                                          |    2      3      5
      RefValue                                                                               |           1      1
      Tuple                                                                                  |           1      1
    broadcast Complex{Float32}                                                               |    2      3      5
      RefValue                                                                               |           1      1
      Tuple                                                                                  |           1      1
    broadcast Complex{Float64}                                                               |    2      3      5
      RefValue                                                                               |           1      1
      Tuple                                                                                  |           1      1
    vec 3                                                                                    |           1      1
  Construction                                                                               |  204      4    208
    similar + constructor                                                                    |  132           132
    conversion                                                                               |   72            72
    value constructor                                                                        |           1      1
    iterator constructors                                                                    |           3      3
  FFT with ND = 1                                                                            |           1      1
  FFT with ND = 2                                                                            |           1      1
  FFT with ND = 3                                                                            |           1      1
  Linalg                                                                                     |           2      2
    transpose                                                                                |           1      1
    PermuteDims                                                                              |           1      1
  mapreduce                                                                                  |   49    124    173
    Float32                                                                                  |    8     22     30
      mapreducedim                                                                           |          14     14
      sum maximum minimum prod                                                               |    8      8     16
    Float64                                                                                  |    8     22     30
      mapreducedim                                                                           |          14     14
      sum maximum minimum prod                                                               |    8      8     16
    Int32                                                                                    |    8     22     30
      mapreducedim                                                                           |          14     14
      sum maximum minimum prod                                                               |    8      8     16
    Int64                                                                                    |    8     22     30
      mapreducedim                                                                           |          14     14
      sum maximum minimum prod                                                               |    8      8     16
    Complex{Float32}                                                                         |    4     18     22
      mapreducedim                                                                           |          14     14
      sum maximum minimum prod                                                               |    4      4      8
    Complex{Float64}                                                                         |    4     18     22
      mapreducedim                                                                           |          14     14
      sum maximum minimum prod                                                               |    4      4      8
    any all                                                                                  |    9             9
  indexing                                                                                   |   80            80
  Random                                                                                     |    5      3      8
    rand                                                                                     |    5      3      8
  muladd & abs                                                                               |           1      1
  bools                                                                                      |           1      1
  Shorthand Test                                                                             |    3             3
ERROR: LoadError: Some tests did not pass: 371 passed, 0 failed, 163 errored, 0 broken.
while loading /Users/davidbuchaca1/.julia/v0.6/CLArrays/test/runtests.jl, in expression starting on line 4
=====================================================================[ ERROR: CLArrays ]======================================================================

failed process: Process(`/Applications/Julia-0.6.app/Contents/Resources/julia/bin/julia -Cnative -J/Applications/Julia-0.6.app/Contents/Resources/julia/lib/julia/sys.dylib --compile=yes --depwarn=yes --check-bounds=yes --code-coverage=none --color=yes --compilecache=yes /Users/davidbuchaca1/.julia/v0.6/CLArrays/test/runtests.jl`, ProcessExited(1)) [1]

==============================================================================================================================================================
ERROR: CLArrays had test errors

I guess this is expected, since I am on a test branch of OpenCL.jl. I hope this gives you a hint about what to update in the package. Let me know once you think this is solved, so I can check out the master branch and update OpenCL.jl from there.

Sorry... Seems all errors are related to the CPU!

Maybe having an argument to split the tests would be nice :) Pkg.test("CLArrays"; device_test= "GPU")
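
As far as I know, Pkg.test does not accept a keyword like device_test, so one way to get the same effect might be an environment variable that the test script reads. Below is only a minimal sketch of that idea: the variable name CLARRAYS_TEST_DEVICE and the helper select_devices are made up for illustration and are not part of CLArrays, and the device list is a stand-in so the snippet runs without OpenCL installed.

```julia
# Hypothetical sketch: pick the devices to test from an environment variable.
# `CLARRAYS_TEST_DEVICE` and `select_devices` are illustrative names only.

# Keep only the devices matching `kind` ("gpu", "cpu", or "all").
# `isgpu` is any predicate that returns true for GPU devices.
function select_devices(devices, kind, isgpu)
    k = lowercase(kind)
    if k == "gpu"
        return filter(isgpu, devices)
    elseif k == "cpu"
        return filter(d -> !isgpu(d), devices)
    elseif k == "all"
        return devices
    else
        error("unknown device kind: $kind (expected \"gpu\", \"cpu\", or \"all\")")
    end
end

# Stand-in devices (name => kind) so the sketch runs without any OpenCL driver.
devices = ["AMD Radeon Pro 580 Compute Engine" => :gpu,
           "Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz" => :cpu]
isgpu(d) = d.second == :gpu

# Default to testing everything when the variable is not set.
kind = get(ENV, "CLARRAYS_TEST_DEVICE", "all")
selected = select_devices(devices, kind, isgpu)
println("Testing on: ", [d.first for d in selected])
```

In the real test script, the stand-ins would presumably be replaced by CLArrays.devices() and CLArrays.is_gpu (both used earlier in this thread), and the tests could then be run with CLARRAYS_TEST_DEVICE=gpu set in the shell before calling Pkg.test("CLArrays").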

davidbp avatar Apr 17 '18 20:04 davidbp

OpenCL.Device(Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz

But that's the CPU OpenCL implementation, right?

SimonDanisch avatar Apr 17 '18 20:04 SimonDanisch

yes yes I updated my comment :)

davidbp avatar Apr 17 '18 20:04 davidbp

Maybe I should just not test the CPU anymore - at least as long as I don't have time to fix those issues and I know that half of the package is broken ;)

SimonDanisch avatar Apr 17 '18 20:04 SimonDanisch