OpenCL.jl

Segmentation fault in test/test_array on AMD CPU

Open vchuravy opened this issue 7 years ago • 18 comments

Noticed while preparing the v0.5.1 release. I am unable to catch this in rr, and it occurs on both Julia v0.5 and v0.6.

julia -g 2 -e 'using Base.Test; using OpenCL; include(Pkg.dir("OpenCL", "test", "test_array.jl"))'

cc: @dfdx since it is happening in a piece of code that you contributed. Any ideas?

signal (11): Segmentation fault
while loading /home/wallnuss/.julia/v0.6/OpenCL/test/test_array.jl, in expression starting on line 3
clSetKernelArg at /opt/AMDAPP/SDK/lib/sdk/libamdocl64.so (unknown line)
clSetKernelArg at /home/wallnuss/.julia/v0.6/OpenCL/src/api.jl:17
macro expansion at /home/wallnuss/.julia/v0.6/OpenCL/src/macros.jl:4 [inlined]
set_arg! at /home/wallnuss/.julia/v0.6/OpenCL/src/kernel.jl:76
unknown function (ip: 0x7f46cd180a60)
jl_call_method_internal at /home/wallnuss/src/julia/src/julia_internal.h:248 [inlined]
jl_apply_generic at /home/wallnuss/src/julia/src/gf.c:2212
set_args! at /home/wallnuss/.julia/v0.6/OpenCL/src/kernel.jl:106
#transpose!#145 at /home/wallnuss/.julia/v0.6/OpenCL/src/array.jl:115
unknown function (ip: 0x7f46cd1d6ed2)
jl_call_method_internal at /home/wallnuss/src/julia/src/julia_internal.h:248 [inlined]
jl_invoke at /home/wallnuss/src/julia/src/gf.c:41
transpose! at /home/wallnuss/.julia/v0.6/OpenCL/src/array.jl:109
unknown function (ip: 0x7f46cd1d6ac6)
jl_call_method_internal at /home/wallnuss/src/julia/src/julia_internal.h:248 [inlined]
jl_invoke at /home/wallnuss/src/julia/src/gf.c:41
macro expansion at /home/wallnuss/.julia/v0.6/OpenCL/test/test_array.jl:53 [inlined]
macro expansion at ./test.jl:853 [inlined]
macro expansion at /home/wallnuss/.julia/v0.6/OpenCL/test/test_array.jl:0 [inlined]
macro expansion at ./test.jl:853 [inlined]
anonymous at ./<missing> (unknown line)
jl_call_method_internal at /home/wallnuss/src/julia/src/julia_internal.h:248 [inlined]
jl_toplevel_eval_flex at /home/wallnuss/src/julia/src/toplevel.c:589
jl_parse_eval_all at /home/wallnuss/src/julia/src/ast.c:847
jl_load at /home/wallnuss/src/julia/src/toplevel.c:616
include_from_node1 at ./loading.jl:539
unknown function (ip: 0x7f46cd10c012)
jl_call_method_internal at /home/wallnuss/src/julia/src/julia_internal.h:248 [inlined]
jl_apply_generic at /home/wallnuss/src/julia/src/gf.c:2212
include at ./sysimg.jl:14
unknown function (ip: 0x7f48e6b8a3bb)
jl_call_method_internal at /home/wallnuss/src/julia/src/julia_internal.h:248 [inlined]
jl_apply_generic at /home/wallnuss/src/julia/src/gf.c:2212
do_call at /home/wallnuss/src/julia/src/interpreter.c:75
eval at /home/wallnuss/src/julia/src/interpreter.c:230
jl_interpret_toplevel_expr at /home/wallnuss/src/julia/src/interpreter.c:34
jl_toplevel_eval_flex at /home/wallnuss/src/julia/src/toplevel.c:577
jl_eval_module_expr at /home/wallnuss/src/julia/src/toplevel.c:203
jl_toplevel_eval_flex at /home/wallnuss/src/julia/src/toplevel.c:480
jl_parse_eval_all at /home/wallnuss/src/julia/src/ast.c:847
jl_load at /home/wallnuss/src/julia/src/toplevel.c:616
include_from_node1 at ./loading.jl:539
unknown function (ip: 0x7f48e6cd1dcb)
jl_call_method_internal at /home/wallnuss/src/julia/src/julia_internal.h:248 [inlined]
jl_apply_generic at /home/wallnuss/src/julia/src/gf.c:2212
include at ./sysimg.jl:14
unknown function (ip: 0x7f48e6b8a3bb)
jl_call_method_internal at /home/wallnuss/src/julia/src/julia_internal.h:248 [inlined]
jl_apply_generic at /home/wallnuss/src/julia/src/gf.c:2212
process_options at ./client.jl:305
_start at ./client.jl:371
unknown function (ip: 0x7f48e6cfdd88)
jl_call_method_internal at /home/wallnuss/src/julia/src/julia_internal.h:248 [inlined]
jl_apply_generic at /home/wallnuss/src/julia/src/gf.c:2212
jl_apply at /home/wallnuss/src/julia/ui/../src/julia.h:1410 [inlined]
true_main at /home/wallnuss/src/julia/ui/repl.c:127
main at /home/wallnuss/src/julia/ui/repl.c:264
__libc_start_main at /usr/lib/libc.so.6 (unknown line)
unknown function (ip: 0x401519)
Allocations: 21279480 (Pool: 21276613; Big: 2867); GC: 48

This is the AMD SDK for CPUs v3.0

clinfo:

 Platform Name                                   AMD Accelerated Parallel Processing
Number of devices                                 1
  Device Name                                     Intel(R) Xeon(R) CPU E5-2643 v3 @ 3.40GHz
  Device Vendor                                   GenuineIntel
  Device Vendor ID                                0x1002
  Device Version                                  OpenCL 1.2 AMD-APP (1800.8)
  Driver Version                                  1800.8 (sse2,avx)
  Device OpenCL C Version                         OpenCL C 1.2 
  Device Type                                     CPU
  Device Profile                                  FULL_PROFILE
  Device Board Name (AMD)                         
  Device Topology (AMD)                           (n/a)
  Max compute units                               12
  Max clock frequency                             1399MHz
  Device Partition                                (core, cl_ext_device_fission)
    Max number of sub-devices                     12
    Supported partition types                     equally, by counts, by affinity domain
    Supported affinity domains                    L3 cache, L2 cache, L1 cache, next partitionable
    Supported partition types (ext)               equally, by counts, by affinity domain
    Supported affinity domains (ext)              L3 cache, L2 cache, L1 cache, next fissionable
  Max work item dimensions                        3
  Max work item sizes                             1024x1024x1024
  Max work group size                             1024
  Preferred work group size multiple              1
  Preferred / native vector sizes                 
    char                                                16 / 16      
    short                                                8 / 8       
    int                                                  4 / 4       
    long                                                 2 / 2       
    half                                                 4 / 4        (n/a)
    float                                                8 / 8       
    double                                               4 / 4        (cl_khr_fp64)
  Half-precision Floating-point support           (n/a)
  Single-precision Floating-point support         (core)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  Yes
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Address bits                                    64, Little-Endian
  Global memory size                              67485077504 (62.85GiB)
  Error Correction support                        No
  Max memory allocation                           16871269376 (15.71GiB)
  Unified memory for Host and Device              Yes
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       1024 bits (128 bytes)
  Global Memory cache type                        Read/Write
  Global Memory cache size                        32768
  Global Memory cache line                        64 bytes
  Image support                                   Yes
    Max number of samplers per kernel             16
    Max size for 1D images from buffer            65536 pixels
    Max 1D or 2D image array size                 2048 images
    Max 2D image size                             8192x8192 pixels
    Max 3D image size                             2048x2048x2048 pixels
    Max number of read image args                 128
    Max number of write image args                64
  Local memory type                               Global
  Local memory size                               32768 (32KiB)
  Max constant buffer size                        65536 (64KiB)
  Max number of constant args                     8
  Max size of kernel argument                     4096 (4KiB)
  Queue properties                                
    Out-of-order execution                        No
    Profiling                                     Yes
  Prefer user sync for interop                    Yes
  Profiling timer resolution                      1ns
  Profiling timer offset since Epoch (AMD)        1487579561779808809ns (Mon Feb 20 17:32:41 2017)
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            Yes
    SPIR versions                                 1.2
  printf() buffer size                            65536 (64KiB)
  Built-in kernels                                
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Device Extensions                               cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_device_fission cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_spir cl_khr_gl_event 

Feb 20 '17 08:02 vchuravy

On Travis we are currently using AMD-APP-SDK-linux-v2.9-1.599.381-GA-x64.tar.bz2, if I remember correctly because of bugs like these...

Feb 20 '17 08:02 vchuravy

This might be related to this commented line. If I recall correctly, I copied most of the code for transpose from a C implementation where the trick with block_size + 1 came unexplained, so I just removed it. Could you please try to uncomment this line (and comment out the next one, of course) and try again?
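For context on the block_size + 1 trick: in tiled transpose kernels, padding each row of the local-memory tile by one element is commonly explained as a way to avoid bank conflicts when the tile is read column-wise. Whether that was the intent of the original C code is an assumption, and the bank model and the count of 32 banks below are illustrative, not properties of the AMD CPU device in this report. A small sketch of the arithmetic (Julia 1.x syntax):

```julia
# Illustrative model: local memory is split into `nbanks` banks, and the word at
# linear index i lives in bank i mod nbanks. Both the model and the default of
# 32 banks are assumptions made for this sketch.
# Returns the bank touched by each row when reading one column of a tile whose
# rows are `row_stride` words apart.
function column_banks(row_stride::Integer; nrows::Integer = 32, nbanks::Integer = 32, col::Integer = 0)
    return [mod(row * row_stride + col, nbanks) for row in 0:nrows-1]
end

unpadded = column_banks(32)   # tile declared as [32][32]
padded   = column_banks(33)   # tile declared as [32][32 + 1]

println("distinct banks, unpadded: ", length(unique(unpadded)))
println("distinct banks, padded:   ", length(unique(padded)))
```

Under this model the unpadded tile hits a single bank for the whole column, while the padded tile spreads the 32 rows over 32 distinct banks. None of this explains a crash inside clSetKernelArg; it only accounts for why the padding shows up in such kernels.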

Feb 20 '17 10:02 dfdx

Just did, and I am seeing the same behaviour. The segfault happens when setting the first argument of the kernel (https://github.com/JuliaGPU/OpenCL.jl/blob/409cb88e0921ffa5c12e79525fb16b2ab89025da/src/kernel.jl#L76), so for buffer(B). Does anybody have an AMD GPU with the newest driver? (@SimonDanisch maybe?)

Feb 20 '17 11:02 vchuravy

I will have in 2 days ;)

Feb 20 '17 15:02 SimonDanisch

@SimonDanisch did you manage to give this a go?

Mar 28 '17 03:03 vchuravy

I get this only with finalizers activated for Buffer... Uncommenting https://github.com/JuliaGPU/OpenCL.jl/blob/47fbbdda994188875d2e2895aaf00402126ce02b/src/buffer.jl#L18 removes the error! Maybe it's time to correctly define cconvert and friends!? ;)
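As a rough illustration of what "defining cconvert and friends" could look like: ccall first calls Base.cconvert on each argument and keeps the result alive for the duration of the call, then calls Base.unsafe_convert to obtain the raw pointer. Passing the wrapper object itself, rather than a pointer extracted beforehand, keeps the garbage collector from running the finalizer while the foreign call is in flight. The sketch below uses Julia 1.x syntax and a hypothetical Handle type backed by malloc; it stands in for a type like Buffer and is not OpenCL.jl's actual code.

```julia
# Hypothetical wrapper around a raw C allocation. It stands in for a type such
# as OpenCL.jl's Buffer and is not the package's real definition.
mutable struct Handle
    ptr::Ptr{Cvoid}
    function Handle(nbytes::Integer)
        h = new(Libc.malloc(nbytes))
        # Free the allocation once the wrapper becomes unreachable.
        finalizer(h) do obj
            Libc.free(obj.ptr)
            obj.ptr = C_NULL
        end
        return h
    end
end

# Step 1 of ccall's argument handling: return an object for ccall to keep alive.
Base.cconvert(::Type{Ptr{Cvoid}}, h::Handle) = h
# Step 2: extract the raw pointer from that (still rooted) object.
Base.unsafe_convert(::Type{Ptr{Cvoid}}, h::Handle) = h.ptr

# The Handle itself is passed to ccall, never a pointer pulled out beforehand.
function fill_bytes!(h::Handle, value::UInt8, nbytes::Integer)
    ccall(:memset, Ptr{Cvoid}, (Ptr{Cvoid}, Cint, Csize_t), h, value, nbytes)
    return h
end

h = fill_bytes!(Handle(16), 0x2a, 16)
```

Whether missing definitions of this kind are what triggers the crash reported here is not established in this thread; the sketch only shows the general pattern.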

Mar 28 '17 09:03 SimonDanisch

The same "fix" also solves https://github.com/JuliaGPU/CLBLAS.jl/pull/25

Mar 28 '17 09:03 SimonDanisch

Uh.... Not sure on what branches I was, but master actually passes all tests locally for me...

Mar 28 '17 09:03 SimonDanisch

Okay, I think the last remaining problem is that this segfaults:

for i=1:N
    device, ctx, queue = cl.create_compute_context()
    # do stuff
    ...
end

The rest seems to work fine!

Mar 28 '17 11:03 SimonDanisch

These failures might be non-deterministic, especially when finalizers are involved.

> Maybe it's time to correctly define cconvert and friends!? ;)

Definitely! Want to give this a go? ;P

> for i=1:N
>     device, ctx, queue = cl.create_compute_context()
>     # do stuff
>     ...
> end
>
> The rest seems to work fine!

Yeah, that definitely should not segfault...

Mar 29 '17 02:03 vchuravy

@SimonDanisch can you reproduce the segfault and send a log/backtrace/rr trace along?

Apr 17 '17 01:04 vchuravy

Figured it is better for me to comment here than open another issue. I am also getting segfaults every time, as well as a ReadOnlyMemoryError.

julia> versioninfo()
Julia Version 0.6.0
Commit 903644385b (2017-06-19 13:05 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: AMD FX(tm)-6300 Six-Core Processor
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Piledriver)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, bdver1)

julia> Pkg.status("OpenCL")
 - OpenCL                        0.6.1              master

julia> using OpenCL

julia> cl.platforms()
ERROR: ReadOnlyMemoryError()
Stacktrace:
 [1] clGetPlatformIDs(::Int64, ::Ptr{Void}, ::Base.RefValue{UInt32}) at /home/chris/.julia/v0.6/OpenCL/src/api.jl:17
 [2] macro expansion at /home/chris/.julia/v0.6/OpenCL/src/macros.jl:4 [inlined]
 [3] platforms() at /home/chris/.julia/v0.6/OpenCL/src/platform.jl:21

julia> cl.platforms()
ERROR: CLError(code=-1001, CL_PLATFORM_NOT_FOUND_KHR)
Stacktrace:
 [1] macro expansion at /home/chris/.julia/v0.6/OpenCL/src/macros.jl:6 [inlined]
 [2] platforms() at /home/chris/.julia/v0.6/OpenCL/src/platform.jl:21

julia> exit()

signal (11): Segmentation fault
while loading no file, in expression starting on line 0
unknown function (ip: 0x7f77db4b4db3)
_ZN4llvm2cl19generic_parser_base10findOptionEPKc at /home/chris/Documents/programs/julia/usr/bin/../lib/libLLVM-3.9.so (unknown line)
unknown function (ip: 0x7f77d9e4dcfc)
_ZN4llvm19MachinePassRegistry6RemoveEPNS_23MachinePassRegistryNodeE at /usr/lib/x86_64-linux-gnu/libLLVM-5.0.so.1 (unknown line)
__run_exit_handlers at /build/glibc-mXZSwJ/glibc-2.24/stdlib/exit.c:83
exit at /build/glibc-mXZSwJ/glibc-2.24/stdlib/exit.c:105
jl_exit at /home/chris/Documents/programs/julia/src/jl_uv.c:552
exit at ./initdefs.jl:22
unknown function (ip: 0x7f77d5ca40f8)
jl_call_fptr_internal at /home/chris/Documents/programs/julia/src/julia_internal.h:339 [inlined]
jl_call_method_internal at /home/chris/Documents/programs/julia/src/julia_internal.h:358 [inlined]
jl_apply_generic at /home/chris/Documents/programs/julia/src/gf.c:1933
do_call at /home/chris/Documents/programs/julia/src/interpreter.c:75
eval at /home/chris/Documents/programs/julia/src/interpreter.c:242
jl_interpret_toplevel_expr at /home/chris/Documents/programs/julia/src/interpreter.c:34
jl_toplevel_eval_flex at /home/chris/Documents/programs/julia/src/toplevel.c:577
jl_toplevel_eval_in at /home/chris/Documents/programs/julia/src/builtins.c:496
eval at ./boot.jl:235
unknown function (ip: 0x7f77d5b2bf3f)
jl_call_fptr_internal at /home/chris/Documents/programs/julia/src/julia_internal.h:339 [inlined]
jl_call_method_internal at /home/chris/Documents/programs/julia/src/julia_internal.h:358 [inlined]
jl_apply_generic at /home/chris/Documents/programs/julia/src/gf.c:1933
eval_user_input at ./REPL.jl:66
unknown function (ip: 0x7f77d5b99caf)
jl_call_fptr_internal at /home/chris/Documents/programs/julia/src/julia_internal.h:339 [inlined]
jl_call_method_internal at /home/chris/Documents/programs/julia/src/julia_internal.h:358 [inlined]
jl_apply_generic at /home/chris/Documents/programs/julia/src/gf.c:1933
macro expansion at ./REPL.jl:97 [inlined]
#1 at ./event.jl:73
unknown function (ip: 0x7f77bc9952bf)
jl_call_fptr_internal at /home/chris/Documents/programs/julia/src/julia_internal.h:339 [inlined]
jl_call_method_internal at /home/chris/Documents/programs/julia/src/julia_internal.h:358 [inlined]
jl_apply_generic at /home/chris/Documents/programs/julia/src/gf.c:1933
jl_apply at /home/chris/Documents/programs/julia/src/julia.h:1424 [inlined]
start_task at /home/chris/Documents/programs/julia/src/task.c:267
unknown function (ip: 0xffffffffffffffff)
Allocations: 6082652 (Pool: 6081287; Big: 1365); GC: 12
Segmentation fault (core dumped)

I know this isn't totally an OpenCL.jl issue, because ArrayFire gives me the same error and segfault:

julia> Pkg.test("ArrayFire")
INFO: Computing test dependencies for ArrayFire...
INFO: No packages to install, update or remove
INFO: Testing ArrayFire
ERROR: LoadError: InitError: ReadOnlyMemoryError()
Stacktrace:
 [1] __init__() at /home/chris/.julia/v0.6/ArrayFire/src/util.jl:62
 [2] _include_from_serialized(::String) at ./loading.jl:157
 [3] _require_from_serialized(::Int64, ::Symbol, ::String, ::Bool) at ./loading.jl:200
 [4] _require(::Symbol) at ./loading.jl:491
 [5] require(::Symbol) at ./loading.jl:398
 [6] include_from_node1(::String) at ./loading.jl:569
 [7] include(::String) at ./sysimg.jl:14
 [8] process_options(::Base.JLOptions) at ./client.jl:305
 [9] _start() at ./client.jl:371
during initialization of module ArrayFire
while loading /home/chris/.julia/v0.6/ArrayFire/test/runtests.jl, in expression starting on line 2

signal (11): Segmentation fault
while loading no file, in expression starting on line 0
unknown function (ip: 0x7fb32f42cdb3)
_ZN4llvm2cl19generic_parser_base10findOptionEPKc at /home/chris/Documents/programs/julia/usr/bin/../lib/libLLVM-3.9.so (unknown line)
unknown function (ip: 0x7fb32ddc5cfc)
_ZN4llvm19MachinePassRegistry6RemoveEPNS_23MachinePassRegistryNodeE at /usr/lib/x86_64-linux-gnu/libLLVM-5.0.so.1 (unknown line)
__run_exit_handlers at /build/glibc-mXZSwJ/glibc-2.24/stdlib/exit.c:83
exit at /build/glibc-mXZSwJ/glibc-2.24/stdlib/exit.c:105
jl_exit at /home/chris/Documents/programs/julia/src/jl_uv.c:552
_start at ./client.jl:419
unknown function (ip: 0x7fb329ae6208)
jl_call_fptr_internal at /home/chris/Documents/programs/julia/src/julia_internal.h:339 [inlined]
jl_call_method_internal at /home/chris/Documents/programs/julia/src/julia_internal.h:358 [inlined]
jl_apply_generic at /home/chris/Documents/programs/julia/src/gf.c:1933
jl_apply at /home/chris/Documents/programs/julia/ui/../src/julia.h:1424 [inlined]
true_main at /home/chris/Documents/programs/julia/ui/repl.c:127
main at /home/chris/Documents/programs/julia/ui/repl.c:264
__libc_start_main at /build/glibc-mXZSwJ/glibc-2.24/csu/../csu/libc-start.c:291
unknown function (ip: 0x561bd264f5e9)
Allocations: 1521654 (Pool: 1520517; Big: 1137); GC: 0
==========================================================================================[ ERROR: ArrayFire ]==========================================================================================

failed process: Process(`/home/chris/Documents/programs/julia/usr/bin/julia -Cnative -J/home/chris/Documents/programs/julia/usr/lib/julia/sys.so --compile=yes --depwarn=yes --check-bounds=yes --code-coverage=none --color=yes --compilecache=yes /home/chris/.julia/v0.6/ArrayFire/test/runtests.jl`, ProcessSignaled(11)) [0]

========================================================================================================================================================================================================
INFO: No packages to install, update or remove
ERROR: ArrayFire had test errors

Yet I am at a loss as to where to go from here. When I run clinfo from the command line, it lists "Device OpenCL C Version" as OpenCL C 1.1, but the "ICD loader Profile" is listed as OpenCL 2.1 -- is this an issue? Output of clinfo:

clinfo
Number of platforms                               1
  Platform Name                                   Clover
  Platform Vendor                                 Mesa
  Platform Version                                OpenCL 1.1 Mesa 17.3.0-devel
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd
  Platform Extensions function suffix             MESA

  Platform Name                                   Clover
Number of devices                                 1
  Device Name                                     AMD Radeon(TM) HD8800 Series (PITCAIRN / DRM 3.9.0 / 4.10.0-35-generic, LLVM 5.0.0)
  Device Vendor                                   AMD
  Device Vendor ID                                0x1002
  Device Version                                  OpenCL 1.1 Mesa 17.3.0-devel
  Driver Version                                  17.3.0-devel
  Device OpenCL C Version                         OpenCL C 1.1 
  Device Type                                     GPU
  Device Profile                                  FULL_PROFILE
  Max compute units                               20
  Max clock frequency                             975MHz
  Max work item dimensions                        3
  Max work item sizes                             256x256x256
  Max work group size                             256
  Preferred work group size multiple              64
  Preferred / native vector sizes                 
    char                                                16 / 16      
    short                                                8 / 8       
    int                                                  4 / 4       
    long                                                 2 / 2       
    half                                                 8 / 8        (cl_khr_fp16)
    float                                                4 / 4       
    double                                               2 / 2        (cl_khr_fp64)
  Half-precision Floating-point support           (cl_khr_fp16)
    Denormals                                     No
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Single-precision Floating-point support         (core)
    Denormals                                     No
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Address bits                                    64, Little-Endian
  Global memory size                              2146324480 (1.999GiB)
  Error Correction support                        No
  Max memory allocation                           1502427136 (1.399GiB)
  Unified memory for Host and Device              No
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       32768 bits (4096 bytes)
  Global Memory cache type                        None
  Image support                                   No
  Local memory type                               Local
  Local memory size                               32768 (32KiB)
  Max constant buffer size                        1502427136 (1.399GiB)
  Max number of constant args                     16
  Max size of kernel argument                     1024
  Queue properties                                
    Out-of-order execution                        No
    Profiling                                     Yes
  Profiling timer resolution                      0ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            No
  Device Available                                Yes
  Compiler Available                              Yes
  Device Extensions                               cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp64 cl_khr_fp16

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  Clover
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   Success [MESA]
  clCreateContext(NULL, ...) [default]            Success [MESA]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  Success (1)
    Platform Name                                 Clover
    Device Name                                   AMD Radeon(TM) HD8800 Series (PITCAIRN / DRM 3.9.0 / 4.10.0-35-generic, LLVM 5.0.0)
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (1)
    Platform Name                                 Clover
    Device Name                                   AMD Radeon(TM) HD8800 Series (PITCAIRN / DRM 3.9.0 / 4.10.0-35-generic, LLVM 5.0.0)

ICD loader properties
  ICD loader Name                                 OpenCL ICD Loader
  ICD loader Vendor                               OCL Icd free software
  ICD loader Version                              2.2.11
  ICD loader Profile                              OpenCL 2.1

I have some computational ideas that could be extremely parallelized, so I think they have a lot of potential on a GPU, but I'd like to experiment on the one I have:

$ lspci  -v -s  $(lspci | grep ' VGA ' | cut -d" " -f 1)
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Curacao PRO [Radeon R7 370 / R9 270/370 OEM] (prog-if 00 [VGA controller])
	Subsystem: Gigabyte Technology Co., Ltd Curacao PRO [Radeon R7 370 / R9 270/370 OEM]
	Flags: bus master, fast devsel, latency 0, IRQ 31
	Memory at c0000000 (64-bit, prefetchable) [size=256M]
	Memory at fea00000 (64-bit, non-prefetchable) [size=256K]
	I/O ports at e000 [size=256]
	Expansion ROM at 000c0000 [disabled] [size=128K]
	Capabilities: <access denied>
	Kernel driver in use: amdgpu
	Kernel modules: radeon, amdgpu

before shelling out a lot of money on a new high-performing one. (As a note on the kernel driver: I tried the radeon earlier, and that didn't work either.)

All this is well outside of my experience, but if anyone has suggestions on where I can go or what I can read to learn about what's going on, let me know. Struggling as much as I have to get things to work here also makes me hesitant to invest in a new GPU (and on that note, I would prefer to support OpenCL over proprietary CUDA). I'd also be willing to switch Linux distros. I'm on Ubuntu now, but Arch Linux, for example, can use AMD Catalyst.

EDIT: ArrayFire's tests all passed:

$ cp -r /usr/local/share/ArrayFire/examples .
$ cd examples
$ mkdir build
$ cd build
$ cmake ..
$ make
[  0%] Building CXX object CMakeFiles/example_basic_opencl.dir/unified/basic.cpp.o
[  1%] Linking CXX executable unified/basic_opencl
[  1%] Built target example_basic_opencl
[  1%] Building CXX object CMakeFiles/example_rbm_opencl.dir/machine_learning/rbm.cpp.o
---snip---
[ 99%] Built target example_deep_belief_net_unified
[100%] Building CXX object CMakeFiles/example_geneticalgorithm_unified.dir/machine_learning/geneticalgorithm.cpp.o
[100%] Linking CXX executable machine_learning/geneticalgorithm_unified
[100%] Built target example_geneticalgorithm_unified

(Not the Julia wrapper -- that one failed as above.)

I also get:

$ make
gcc   -std=gnu99 -ocl-demo cl-demo.c cl-helper.c -lrt -lOpenCL
cl-helper.c: In function ‘create_context_on’:
cl-helper.c:351:15: warning: ‘clCreateCommandQueue’ is deprecated [-Wdeprecated-declarations]
               *queue = clCreateCommandQueue(*ctx, dev, qprops, &status);
               ^
In file included from cl-helper.h:36:0,
                 from cl-helper.c:26:
/usr/include/CL/cl.h:1427:1: note: declared here
 clCreateCommandQueue(cl_context                     /* context */,
 ^~~~~~~~~~~~~~~~~~~~
gcc   -std=gnu99 -oprint-devices print-devices.c cl-helper.c -lrt -lOpenCL
cl-helper.c: In function ‘create_context_on’:
cl-helper.c:351:15: warning: ‘clCreateCommandQueue’ is deprecated [-Wdeprecated-declarations]
               *queue = clCreateCommandQueue(*ctx, dev, qprops, &status);
               ^
In file included from cl-helper.h:36:0,
                 from cl-helper.c:26:
/usr/include/CL/cl.h:1427:1: note: declared here
 clCreateCommandQueue(cl_context                     /* context */,
 ^~~~~~~~~~~~~~~~~~~~
$ ./print-devices
platform 0: vendor 'Mesa'
  device 0: 'AMD Radeon(TM) HD8800 Series (PITCAIRN / DRM 3.9.0 / 4.10.0-35-generic, LLVM 5.0.0)'
$ ./cl-demo 1000000 10
Choose platform:
[0] Mesa
Enter choice: 0
Choose device:
[0] AMD Radeon(TM) HD8800 Series (PITCAIRN / DRM 3.9.0 / 4.10.0-35-generic, LLVM 5.0.0)
Enter choice: 0
---------------------------------------------------------------------
NAME: AMD Radeon(TM) HD8800 Series (PITCAIRN / DRM 3.9.0 / 4.10.0-35-generic, LLVM 5.0.0)
VENDOR: AMD
PROFILE: FULL_PROFILE
VERSION: OpenCL 1.1 Mesa 17.3.0-devel
EXTENSIONS: cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp64 cl_khr_fp16
DRIVER_VERSION: 17.3.0-devel

Type: GPU 
EXECUTION_CAPABILITIES: Kernel 
GLOBAL_MEM_CACHE_TYPE: None (0)
CL_DEVICE_LOCAL_MEM_TYPE: Local (1)
SINGLE_FP_CONFIG: 0x6
QUEUE_PROPERTIES: 0x2

VENDOR_ID: 4098
MAX_COMPUTE_UNITS: 20
MAX_WORK_ITEM_DIMENSIONS: 3
MAX_WORK_GROUP_SIZE: 256
PREFERRED_VECTOR_WIDTH_CHAR: 16
PREFERRED_VECTOR_WIDTH_SHORT: 8
PREFERRED_VECTOR_WIDTH_INT: 4
PREFERRED_VECTOR_WIDTH_LONG: 2
PREFERRED_VECTOR_WIDTH_FLOAT: 4
PREFERRED_VECTOR_WIDTH_DOUBLE: 2
MAX_CLOCK_FREQUENCY: 975
ADDRESS_BITS: 64
MAX_MEM_ALLOC_SIZE: 1502427136
IMAGE_SUPPORT: 0
MAX_READ_IMAGE_ARGS: 32
MAX_WRITE_IMAGE_ARGS: 32
IMAGE2D_MAX_WIDTH: 32768
IMAGE2D_MAX_HEIGHT: 32768
IMAGE3D_MAX_WIDTH: 4096
IMAGE3D_MAX_HEIGHT: 4096
IMAGE3D_MAX_DEPTH: 4096
MAX_SAMPLERS: 32
MAX_PARAMETER_SIZE: 1024
MEM_BASE_ADDR_ALIGN: 32768
MIN_DATA_TYPE_ALIGN_SIZE: 128
GLOBAL_MEM_CACHELINE_SIZE: 0
GLOBAL_MEM_CACHE_SIZE: 0
GLOBAL_MEM_SIZE: 2146324480
MAX_CONSTANT_BUFFER_SIZE: 1502427136
MAX_CONSTANT_ARGS: 16
LOCAL_MEM_SIZE: 32768
ERROR_CORRECTION_SUPPORT: 0
PROFILING_TIMER_RESOLUTION: 0
ENDIAN_LITTLE: 1
AVAILABLE: 1
COMPILER_AVAILABLE: 1
MAX_WORK_GROUP_SIZES: 256 256 256 
---------------------------------------------------------------------
*** build of 'sum' on 'AMD Radeon(TM) HD8800 Series (PITCAIRN / DRM 3.9.0 / 4.10.0-35-generic, LLVM 5.0.0)' said:

*** (end of message)
0.000634 s
18.924809 GB/s
GOOD

Which are tests that were suggested here. So tests within Julia fail, and tests outside of it have passed.

EDIT: On a different Ubuntu machine (this one 16.04; the one from above was 17.04), I followed the instructions here (the primary difference was Paulo Miguel's PPA instead of Oibaf's). I'm again able to pass tests, compile and run code on the GPU (e.g., hello.c from the above link), etc. But I now get a ReadOnlyMemoryError() when precompiling CLBLAS, this time without a segfault upon trying to exit. I haven't tested CLBLAS. Perhaps I should file an issue there.

I'll start reading about OpenCL in my spare time.

chriselrod avatar Oct 02 '17 11:10 chriselrod

I see you're using Mesa's Clover. Note that Clover is pretty much undeveloped now and, in my experience, it segfaults if you so much as look at it funny. I could never get it to work with OpenCL.jl properly; I ran into a situation similar to yours (I tried with both an R9 390 and an RX 480).

What I recommend is either using AMDGPU-PRO (I know, proprietary) if you can get it to work, or going the ROCm route (which I have not tested yet, but will soon). I don't think your FX CPU is going to cut it for ROCm unfortunately, as FX CPUs use PCIe 2.0, which does not support the Atomic Completions that ROCm requires.

AMDGPU-PRO, however, works nicely if you can get it to work with your kernel. It is installable as a .deb that you can get from AMD's website, and is very compatible with OpenCL.jl and similar packages.

jpsamaroo avatar Oct 03 '17 18:10 jpsamaroo

Okay, cool -- trying amdgpu-pro on the second computer, I'm able to build CLBLAS but ran into two issues:

  1. https://github.com/JuliaGPU/OpenCL.jl/issues/123
  2. The graphics no longer seemed to be working, and I could not log in to the GUI (screen would flash and I'd be back at the login screen). I switched to tty1 to test.

"1)" is the bigger problem because I can work around "2)" through either (a) bringing a laptop for a proper screen or (b) simply un- and re-installing amdgpu-pro as needed. Per that issue, I'll try removing all of my other OpenCL packages and try again.

EDIT:

  a) Forgot to mention earlier: using GPUArrays, I also got an error saying no method similar(GPUArray) found when trying to follow the example in the readme: GPUArray(rand(Float32, 32, 32)).
  b) No luck after removing mesa and purging the repository.

Also, neither the GPU of the computer I'm currently on (Radeon HD 8490 / R5 235X OEM) nor the one in my home computer is listed among the supported GPUs on the AMDGPU-PRO page. I suspect that may be a problem. Both GPUs you mentioned, however, are supported.

chriselrod avatar Oct 03 '17 19:10 chriselrod

Regarding 1), it could definitely be the GPU you're using. If AMDGPU-PRO doesn't support it, I unfortunately can't help you, other than recommending you upgrade to something newer, like the Polaris or Vega line.

Regarding 2), it's probably some conflict between your AMDGPU kernel driver and the AMDGPU-PRO kernel module (it only works on specific versions). This is also possibly what's causing 1), if you're lucky (since you can just downgrade/upgrade your kernel to one that it supports).

And regarding your edit a) about GPUArrays, you may want to file an issue about that. I know Simon has been doing a lot of work on GPUArrays that moves functionality out of that package into other (more specific) packages, so he probably just forgot to update the docs. EDIT: Actually that might be a real bug, not just docs. Please file a ticket either way :smile:

Either way, if you want to take this offline (since this isn't really an OpenCL.jl issue), feel free to email me at jpsamaroo -at- gmail -dot- com.

jpsamaroo avatar Oct 03 '17 22:10 jpsamaroo

I see a segmentation fault as well:

julia> Pkg.test("OpenCL")
INFO: Testing OpenCL
Test Summary: | Pass  Total
layout        |    2      2
Platform Info: Error During Test
  Test threw an exception of type ReadOnlyMemoryError
  Expression: length(cl.platforms()) == cl.num_platforms()
  ReadOnlyMemoryError()
  Stacktrace:
   [1] clGetPlatformIDs(::Int64, ::Ptr{Void}, ::Base.RefValue{UInt32}) at /home/sambit/.julia/v0.6/OpenCL/src/api.jl:15
   [2] macro expansion at /home/sambit/.julia/v0.6/OpenCL/src/macros.jl:4 [inlined]
   [3] platforms() at /home/sambit/.julia/v0.6/OpenCL/src/platform.jl:21
   [4] macro expansion at /home/sambit/.julia/v0.6/OpenCL/test/test_platform.jl:3 [inlined]
   [5] macro expansion at ./test.jl:860 [inlined]
   [6] macro expansion at /home/sambit/.julia/v0.6/OpenCL/test/test_platform.jl:2 [inlined]
   [7] macro expansion at ./test.jl:860 [inlined]
   [8] anonymous at ./<missing>:?
Platform Info: Error During Test
  Got an exception of type OpenCL.cl.CLError outside of a @test
  CLError(code=-1001, CL_PLATFORM_NOT_FOUND_KHR)
  Stacktrace:
   [1] macro expansion at /home/sambit/.julia/v0.6/OpenCL/src/macros.jl:6 [inlined]
   [2] platforms() at /home/sambit/.julia/v0.6/OpenCL/src/platform.jl:21
   [3] macro expansion at /home/sambit/.julia/v0.6/OpenCL/test/test_platform.jl:4 [inlined]
   [4] macro expansion at ./test.jl:860 [inlined]
   [5] macro expansion at /home/sambit/.julia/v0.6/OpenCL/test/test_platform.jl:2 [inlined]
   [6] macro expansion at ./test.jl:860 [inlined]
   [7] anonymous at ./<missing>:?
   [8] include_from_node1(::String) at ./loading.jl:576
   [9] include(::String) at ./sysimg.jl:14
   [10] include_from_node1(::String) at ./loading.jl:576
   [11] include(::String) at ./sysimg.jl:14
   [12] process_options(::Base.JLOptions) at ./client.jl:305
   [13] _start() at ./client.jl:371
Platform Equality: Error During Test
  Got an exception of type OpenCL.cl.CLError outside of a @test
  CLError(code=-1001, CL_PLATFORM_NOT_FOUND_KHR)
  Stacktrace:
   [1] macro expansion at /home/sambit/.julia/v0.6/OpenCL/src/macros.jl:6 [inlined]
   [2] platforms() at /home/sambit/.julia/v0.6/OpenCL/src/platform.jl:21
   [3] macro expansion at /home/sambit/.julia/v0.6/OpenCL/test/test_platform.jl:17 [inlined]
   [4] macro expansion at ./test.jl:860 [inlined]
   [5] macro expansion at /home/sambit/.julia/v0.6/OpenCL/test/test_platform.jl:16 [inlined]
   [6] macro expansion at ./test.jl:860 [inlined]
   [7] anonymous at ./<missing>:?
   [8] include_from_node1(::String) at ./loading.jl:576
   [9] include(::String) at ./sysimg.jl:14
   [10] include_from_node1(::String) at ./loading.jl:576
   [11] include(::String) at ./sysimg.jl:14
   [12] process_options(::Base.JLOptions) at ./client.jl:305
   [13] _start() at ./client.jl:371
Test Summary:       | Error  Total
OpenCL.Platform     |     3      3
  Platform Info     |     2      2
  Platform Equality |     1      1
ERROR: LoadError: LoadError: Some tests did not pass: 0 passed, 0 failed, 3 errored, 0 broken.
while loading /home/sambit/.julia/v0.6/OpenCL/test/test_platform.jl, in expression starting on line 1
while loading /home/sambit/.julia/v0.6/OpenCL/test/runtests.jl, in expression starting on line 24

signal (11): Segmentation fault
while loading no file, in expression starting on line 0
unknown function (ip: 0x7f5b9f255cb9)
_ZN4llvm2cl19generic_parser_base10findOptionEPKc at /usr/local/bin/../lib/julia/libLLVM-3.9.so (unknown line)
unknown function (ip: 0x7f5b9dc5ce8c)
_ZN4llvm19MachinePassRegistry6RemoveEPNS_23MachinePassRegistryNodeE at /usr/lib/x86_64-linux-gnu/libLLVM-5.0.so.1 (unknown line)
__run_exit_handlers at /build/glibc-itYbWN/glibc-2.26/stdlib/exit.c:83
exit at /build/glibc-itYbWN/glibc-2.26/stdlib/exit.c:105
jl_exit at /buildworker/worker/package_linux64/build/src/jl_uv.c:552
_start at ./client.jl:419
unknown function (ip: 0x7f5b99c5fd28)
jl_call_fptr_internal at /buildworker/worker/package_linux64/build/src/julia_internal.h:339 [inlined]
jl_call_method_internal at /buildworker/worker/package_linux64/build/src/julia_internal.h:358 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:1926
jl_apply at /buildworker/worker/package_linux64/build/ui/../src/julia.h:1424 [inlined]
true_main at /buildworker/worker/package_linux64/build/ui/repl.c:127
main at /buildworker/worker/package_linux64/build/ui/repl.c:264
__libc_start_main at /build/glibc-itYbWN/glibc-2.26/csu/../csu/libc-start.c:308
unknown function (ip: 0x4016bc)
Allocations: 2324417 (Pool: 2323188; Big: 1229); GC: 2
======================================================================================[ ERROR: OpenCL ]======================================================================================

failed process: Process(`/usr/local/bin/julia -Cx86-64 -J/usr/local/lib/julia/sys.so --compile=yes --depwarn=yes --check-bounds=yes --code-coverage=none --color=yes --compilecache=yes /home/sambit/.julia/v0.6/OpenCL/test/runtests.jl`, ProcessSignaled(11)) [0]

=============================================================================================================================================================================================
INFO: No packages to install, update or remove
ERROR: OpenCL had test errors

julia> 
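
The exit-time backtrace above contains frames from two different LLVM shared libraries: libLLVM-3.9.so under Julia's lib directory and the system libLLVM-5.0.so.1. As a rough diagnostic sketch, the helper below lists the distinct libLLVM libraries named in a saved log. The function name `list_llvm_libs` and the log file name in the usage comment are hypothetical, and seeing more than one library is only a hint of a conflict, not proof of the cause.

```shell
#!/bin/sh
# List the distinct libLLVM shared objects mentioned in a saved backtrace.
# $1 is the path to a text file holding the crash output.
list_llvm_libs() {
    # -o prints only the matching part; sort -u removes duplicates.
    grep -oE 'libLLVM-[0-9.]+\.so(\.[0-9]+)*' "$1" | sort -u
}

# Usage (hypothetical file name):
#   list_llvm_libs segfault.log
```

If this prints more than one library, two LLVM builds were loaded into the same process, which may be worth ruling out before looking elsewhere.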

sambitdash avatar Feb 16 '18 05:02 sambitdash

clinfo:

Number of platforms                               1
  Platform Name                                   Clover
  Platform Vendor                                 Mesa
  Platform Version                                OpenCL 1.1 Mesa 17.2.8
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd
  Platform Extensions function suffix             MESA

  Platform Name                                   Clover
Number of devices                                 1
  Device Name                                     AMD Radeon (TM) R7 M360 (AMD ICELAND / DRM 3.18.0 / 4.13.0-32-generic, LLVM 5.0.0)
  Device Vendor                                   AMD
  Device Vendor ID                                0x1002
  Device Version                                  OpenCL 1.1 Mesa 17.2.8
  Driver Version                                  17.2.8
  Device OpenCL C Version                         OpenCL C 1.1 
  Device Type                                     GPU
  Device Profile                                  FULL_PROFILE
  Max compute units                               6
  Max clock frequency                             980MHz
  Max work item dimensions                        3
  Max work item sizes                             256x256x256
  Max work group size                             256
  Preferred work group size multiple              64
  Preferred / native vector sizes                 
    char                                                16 / 16      
    short                                                8 / 8       
    int                                                  4 / 4       
    long                                                 2 / 2       
    half                                                 0 / 0        (n/a)
    float                                                4 / 4       
    double                                               2 / 2        (cl_khr_fp64)
  Half-precision Floating-point support           (n/a)
  Single-precision Floating-point support         (core)
    Denormals                                     No
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Address bits                                    64, Little-Endian
  Global memory size                              4293337088 (3.998GiB)
  Error Correction support                        No
  Max memory allocation                           3005335961 (2.799GiB)
  Unified memory for Host and Device              No
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       1024 bits (128 bytes)
  Global Memory cache type                        None
  Image support                                   No
  Local memory type                               Local
  Local memory size                               32768 (32KiB)
  Max constant buffer size                        2147483647 (2GiB)
  Max number of constant args                     16
  Max size of kernel argument                     1024
  Queue properties                                
    Out-of-order execution                        No
    Profiling                                     Yes
  Profiling timer resolution                      0ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            No
  Device Available                                Yes
  Compiler Available                              Yes
  Device Extensions                               cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_fp64

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  Clover
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   Success [MESA]
  clCreateContext(NULL, ...) [default]            Success [MESA]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  Success (1)
    Platform Name                                 Clover
    Device Name                                   AMD Radeon (TM) R7 M360 (AMD ICELAND / DRM 3.18.0 / 4.13.0-32-generic, LLVM 5.0.0)
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (1)
    Platform Name                                 Clover
    Device Name                                   AMD Radeon (TM) R7 M360 (AMD ICELAND / DRM 3.18.0 / 4.13.0-32-generic, LLVM 5.0.0)

ICD loader properties
  ICD loader Name                                 OpenCL ICD Loader
  ICD loader Vendor                               OCL Icd free software
  ICD loader Version                              2.2.11
  ICD loader Profile                              OpenCL 2.1

sambitdash avatar Feb 16 '18 05:02 sambitdash

Try building Julia from source with this in your Make.user:

override USE_LLVM_SHLIB = 0
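
A minimal sketch of adding that line without duplicating it, assuming a Julia source checkout whose path is held in `JULIA_SRC` (a hypothetical variable; the function name is also just for illustration). It only edits Make.user; rebuilding Julia afterwards is still up to you, and a full clean build may be needed for the change to take effect.

```shell
#!/bin/sh
# Append the static-LLVM override to Make.user in a Julia source tree,
# unless that exact line is already present.
add_static_llvm_override() {
    src_dir="$1"
    line='override USE_LLVM_SHLIB = 0'
    touch "$src_dir/Make.user"
    grep -qxF "$line" "$src_dir/Make.user" || printf '%s\n' "$line" >> "$src_dir/Make.user"
}

# JULIA_SRC is a hypothetical variable pointing at your Julia checkout.
if [ -n "${JULIA_SRC:-}" ] && [ -d "$JULIA_SRC" ]; then
    add_static_llvm_override "$JULIA_SRC"
    # Rebuild Julia from this tree afterwards.
fi
```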

I had a lot of different Mesa-related segfaults that were all solved by statically linking Julia and LLVM. For example, running Pkg.test("OpenCL") from Julia built normally (dynamically linked):

julia> Pkg.test("OpenCL")
INFO: Testing OpenCL
Test Summary: | Pass  Total
layout        |    2      2
Cannot find target for triple amdgcn-- No available targets are compatible with this triple.

signal (11): Segmentation fault
while loading /home/celrod/.julia/v0.6/OpenCL/test/test_platform.jl, in expression starting on line 1
LLVMCreateTargetMachine at /home/celrod/Documents/prog/julia/usr/bin/../lib/libLLVM-3.9.so (unknown line)
unknown function (ip: 0x7f0707c55561)
unknown function (ip: 0x7f0707c0479a)
radeon_drm_winsys_create at /usr/lib64/gallium-pipe/pipe_radeonsi.so (unknown line)
unknown function (ip: 0x7f0707b29fa0)
unknown function (ip: 0x7f0728892600)
unknown function (ip: 0x7f0728892cff)
unknown function (ip: 0x7f073e730c12)
unknown function (ip: 0x7f073e735b69)
_dl_catch_error at /lib64/libc.so.6 (unknown line)
unknown function (ip: 0x7f073e735078)
unknown function (ip: 0x7f073de10f95)
_dl_catch_error at /lib64/libc.so.6 (unknown line)
unknown function (ip: 0x7f073de11714)
dlopen at /lib64/libdl.so.2 (unknown line)
unknown function (ip: 0x7f0728b6da81)
clGetPlatformIDs at /lib64/libOpenCL.so.1 (unknown line)
clGetPlatformIDs at /home/celrod/.julia/v0.6/OpenCL/src/api.jl:15
macro expansion at /home/celrod/.julia/v0.6/OpenCL/src/macros.jl:4 [inlined]
platforms at /home/celrod/.julia/v0.6/OpenCL/src/platform.jl:21
macro expansion at /home/celrod/.julia/v0.6/OpenCL/test/test_platform.jl:3 [inlined]
macro expansion at ./test.jl:860 [inlined]
macro expansion at /home/celrod/.julia/v0.6/OpenCL/test/test_platform.jl:2 [inlined]
macro expansion at ./test.jl:860 [inlined]
anonymous at ./<missing> (unknown line)
jl_call_fptr_internal at /home/celrod/Documents/prog/julia/src/julia_internal.h:339 [inlined]
jl_call_method_internal at /home/celrod/Documents/prog/julia/src/julia_internal.h:358 [inlined]
jl_toplevel_eval_flex at /home/celrod/Documents/prog/julia/src/toplevel.c:589
jl_parse_eval_all at /home/celrod/Documents/prog/julia/src/ast.c:873
jl_load at /home/celrod/Documents/prog/julia/src/toplevel.c:616
include_from_node1 at ./loading.jl:576
unknown function (ip: 0x7f07140bd8a2)
jl_call_fptr_internal at /home/celrod/Documents/prog/julia/src/julia_internal.h:339 [inlined]
jl_call_method_internal at /home/celrod/Documents/prog/julia/src/julia_internal.h:358 [inlined]
jl_apply_generic at /home/celrod/Documents/prog/julia/src/gf.c:1926
include at ./sysimg.jl:14
unknown function (ip: 0x7f0731931abb)
jl_call_fptr_internal at /home/celrod/Documents/prog/julia/src/julia_internal.h:339 [inlined]
jl_call_method_internal at /home/celrod/Documents/prog/julia/src/julia_internal.h:358 [inlined]
jl_apply_generic at /home/celrod/Documents/prog/julia/src/gf.c:1926
do_call at /home/celrod/Documents/prog/julia/src/interpreter.c:75
eval at /home/celrod/Documents/prog/julia/src/interpreter.c:242
jl_interpret_toplevel_expr at /home/celrod/Documents/prog/julia/src/interpreter.c:34
jl_toplevel_eval_flex at /home/celrod/Documents/prog/julia/src/toplevel.c:577
jl_eval_module_expr at /home/celrod/Documents/prog/julia/src/toplevel.c:205
jl_toplevel_eval_flex at /home/celrod/Documents/prog/julia/src/toplevel.c:480
jl_parse_eval_all at /home/celrod/Documents/prog/julia/src/ast.c:873
jl_load at /home/celrod/Documents/prog/julia/src/toplevel.c:616
include_from_node1 at ./loading.jl:576
unknown function (ip: 0x7f0731a9a10b)
jl_call_fptr_internal at /home/celrod/Documents/prog/julia/src/julia_internal.h:339 [inlined]
jl_call_method_internal at /home/celrod/Documents/prog/julia/src/julia_internal.h:358 [inlined]
jl_apply_generic at /home/celrod/Documents/prog/julia/src/gf.c:1926
include at ./sysimg.jl:14
unknown function (ip: 0x7f0731931abb)
jl_call_fptr_internal at /home/celrod/Documents/prog/julia/src/julia_internal.h:339 [inlined]
jl_call_method_internal at /home/celrod/Documents/prog/julia/src/julia_internal.h:358 [inlined]
jl_apply_generic at /home/celrod/Documents/prog/julia/src/gf.c:1926
process_options at ./client.jl:305
_start at ./client.jl:371
unknown function (ip: 0x7f0731aa8948)
jl_call_fptr_internal at /home/celrod/Documents/prog/julia/src/julia_internal.h:339 [inlined]
jl_call_method_internal at /home/celrod/Documents/prog/julia/src/julia_internal.h:358 [inlined]
jl_apply_generic at /home/celrod/Documents/prog/julia/src/gf.c:1926
jl_apply at /home/celrod/Documents/prog/julia/ui/../src/julia.h:1424 [inlined]
true_main at /home/celrod/Documents/prog/julia/ui/repl.c:127
main at /home/celrod/Documents/prog/julia/ui/repl.c:264
__libc_start_main at /lib64/libc.so.6 (unknown line)
unknown function (ip: 0x401519)
Allocations: 3989200 (Pool: 3987906; Big: 1294); GC: 6
=================================================================[ ERROR: OpenCL ]=================================================================

failed process: Process(`/home/celrod/Documents/prog/julia/usr/bin/julia -Cnative -J/home/celrod/Documents/prog/julia/usr/lib/julia/sys.so --compile=yes --depwarn=yes --check-bounds=yes --code-coverage=none --color=yes --compilecache=yes /home/celrod/.julia/v0.6/OpenCL/test/runtests.jl`, ProcessSignaled(11)) [0]

===================================================================================================================================================
INFO: No packages to install, update or remove
ERROR: OpenCL had test errors

and when running it with a statically linked LLVM:

julia> Pkg.test("OpenCL")
INFO: Testing OpenCL
Test Summary: | Pass  Total
layout        |    2      2
Test Summary:   | Pass  Total
OpenCL.Platform |   13     13
Test Summary:  | Pass  Total
OpenCL.Context |   27     27
Test Summary: | Pass  Total
OpenCL.Device |   40     40
Test Summary:   | Pass  Total
OpenCL.CmdQueue |   21     21
Test Summary: | Pass  Total
OpenCL.Minver |   10     10
Test Callback
Test Summary: | Pass  Total
OpenCL.Event  |   16     16
Test Summary:  | Pass  Total
OpenCL.Program |   22     22
OpenCL.Kernel info: Error During Test
  Test threw an exception of type OutOfMemoryError
  Expression: typeof(k[:attributes]) == String
  OutOfMemoryError()
  Stacktrace:
   [1] info(::OpenCL.cl.Kernel, ::Symbol) at /home/celrod/.julia/v0.6/OpenCL/src/kernel.jl:437
   [2] macro expansion at /home/celrod/.julia/v0.6/OpenCL/test/test_kernel.jl:53 [inlined]
   [3] macro expansion at ./test.jl:860 [inlined]
   [4] macro expansion at /home/celrod/.julia/v0.6/OpenCL/test/test_kernel.jl:39 [inlined]
   [5] macro expansion at ./test.jl:860 [inlined]
   [6] anonymous at ./<missing>:?
OpenCL.Kernel: Test Failed
  Expression: r == [1.0f0, 2.0f0, 3.0f0, 22.0f0]
   Evaluated: Float32[3.0, 0.0, 0.0, 0.0] == Float32[1.0, 2.0, 3.0, 22.0]
Stacktrace:
 [1] macro expansion at /home/celrod/.julia/v0.6/OpenCL/test/test_kernel.jl:258 [inlined]
 [2] macro expansion at ./test.jl:860 [inlined]
 [3] anonymous at ./<missing>:?
Test Summary:                      | Pass  Fail  Error  Total
OpenCL.Kernel                      |   37     1      1     39
  OpenCL.Kernel constructor        |    2                   2
  OpenCL.Kernel info               |    4            1      5
  OpenCL.Kernel mem/workgroup size |   14                  14
  OpenCL.Kernel set_arg!/set_args! |    9                   9
  OpenCL.Kernel enqueue_kernel     |    7                   7
ERROR: LoadError: LoadError: Some tests did not pass: 37 passed, 1 failed, 1 errored, 0 broken.
while loading /home/celrod/.julia/v0.6/OpenCL/test/test_kernel.jl, in expression starting on line 7
while loading /home/celrod/.julia/v0.6/OpenCL/test/runtests.jl, in expression starting on line 31
=================================================================[ ERROR: OpenCL ]=================================================================

failed process: Process(`/home/celrod/Documents/prog/julia-static-llvm/usr/bin/julia -Cnative -J/home/celrod/Documents/prog/julia-static-llvm/usr/lib/julia/sys.so --compile=yes --depwarn=yes --check-bounds=yes --code-coverage=none --color=yes --compilecache=yes /home/celrod/.julia/v0.6/OpenCL/test/runtests.jl`, ProcessExited(1)) [1]

===================================================================================================================================================
INFO: No packages to install, update or remove
ERROR: OpenCL had test errors

chriselrod avatar Feb 16 '18 07:02 chriselrod