Strange std::runtime_error - kernel queueing failed during initilization
The first run hung indefinitely, so I restarted it and got this error.
$ ./profanity.x64 --matching dead
Mode: matching
Devices:
GPU0: Intel(R) HD Graphics 630, 1610612736 bytes available, 24 compute units (precompiled = yes)
GPU1: AMD Radeon Pro 560 Compute Engine, 4294967296 bytes available, 16 compute units (precompiled = yes)
Initializing OpenCL...
Creating context...OK
Loading kernel from binary...OK
Building program...OK
Initializing devices...
This can take a minute or two. The number of objects initialized on each
device is equal to inverse-size * inverse-multiple. To lower
initialization time (and memory footprint) I suggest lowering the
inverse-multiple first. You can do this via the -I switch. Do note that
this might negatively impact your performance.
std::runtime_error - kernel queueing failed during initilization (res = -45)
I just deleted the cache-opencl.1023.0 file, and that helped. Now I'm trying to figure out which inverse-size option best fits my MacBook.
Is it true that with 24+16 compute units I should specify -I 40?
Devices:
GPU0: Intel(R) HD Graphics 630, 1610612736 bytes available, 24 compute units (precompiled = no)
GPU1: AMD Radeon Pro 560 Compute Engine, 4294967296 bytes available, 16 compute units (precompiled = no)
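For reference, the res = -45 in the error above is a raw OpenCL status code. In the Khronos cl.h header that value is CL_INVALID_PROGRAM_EXECUTABLE, which a kernel enqueue call returns when there is no successfully built program executable for the device. That would be consistent with a stale cached binary, although nothing in this thread confirms the cause. A minimal sketch of decoding such codes follows; errorName is a hypothetical helper, not part of profanity, and it covers only a handful of codes.

```cpp
#include <string>

// Hypothetical helper (not part of profanity): maps a few OpenCL status
// codes to their names. The numeric values are the ones defined in the
// Khronos cl.h header; this is a subset, not the full list.
std::string errorName(int res) {
    switch (res) {
        case 0:   return "CL_SUCCESS";
        case -5:  return "CL_OUT_OF_RESOURCES";
        case -6:  return "CL_OUT_OF_HOST_MEMORY";
        case -11: return "CL_BUILD_PROGRAM_FAILURE";
        case -42: return "CL_INVALID_BINARY";
        case -44: return "CL_INVALID_PROGRAM";
        case -45: return "CL_INVALID_PROGRAM_EXECUTABLE";
        case -48: return "CL_INVALID_KERNEL";
        case -54: return "CL_INVALID_WORK_GROUP_SIZE";
        // Anything not listed above is reported with its numeric value.
        default:  return "unknown (" + std::to_string(res) + ")";
    }
}
```

Calling errorName(-45) gives "CL_INVALID_PROGRAM_EXECUTABLE".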
Even with 1x40, initialization seems to take forever(?):
$ ./profanity.x64 --matching dead -I 40 -i 1
Mode: matching
Devices:
GPU0: Intel(R) HD Graphics 630, 1610612736 bytes available, 24 compute units (precompiled = no)
GPU1: AMD Radeon Pro 560 Compute Engine, 4294967296 bytes available, 16 compute units (precompiled = no)
Initializing OpenCL...
Creating context...OK
Compiling kernel...OK
Building program...OK
Saving program...OK
Initializing devices...
This can take a minute or two. The number of objects initialized on each
device is equal to inverse-size * inverse-multiple. To lower
initialization time (and memory footprint) I suggest lowering the
inverse-multiple first. You can do this via the -I switch. Do note that
this might negatively impact your performance.
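The initialization message above gives the per-device object count as inverse-size * inverse-multiple, so the count depends only on those two values. A small sketch of that arithmetic follows, assuming -i sets inverse-size (the message itself only confirms that -I sets inverse-multiple); objectsPerDevice is a hypothetical name, not a function from profanity.

```cpp
#include <cstdint>

// Hypothetical helper (not part of profanity): number of objects
// initialized on each device, per the program's own message:
// inverse-size * inverse-multiple.
// Assumption: -i is inverse-size; -I is inverse-multiple (stated in the message).
std::uint64_t objectsPerDevice(std::uint64_t inverseSize,
                               std::uint64_t inverseMultiple) {
    return inverseSize * inverseMultiple;
}
```

Under that assumption, the run with -I 40 -i 1 corresponds to objectsPerDevice(1, 40), which is 40 objects per device.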
Try skipping your integrated GPU: add -s 0 to the command-line arguments and see if that works.
johguse, what's your email? Or add it to your profile and I'll write to you.
@johguse It hangs just as long with both -s 0 and -s 1, even with -I 16 -i 1 added.
Any ideas?
I am seeing the same thing.
When I remove the cache file it gets stuck instead.
(lldb) thread info
thread #1: tid = 0x1bf68, 0x00007fff56447a16 libsystem_kernel.dylib`__psynch_cvwait + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
* frame #0: 0x00007fff56447a16 libsystem_kernel.dylib`__psynch_cvwait + 10
frame #1: 0x00007fff56610589 libsystem_pthread.dylib`_pthread_cond_wait + 732
frame #2: 0x00007fff36b7a7f8 OpenCL`___lldb_unnamed_symbol578$$OpenCL + 128
frame #3: 0x00007fff36b7a6dc OpenCL`clWaitForEvents + 185
frame #4: 0x0000000100002512 profanity.x64`Dispatcher::init() + 898
frame #5: 0x0000000100001e4e profanity.x64`Dispatcher::run() + 46
frame #6: 0x000000010000a557 profanity.x64`main + 8711
frame #7: 0x00007fff562f7015 libdyld.dylib`start + 1
It's a long shot, but does the same error occur when compiling without any optimizations? I.e., try removing the -O2 flag from the Makefile. I'll see if I can get hold of similar hardware to reproduce the error on, which would make troubleshooting easier.
I believe this problem might be fixed in version 1.3. I didn't realize I had to manually flush the command queue in OpenCL when I wasn't issuing blocking commands.
Please let me know if the new version works for you.