gpuPlotGenerator Add macOS support

$ bin/gpuPlotGenerator.exe listPlatforms

-------------------------
GPU plot generator v4.0.3
-------------------------
Author:   Cryo
Bitcoin:  138gMBhCrNkbaiTCmUhP9HLU9xwn5QKZgD
Burst:    BURST-YA29-QCEW-QXC3-BKXDL
----
Platforms number: 1
----
Id:       0
Name:     Apple
Vendor:   Apple
Version:  OpenCL 1.2 (Apr  4 2017 19:07:42)

$ bin/gpuPlotGenerator.exe listDevices Apple

-------------------------
GPU plot generator v4.0.3
-------------------------
Author:   Cryo
Bitcoin:  138gMBhCrNkbaiTCmUhP9HLU9xwn5QKZgD
Burst:    BURST-YA29-QCEW-QXC3-BKXDL
----
Devices number: 2
----
Id:                          0
Type:                        CPU
Name:                        Intel(R) Core(TM) i7-4770HQ CPU @ 2.20GHz
Vendor:                      Intel
Version:                     OpenCL 1.2 
Driver version:              1.1
Max clock frequency:         2200MHz
Max compute units:           8
Global memory size:          16GB 0MB 0KB
Max memory allocation size:  4GB 0MB 0KB
Max work group size:         1024
Local memory size:           32KB
Max work-item sizes:         (1024, 1, 1)
----
Id:                          1
Type:                        GPU
Name:                        Iris Pro
Vendor:                      Intel
Version:                     OpenCL 1.2 
Driver version:              1.2(Apr 22 2017 16:00:44)
Max clock frequency:         1200MHz
Max compute units:           40
Global memory size:          1GB 512MB 0KB
Max memory allocation size:  384MB 0KB
Max work group size:         512
Local memory size:           64KB
Max work-item sizes:         (512, 512, 512)

$ bin/gpuPlotGenerator.exe generate direct 123456_0_50000_5000 123456_50000_10000_2000

-------------------------
GPU plot generator v4.0.3
-------------------------
Author:   Cryo
Bitcoin:  138gMBhCrNkbaiTCmUhP9HLU9xwn5QKZgD
Burst:    BURST-YA29-QCEW-QXC3-BKXDL
----
Loading platforms...
Loading devices...
Loading devices configurations...
Initializing generation devices...
    [0] Device: Iris Pro (OpenCL 1.2 )
    [0] Device memory: 384MB
    [0] CPU memory: 384MB
Initializing generation contexts...
    [0] Path: 123456_0_50000_50000
    [0] Nonces: 0 to 49999 (12GB 212MB)
    [0] CPU memory: 1GB 226MB
    [1] Path: 123456_50000_10000_10000
    [1] Nonces: 50000 to 59999 (2GB 452MB)
    [1] CPU memory: 500MB
----
Devices number: 1
Plots files number: 2
Total nonces number: 60000
CPU memory: 2GB 86MB
----
Generating nonces...
0.00% (0/60000 remaining nonces), 0.00 nonces/minutes, ETA: 5w 6d 16h 0m 0s...Abort trap: 6

May 25 '17 06:05 k06a

How can I determine the problem? I am not really familiar with GPGPU programming.

May 25 '17 10:05 k06a

I reintegrated the include correction from constants.h (@see 1c3ab770b5b0de8e82196467d64ac6c95d05e09a).

As for macOS support, it can be achieved by symlinking the two paths and exposing them through the "OPENCL_INCLUDE" and "OPENCL_LIB" env vars. Shipping platform-dependent modifications doesn't seem like a good idea. I can add an explanation to the README.md if you want.

May 25 '17 10:05 bhamon

@bhamon how about this change?

ifeq ($(shell uname),Darwin)
LD_FLAGS = -fPIC -L$(OPENCL_LIB) -framework OpenCL -m$(PLATFORM)
else
LD_FLAGS = -fPIC -L$(OPENCL_LIB) -lOpenCL -m$(PLATFORM)
endif

May 25 '17 10:05 k06a

And can you help me with this?

Generating nonces...
0.00% (0/60000 remaining nonces), 0.00 nonces/minutes, ETA: 5w 6d 16h 0m 0s...Abort trap: 6

May 25 '17 10:05 k06a

@bhamon can you suggest how to debug this issue, or at least how to get a stack trace?

May 29 '17 16:05 k06a

Just tried:

$ lldb bin/gpuPlotGenerator.exe generate buffer xxxxxxxxxxxxxxx_0_32768_32768

Got:

Process 3519 launched: '/Users/k06a/gpuPlotGenerator/bin/gpuPlotGenerator.exe' (x86_64)
-------------------------
GPU plot generator v4.0.3
-------------------------
Author:   Cryo
Bitcoin:  138gMBhCrNkbaiTCmUhP9HLU9xwn5QKZgD
Burst:    BURST-YA29-QCEW-QXC3-BKXDL
----
Loading platforms...
Loading devices...
Loading devices configurations...
Initializing generation devices...
    [0] Device: Iris Pro (OpenCL 1.2 )
    [0] Device memory: 384MB
    [0] CPU memory: 384MB
Initializing generation contexts...
    [0] Path: xxxxxxxxxxxxxxx_0_32768_32768
    [0] Nonces: 0 to 32767 (8GB 0MB)
    [0] CPU memory: 8GB 0MB
----
Devices number: 1
Plots files number: 1
Total nonces number: 32768
CPU memory: 8GB 384MB
----
Generating nonces...
0.00% (0/32768 remaining nonces), 0.00 nonces/minutes, ETA: 3w 1d 18h 8m 0s...Process 3519 stopped
* thread #5, queue = 'opencl_runtime', stop reason = signal SIGABRT
    frame #0: 0x00007fffd1d99d42 libsystem_kernel.dylib`__pthread_kill + 10
libsystem_kernel.dylib`__pthread_kill:
->  0x7fffd1d99d42 <+10>: jae    0x7fffd1d99d4c            ; <+20>
    0x7fffd1d99d44 <+12>: movq   %rax, %rdi
    0x7fffd1d99d47 <+15>: jmp    0x7fffd1d92caf            ; cerror_nocancel
    0x7fffd1d99d4c <+20>: retq

May 29 '17 20:05 k06a

@bhamon here is the call stack:

Thread 2 Crashed:: Dispatch queue: opencl_runtime
0   libsystem_kernel.dylib        	0x00007fffd1d99d42 __pthread_kill + 10
1   libsystem_pthread.dylib       	0x00007fffd1e87457 pthread_kill + 90
2   libsystem_c.dylib             	0x00007fffd1cff420 abort + 129
3   libGPUSupportMercury.dylib    	0x00007fffca1bffbf gpusGenerateCrashLog + 158
4   com.apple.driver.AppleIntelHD5000GraphicsGLDriver	0x000000010400a09b gpusKillClientExt + 9
5   libGPUSupportMercury.dylib    	0x00007fffca1c0983 gpusQueueSubmitDataBuffers + 168
6   com.apple.driver.AppleIntelHD5000GraphicsGLDriver	0x0000000104055011 IntelCLCommandBuffer::getNew(GLDQueueRec*) + 31
7   com.apple.driver.AppleIntelHD5000GraphicsGLDriver	0x0000000104054f79 intelSubmitCLCommands(GLDQueueRec*, unsigned int) + 65
8   com.apple.driver.AppleIntelHD5000GraphicsGLDriver	0x000000010405b081 CHAL_INTEL::ChalContext::ChalFlush() + 83
9   com.apple.driver.AppleIntelHD5000GraphicsGLDriver	0x00000001040552a3 gldFinishQueue + 43
10  com.apple.opencl              	0x00007fffc08b9b37 0x7fffc08b8000 + 6967
11  com.apple.opencl              	0x00007fffc08ba000 0x7fffc08b8000 + 8192
12  com.apple.opencl              	0x00007fffc08d7cca 0x7fffc08b8000 + 130250
13  com.apple.opencl              	0x00007fffc08db29d 0x7fffc08b8000 + 144029
14  libdispatch.dylib             	0x00007fffd1c358fc _dispatch_client_callout + 8
15  libdispatch.dylib             	0x00007fffd1c36536 _dispatch_barrier_sync_f_invoke + 83
16  com.apple.opencl              	0x00007fffc08db11d 0x7fffc08b8000 + 143645
17  com.apple.opencl              	0x00007fffc08d6da6 0x7fffc08b8000 + 126374
18  com.apple.opencl              	0x00007fffc08cc1df clEnqueueReadBuffer + 813
19  gpuPlotGenerator.exe          	0x0000000102741a3b cryo::gpuPlotGenerator::GenerationDevice::bufferPlots() + 107
20  gpuPlotGenerator.exe          	0x00000001027328b5 cryo::gpuPlotGenerator::writeNonces(std::exception_ptr&, std::__1::mutex&, std::__1::condition_variable&, std::__1::list<std::__1::shared_ptr<cryo::gpuPlotGenerator::GenerationContext>, std::__1::allocator<std::__1::shared_ptr<cryo::gpuPlotGenerator::GenerationContext> > >&, std::__1::shared_ptr<cryo::gpuPlotGenerator::GenerationContext>&) + 293
21  gpuPlotGenerator.exe          	0x0000000102733d7b void* std::__1::__thread_proxy<std::__1::tuple<cryo::gpuPlotGenerator::CommandGenerate::execute(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&)::$_2, std::__1::shared_ptr<cryo::gpuPlotGenerator::GenerationContext> > >(void*) + 139
22  libsystem_pthread.dylib       	0x00007fffd1e8493b _pthread_body + 180
23  libsystem_pthread.dylib       	0x00007fffd1e84887 _pthread_start + 286
24  libsystem_pthread.dylib       	0x00007fffd1e8408d thread_start + 13

Jun 01 '17 14:06 k06a

Maybe this is the reason: https://stackoverflow.com/a/43991502/440168

Jun 01 '17 14:06 k06a

@k06a I'm ok with the change in the Makefile, I will push it soon.

About your problem, can you give me the content of your configuration file? I suppose you are trying to use your primary graphics card as a generator (the one that drives your display). If that's the case, there is a high chance that the "hashesNumber" parameter needs to be lowered (you can try a value of "4" to begin with). The "hashesNumber" parameter reflects the stress put on the graphics card. To prevent the system watchdog from suspending the generation process, I chunk the work into smaller pieces (ideally powers of 2).
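The chunking described above can be pictured with a minimal Python sketch. It is not the project's code. It assumes that one nonce takes 8192 hashes (a 256 KB nonce divided into 32-byte hashes, a property of the Burst plot format that is not stated in this thread) and that "hashesNumber" caps how many of those hashes a single GPU call computes. The names `HASHES_PER_NONCE` and `kernel_chunks` are hypothetical.

```python
# Assumption: 256 KB per nonce / 32 bytes per hash = 8192 hashes per nonce.
HASHES_PER_NONCE = 8192


def kernel_chunks(hashes_number, total_hashes=HASHES_PER_NONCE):
    """Split total_hashes into consecutive (start, count) pieces.

    Each piece stands for one GPU call that computes at most
    hashes_number hashes, so smaller values mean shorter calls.
    """
    if not 1 <= hashes_number <= total_hashes:
        raise ValueError("hashes_number must be between 1 and total_hashes")
    chunks = []
    start = 0
    while start < total_hashes:
        count = min(hashes_number, total_hashes - start)
        chunks.append((start, count))
        start += count
    return chunks


# Compare the default value with the lowered one suggested above.
for hashes_number in (8192, 4):
    calls = len(kernel_chunks(hashes_number))
    print(f"hashesNumber={hashes_number}: {calls} GPU call(s) per batch")
```

Under these assumptions a value of 8192 does all the work in one long call, while a value of 4 splits it into 2048 short calls, which gives the display driver a chance to regain the GPU between calls.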

Jun 02 '17 06:06 bhamon

@bhamon my configuration is mostly the recommended one:

0 1 1536 384 8192

My device is:

Intel Iris Pro 1536 MB
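
For readers unfamiliar with the configuration line, here is a small Python sketch that reads it field by field. The field order (platformId, deviceId, globalWorkSize, localWorkSize, hashesNumber) is an assumption based on the project's README and is not spelled out in this thread. The 256 KB nonce size is likewise an assumption about the Burst plot format, and `DeviceConfig` and `parse_device_line` are hypothetical names.

```python
from typing import NamedTuple

# Assumption: one nonce occupies 256 KB.
NONCE_SIZE_KB = 256


class DeviceConfig(NamedTuple):
    platform_id: int       # assumed: index from "listPlatforms"
    device_id: int         # assumed: index from "listDevices"
    global_work_size: int  # assumed: nonces processed per batch
    local_work_size: int   # assumed: OpenCL work-group size
    hashes_number: int     # hashes computed per GPU call


def parse_device_line(line):
    """Parse one whitespace-separated configuration line."""
    fields = line.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}")
    return DeviceConfig(*(int(field) for field in fields))


cfg = parse_device_line("0 1 1536 384 8192")
device_memory_mb = cfg.global_work_size * NONCE_SIZE_KB // 1024
print(cfg)
print(f"Device memory: {device_memory_mb}MB")
```

With these assumptions, a globalWorkSize of 1536 nonces works out to 384 MB, which matches the "Device memory: 384MB" line in the logs above.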

Jun 02 '17 14:06 k06a

@k06a Have you tried to change the 8192 to 4?

Jun 02 '17 15:06 bhamon

Just tried config:

0 1 1536 384 4

And got:

$ bin/gpuPlotGenerator.exe generate direct 18xxx_0_131072_65536
-------------------------
GPU plot generator v4.0.3
-------------------------
Author:   Cryo
Bitcoin:  138gMBhCrNkbaiTCmUhP9HLU9xwn5QKZgD
Burst:    BURST-YA29-QCEW-QXC3-BKXDL
----
Loading platforms...
Loading devices...
Loading devices configurations...
Initializing generation devices...
    [0] Device: Iris Pro (OpenCL 1.2 )
    [0] Device memory: 384MB
    [0] CPU memory: 384MB
Initializing generation contexts...
    [0] Path: 189xxxxxx_0_131072_131072
    [0] Nonces: 0 to 131071 (32GB 0MB)
    [0] CPU memory: 16GB 0MB
----
Devices number: 1
Plots files number: 1
Total nonces number: 131072
CPU memory: 16GB 384MB
----
Generating nonces...
9.38% (12288/131072 remaining nonces), 11170.91 nonces/minutes, ETA: 10m 38s...
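
The sizes printed in the logs of this thread can be reproduced with a short Python sketch. It assumes the plot file name is laid out as accountId_startNonce_noncesNumber_staggerSize and that each nonce occupies 256 KB. Both are assumptions about the Burst plot format rather than facts stated here, and `parse_plot_name` and `plot_size` are hypothetical helpers.

```python
# Assumption: one nonce occupies 256 KB.
NONCE_SIZE_KB = 256


def parse_plot_name(name):
    """Split a plot file name into its four assumed fields."""
    account, start, count, stagger = name.split("_")
    return account, int(start), int(count), int(stagger)


def plot_size(nonces):
    """Format the plot size the way the generator logs print it."""
    total_mb = nonces * NONCE_SIZE_KB // 1024
    gigabytes, megabytes = divmod(total_mb, 1024)
    return f"{gigabytes}GB {megabytes}MB"


account, start, count, stagger = parse_plot_name("123456_0_50000_5000")
print(f"Nonces: {start} to {start + count - 1} ({plot_size(count)})")
print(plot_size(131072))
```

For 50000 nonces this gives "12GB 212MB" and for 131072 nonces "32GB 0MB", the same figures the generator reports.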

Jun 02 '17 15:06 k06a

Interesting fact: https://github.com/r-majere/mjminer runs at the same speed on my CPU when using the AVX2 instruction set:

Using AVX2 core.
Creating plots for nonces 0 to 131072 (34 GB) using 32768 MB memory and 8 threads
1.03% completed, 11505 nonces/minute, 0:11 left

Jun 02 '17 15:06 k06a

@bhamon why does your app suggest 8192 instead of 4? :)

Jun 02 '17 15:06 k06a

@k06a Good news, it works.

As for performance, OpenCL on your CPU's embedded GPU can't really go any faster than a well-optimized AVX2 implementation. The GPU plot generator mainly targets dedicated GPUs.

About the auto-detection feature, I don't have any easy way to detect whether the GPU is tied to your display or not. So by default I suggest 8192, and I added an entry to the FAQ (in README.md) to help solve this particular problem.

Jun 02 '17 15:06 bhamon

@bhamon are you sure you don't want to merge the include-related changes? Leaving them out makes OSX compilation much harder (I am talking about having to link dirs and files).

Jun 02 '17 17:06 k06a

Has anyone made an OSX build? I have a hackintosh with an R290, and it would be great to use that to plot with.

Jun 12 '17 22:06 gateway

@gateway this branch is fully compatible with macOS: https://github.com/k06a/gpuPlotGenerator/tree/feature/macos

It is partially merged into this repo. You can see all the changes on the third tab at the top of this page: https://github.com/bhamon/gpuPlotGenerator/pull/17/files

Jun 13 '17 05:06 k06a

@k06a I'm looking at a CMake integration, which would make building on different OSs a lot more flexible.

@gateway I don't own a Mac, but I'll borrow one to provide an OSX build with the next release ;)

Jun 13 '17 06:06 bhamon

@bhamon let me know and I can beta test this! 🍻

Jun 13 '17 17:06 gateway

@k06a @gateway The latest release (v4.1.0) embeds the new CMake build system. It also has native support for macOS (i.e. #include <OpenCL/cl.h>). I don't have time to test it for now. I'll provide macOS binaries asap. In the meantime, you can compile it from source.

Jun 17 '17 22:06 bhamon
