gpuPlotGenerator
Add macOS support
$ bin/gpuPlotGenerator.exe listPlatforms
-------------------------
GPU plot generator v4.0.3
-------------------------
Author: Cryo
Bitcoin: 138gMBhCrNkbaiTCmUhP9HLU9xwn5QKZgD
Burst: BURST-YA29-QCEW-QXC3-BKXDL
----
Platforms number: 1
----
Id: 0
Name: Apple
Vendor: Apple
Version: OpenCL 1.2 (Apr 4 2017 19:07:42)
$ bin/gpuPlotGenerator.exe listDevices Apple
-------------------------
GPU plot generator v4.0.3
-------------------------
Author: Cryo
Bitcoin: 138gMBhCrNkbaiTCmUhP9HLU9xwn5QKZgD
Burst: BURST-YA29-QCEW-QXC3-BKXDL
----
Devices number: 2
----
Id: 0
Type: CPU
Name: Intel(R) Core(TM) i7-4770HQ CPU @ 2.20GHz
Vendor: Intel
Version: OpenCL 1.2
Driver version: 1.1
Max clock frequency: 2200MHz
Max compute units: 8
Global memory size: 16GB 0MB 0KB
Max memory allocation size: 4GB 0MB 0KB
Max work group size: 1024
Local memory size: 32KB
Max work-item sizes: (1024, 1, 1)
----
Id: 1
Type: GPU
Name: Iris Pro
Vendor: Intel
Version: OpenCL 1.2
Driver version: 1.2(Apr 22 2017 16:00:44)
Max clock frequency: 1200MHz
Max compute units: 40
Global memory size: 1GB 512MB 0KB
Max memory allocation size: 384MB 0KB
Max work group size: 512
Local memory size: 64KB
Max work-item sizes: (512, 512, 512)
$ bin/gpuPlotGenerator.exe generate direct 123456_0_50000_5000 123456_50000_10000_2000
-------------------------
GPU plot generator v4.0.3
-------------------------
Author: Cryo
Bitcoin: 138gMBhCrNkbaiTCmUhP9HLU9xwn5QKZgD
Burst: BURST-YA29-QCEW-QXC3-BKXDL
----
Loading platforms...
Loading devices...
Loading devices configurations...
Initializing generation devices...
[0] Device: Iris Pro (OpenCL 1.2 )
[0] Device memory: 384MB
[0] CPU memory: 384MB
Initializing generation contexts...
[0] Path: 123456_0_50000_50000
[0] Nonces: 0 to 49999 (12GB 212MB)
[0] CPU memory: 1GB 226MB
[1] Path: 123456_50000_10000_10000
[1] Nonces: 50000 to 59999 (2GB 452MB)
[1] CPU memory: 500MB
----
Devices number: 1
Plots files number: 2
Total nonces number: 60000
CPU memory: 2GB 86MB
----
Generating nonces...
0.00% (0/60000 remaining nonces), 0.00 nonces/minutes, ETA: 5w 6d 16h 0m 0s...Abort trap: 6
How can I determine the problem? I am not really familiar with GPGPU programming.
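As a sanity check on the log above, the reported plot sizes are consistent with 256 KB per nonce. The following is a minimal Python sketch, not the generator's own code. It assumes the plot file name is laid out as account, start nonce, nonces number, stagger size; the thread does not name these fields, so the names and the helper functions are hypothetical.

```python
NONCE_SIZE_KB = 256  # assumption: 256 KB per nonce, consistent with the sizes in the log


def parse_plot_name(name):
    """Split a plot file name into its four fields (field names are hypothetical)."""
    account, start_nonce, nonces_number, stagger_size = name.split("_")
    return {
        "account": account,  # kept as text, since the thread redacts it with x's
        "start_nonce": int(start_nonce),
        "nonces_number": int(nonces_number),
        "stagger_size": int(stagger_size),
    }


def format_size(kb):
    """Render a size given in KB the way the log does, e.g. '12GB 212MB'."""
    gb, mb = divmod(kb // 1024, 1024)
    return f"{gb}GB {mb}MB" if gb else f"{mb}MB"


def plot_size(name):
    """Return the expected on-disk size of a plot file as a log-style string."""
    plot = parse_plot_name(name)
    return format_size(plot["nonces_number"] * NONCE_SIZE_KB)


print(plot_size("123456_0_50000_5000"))      # 12GB 212MB
print(plot_size("123456_50000_10000_2000"))  # 2GB 452MB
```

Both results match the "Nonces:" lines printed for the two plot files, so the memory figures in the log look plausible and the abort is unlikely to be a size miscalculation.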
I reintegrated the include correction from constants.h
(@see 1c3ab770b5b0de8e82196467d64ac6c95d05e09a).
As for MacOS support, it can be achieved with symlinks to the two paths, passed via the "OPENCL_INCLUDE" and "OPENCL_LIB" env vars. Shipping platform-dependent modifications doesn't seem like a good idea to me. I can include an explanation in the README.md if you want.
@bhamon how about these changes?
ifeq ($(shell uname),Darwin)
LD_FLAGS = -fPIC -L$(OPENCL_LIB) -framework OpenCL -m$(PLATFORM)
else
LD_FLAGS = -fPIC -L$(OPENCL_LIB) -lOpenCL -m$(PLATFORM)
endif
And can you help me with this?
Generating nonces...
0.00% (0/60000 remaining nonces), 0.00 nonces/minutes, ETA: 5w 6d 16h 0m 0s...Abort trap: 6
@bhamon can you suggest how to debug this issue, or at least how to get a stack trace?
Just tried:
$ lldb bin/gpuPlotGenerator.exe generate buffer xxxxxxxxxxxxxxx_0_32768_32768
Got:
Process 3519 launched: '/Users/k06a/gpuPlotGenerator/bin/gpuPlotGenerator.exe' (x86_64)
-------------------------
GPU plot generator v4.0.3
-------------------------
Author: Cryo
Bitcoin: 138gMBhCrNkbaiTCmUhP9HLU9xwn5QKZgD
Burst: BURST-YA29-QCEW-QXC3-BKXDL
----
Loading platforms...
Loading devices...
Loading devices configurations...
Initializing generation devices...
[0] Device: Iris Pro (OpenCL 1.2 )
[0] Device memory: 384MB
[0] CPU memory: 384MB
Initializing generation contexts...
[0] Path: xxxxxxxxxxxxxxx_0_32768_32768
[0] Nonces: 0 to 32767 (8GB 0MB)
[0] CPU memory: 8GB 0MB
----
Devices number: 1
Plots files number: 1
Total nonces number: 32768
CPU memory: 8GB 384MB
----
Generating nonces...
0.00% (0/32768 remaining nonces), 0.00 nonces/minutes, ETA: 3w 1d 18h 8m 0s...Process 3519 stopped
* thread #5, queue = 'opencl_runtime', stop reason = signal SIGABRT
frame #0: 0x00007fffd1d99d42 libsystem_kernel.dylib`__pthread_kill + 10
libsystem_kernel.dylib`__pthread_kill:
-> 0x7fffd1d99d42 <+10>: jae 0x7fffd1d99d4c ; <+20>
0x7fffd1d99d44 <+12>: movq %rax, %rdi
0x7fffd1d99d47 <+15>: jmp 0x7fffd1d92caf ; cerror_nocancel
0x7fffd1d99d4c <+20>: retq
@bhamon here is the call stack:
Thread 2 Crashed:: Dispatch queue: opencl_runtime
0 libsystem_kernel.dylib 0x00007fffd1d99d42 __pthread_kill + 10
1 libsystem_pthread.dylib 0x00007fffd1e87457 pthread_kill + 90
2 libsystem_c.dylib 0x00007fffd1cff420 abort + 129
3 libGPUSupportMercury.dylib 0x00007fffca1bffbf gpusGenerateCrashLog + 158
4 com.apple.driver.AppleIntelHD5000GraphicsGLDriver 0x000000010400a09b gpusKillClientExt + 9
5 libGPUSupportMercury.dylib 0x00007fffca1c0983 gpusQueueSubmitDataBuffers + 168
6 com.apple.driver.AppleIntelHD5000GraphicsGLDriver 0x0000000104055011 IntelCLCommandBuffer::getNew(GLDQueueRec*) + 31
7 com.apple.driver.AppleIntelHD5000GraphicsGLDriver 0x0000000104054f79 intelSubmitCLCommands(GLDQueueRec*, unsigned int) + 65
8 com.apple.driver.AppleIntelHD5000GraphicsGLDriver 0x000000010405b081 CHAL_INTEL::ChalContext::ChalFlush() + 83
9 com.apple.driver.AppleIntelHD5000GraphicsGLDriver 0x00000001040552a3 gldFinishQueue + 43
10 com.apple.opencl 0x00007fffc08b9b37 0x7fffc08b8000 + 6967
11 com.apple.opencl 0x00007fffc08ba000 0x7fffc08b8000 + 8192
12 com.apple.opencl 0x00007fffc08d7cca 0x7fffc08b8000 + 130250
13 com.apple.opencl 0x00007fffc08db29d 0x7fffc08b8000 + 144029
14 libdispatch.dylib 0x00007fffd1c358fc _dispatch_client_callout + 8
15 libdispatch.dylib 0x00007fffd1c36536 _dispatch_barrier_sync_f_invoke + 83
16 com.apple.opencl 0x00007fffc08db11d 0x7fffc08b8000 + 143645
17 com.apple.opencl 0x00007fffc08d6da6 0x7fffc08b8000 + 126374
18 com.apple.opencl 0x00007fffc08cc1df clEnqueueReadBuffer + 813
19 gpuPlotGenerator.exe 0x0000000102741a3b cryo::gpuPlotGenerator::GenerationDevice::bufferPlots() + 107
20 gpuPlotGenerator.exe 0x00000001027328b5 cryo::gpuPlotGenerator::writeNonces(std::exception_ptr&, std::__1::mutex&, std::__1::condition_variable&, std::__1::list<std::__1::shared_ptr<cryo::gpuPlotGenerator::GenerationContext>, std::__1::allocator<std::__1::shared_ptr<cryo::gpuPlotGenerator::GenerationContext> > >&, std::__1::shared_ptr<cryo::gpuPlotGenerator::GenerationContext>&) + 293
21 gpuPlotGenerator.exe 0x0000000102733d7b void* std::__1::__thread_proxy<std::__1::tuple<cryo::gpuPlotGenerator::CommandGenerate::execute(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&)::$_2, std::__1::shared_ptr<cryo::gpuPlotGenerator::GenerationContext> > >(void*) + 139
22 libsystem_pthread.dylib 0x00007fffd1e8493b _pthread_body + 180
23 libsystem_pthread.dylib 0x00007fffd1e84887 _pthread_start + 286
24 libsystem_pthread.dylib 0x00007fffd1e8408d thread_start + 13
Maybe this is the reason: https://stackoverflow.com/a/43991502/440168
@k06a I'm ok with the change in the Makefile, I will push it soon.
About your problem, can you give me the content of your configuration file? I suppose you are trying to use your primary graphics card as a generator (the one that drives your display). If that's the case, there is a high chance that the "hashesNumber" parameter needs to be lowered (you can try a value of "4" to begin with). The "hashesNumber" parameter reflects the stress put on the graphics card. To prevent the system watchdog from suspending the generation process, I chunk the work into smaller pieces (ideally powers of 2).
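The chunking idea described above can be illustrated roughly as follows. This is a Python sketch of the principle, not the generator's actual code. It assumes a nonce consists of 8192 hashes (256 KB at 32 bytes per hash) and that each GPU call computes at most "hashesNumber" of them; the function name is hypothetical.

```python
HASHES_PER_NONCE = 8192  # assumption: 8192 hashes of 32 bytes each, i.e. 256 KB per nonce


def hash_chunks(hashes_number, total=HASHES_PER_NONCE):
    """Yield (offset, count) pairs that split `total` hashes into GPU calls
    of at most `hashes_number` hashes each (hypothetical helper)."""
    if hashes_number < 1:
        raise ValueError("hashes_number must be at least 1")
    offset = 0
    while offset < total:
        count = min(hashes_number, total - offset)
        yield offset, count
        offset += count


# One long GPU call per batch with 8192, versus 2048 short calls with 4.
print(len(list(hash_chunks(8192))))  # 1
print(len(list(hash_chunks(4))))     # 2048
```

Shorter calls give the display driver more opportunities to regain the GPU, which is presumably why a low value avoids the watchdog abort on a GPU that also drives the screen. A power of 2 divides 8192 evenly, so no undersized final chunk is left over.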
@bhamon my configuration is mostly the recommended one:
0 1 1536 384 8192
My device is:
Intel Iris Pro 1536 MB
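For readers unfamiliar with the configuration line, here is a hedged Python sketch of how it might be read. Only the last field is identified in this thread (as "hashesNumber"); the other field names below (platform id, device id, global work size, local work size) are assumptions, as are the helper names. The divisibility check reflects the OpenCL 1.x rule that the global work size must be a multiple of the local work size; whether the generator passes these values straight to OpenCL is also an assumption.

```python
from collections import namedtuple

NONCE_SIZE_KB = 256  # assumption: 256 KB per nonce

# Field names are assumptions, except that the last field is "hashesNumber".
DeviceConfig = namedtuple(
    "DeviceConfig",
    "platform_id device_id global_work_size local_work_size hashes_number",
)


def parse_device_config(line):
    """Parse one whitespace-separated configuration line into a DeviceConfig."""
    fields = [int(field) for field in line.split()]
    if len(fields) != 5:
        raise ValueError("expected 5 fields, got %d" % len(fields))
    config = DeviceConfig(*fields)
    if config.hashes_number < 1:
        raise ValueError("hashes_number must be at least 1")
    if config.global_work_size % config.local_work_size != 0:
        raise ValueError("global work size must be a multiple of local work size")
    return config


def device_memory_mb(config):
    """Memory needed to hold one batch of nonces on the device, in MB."""
    return config.global_work_size * NONCE_SIZE_KB // 1024


config = parse_device_config("0 1 1536 384 8192")
print(device_memory_mb(config))  # 384
```

Under these assumptions, 1536 nonces at 256 KB each come to 384 MB, which matches the "Device memory: 384MB" line in the logs, and a local work size of 384 stays below the "Max work group size: 512" reported for the Iris Pro.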
@k06a Have you tried to change the 8192 to 4?
Just tried config:
0 1 1536 384 4
And got:
$ bin/gpuPlotGenerator.exe generate direct 18xxx_0_131072_65536
-------------------------
GPU plot generator v4.0.3
-------------------------
Author: Cryo
Bitcoin: 138gMBhCrNkbaiTCmUhP9HLU9xwn5QKZgD
Burst: BURST-YA29-QCEW-QXC3-BKXDL
----
Loading platforms...
Loading devices...
Loading devices configurations...
Initializing generation devices...
[0] Device: Iris Pro (OpenCL 1.2 )
[0] Device memory: 384MB
[0] CPU memory: 384MB
Initializing generation contexts...
[0] Path: 189xxxxxx_0_131072_131072
[0] Nonces: 0 to 131071 (32GB 0MB)
[0] CPU memory: 16GB 0MB
----
Devices number: 1
Plots files number: 1
Total nonces number: 131072
CPU memory: 16GB 384MB
----
Generating nonces...
9.38% (12288/131072 remaining nonces), 11170.91 nonces/minutes, ETA: 10m 38s...
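The progress line above is internally consistent, which can be checked with a few lines of Python. This sketch assumes that 12288 is the number of nonces already generated (12288 / 131072 is 9.375%, matching the printed 9.38%) despite the "remaining" label; the function name is hypothetical.

```python
def eta(total, done, nonces_per_minute):
    """Estimate the time left as a log-style 'Xm Ys' string."""
    remaining = total - done
    seconds = round(remaining / nonces_per_minute * 60)
    minutes, seconds = divmod(seconds, 60)
    return f"{minutes}m {seconds}s"


print(eta(131072, 12288, 11170.91))  # 10m 38s
```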
Interesting fact: https://github.com/r-majere/mjminer works for me at the same speed on the CPU when using the AVX2 instruction set:
Using AVX2 core.
Creating plots for nonces 0 to 131072 (34 GB) using 32768 MB memory and 8 threads
1.03% completed, 11505 nonces/minute, 0:11 left
@bhamon why does your app suggest using 8192 instead of 4? :)
@k06a Good news, it works.
As for performance, OpenCL on your CPU's embedded GPU can't really go any faster than a well-optimized AVX2 implementation. The GPU plot generator mainly targets dedicated GPUs.
As for the auto-detection feature, I don't have any easy means of detecting whether the GPU is tied to your display. So by default I suggest 8192, and I added an entry to the FAQ (in README.md) to help solve this particular problem.
@bhamon are you sure you don't want to merge the include-related changes? Leaving them out will make OSX compilation much harder (I am talking about having to hard-link dirs and files).
Has anyone made an OSX build? I have a Hackintosh with an R290 and it would be great to use that to plot with.
@gateway this branch is fully compatible with macOS: https://github.com/k06a/gpuPlotGenerator/tree/feature/macos
It is partially merged into this repo. You can see all the changes on the third tab at the top of this page: https://github.com/bhamon/gpuPlotGenerator/pull/17/files
@k06a I'm looking at a CMake integration. That would make it a lot more flexible to build on different OSs.
@gateway I don't own a Mac, but I'll borrow one to provide an OSX build for the next release ;)
@bhamon let me know and I can beta test this! 🍻
@k06a @gateway The latest release (v4.1.0) embeds the new CMake build system. It also has native support for MacOS (i.e. #include <OpenCL/cl.h>). I don't have time to test it for now. I'll provide MacOS binaries asap. In the meantime, you can compile it from source.