OS X port

Open justvanbloom opened this issue 8 years ago • 29 comments

Hi, I'm doing the OS X port. I changed the compiler structs and the OpenCL headers, but I get a bunch of errors. Any clue?

justvanbloom avatar Nov 05 '16 15:11 justvanbloom

I could help you if you showed me the errors you see.

mbevand avatar Nov 05 '16 19:11 mbevand

Well, the build succeeds now, but sa-solver fails.

./sa-solver -v
Solving default all-zero 140-byte header
Found 1 OpenCL platform(s)
Using GPU device ID 0
Building program
OpenCL build failed (-11). Build log follows:
input.cl:80:6: warning: no previous prototype for function 'ht_store'
uint ht_store(uint round, __global char *ht, uint i,
^
input.cl:415:6: warning: no previous prototype for function 'xor_and_store'
uint xor_and_store(uint round, __global char *ht_dst, uint row,
^
input.cl:496:6: warning: no previous prototype for function 'equihash_round'
void equihash_round(uint round, __global char *ht_src, __global char *ht_dst,
^
input.cl:500:11: warning: unused variable 'tlid'
uint tlid = get_local_id(0);
^
input.cl:592:63: error: global variables must have a constant address space qualifier
__kernel __attribute__((reqd_work_group_size(64, 1, 1))) void kernel_round ## 1(__global char *ht_src, __global char *ht_dst, __global uint *debug) { equihash_round(1, ht_src, ht_dst, debug); }
^
input.cl:592:75: error: expected ';' after top level declarator
__kernel __attribute__((reqd_work_group_size(64, 1, 1))) void kernel_round ## 1(__global char *ht_src, __global char *ht_dst, __global uint *debug) { equihash_round(1, ht_src, ht_dst, debug); }
^
;

justvanbloom avatar Nov 07 '16 09:11 justvanbloom

see my changes on https://github.com/justvanbloom/silentarmy/tree/mac

justvanbloom avatar Nov 07 '16 10:11 justvanbloom

Weird. Your system's OpenCL compiler requires defining prototypes? Try adding them and see what happens...

mbevand avatar Nov 07 '16 23:11 mbevand

OK, I will try. But I think there is another problem as well: "global variables must have a constant address space qualifier". When I add __constant, nothing changes. Do you have a hint?

justvanbloom avatar Nov 08 '16 01:11 justvanbloom

Sorry, I don't know. Maybe your OpenCL compiler is confused by the syntax? Try removing the whole "__attribute__((reqd_work_group_size(64, 1, 1)))"

mbevand avatar Nov 08 '16 16:11 mbevand

OK, great steps forward now.

make runs with only a few warnings now.

I have included ocldump in my fork; perhaps this helps.

echo 'const char *ocl_code = R"_mrb_(' >_kernel.h
cpp input.cl >>_kernel.h
echo ')_mrb_";' >>_kernel.h
gcc-6 -O2 -std=gnu99 -pedantic -Wextra -Wall -ggdb -Wno-deprecated-declarations -Wno-overlength-strings -I"/System/Library/Frameworks/OpenCL.framework/Headers/" -c -o main.o main.c
main.c: In function 'store_encoded_sol':
main.c:573:34: warning: left shift of negative value [-Wshift-negative-value]
uint32_t mask = ~(-1 << (8 - x_bits_used));
^~
gcc-6 -O2 -std=gnu99 -pedantic -Wextra -Wall -ggdb -Wno-deprecated-declarations -Wno-overlength-strings -I"/System/Library/Frameworks/OpenCL.framework/Headers/" -c -o blake.o blake.c
blake.c:6:25: warning: 'blake2b_block_len' defined but not used [-Wunused-const-variable=]
static const uint32_t blake2b_block_len = 128;
^~~~~~~~~~~~~~~~~
gcc-6 -O2 -std=gnu99 -pedantic -Wextra -Wall -ggdb -Wno-deprecated-declarations -Wno-overlength-strings -I"/System/Library/Frameworks/OpenCL.framework/Headers/" -c -o sha256.o sha256.c
gcc-6 -o sa-solver main.o blake.o sha256.o -rdynamic -L"/System/Library/Frameworks/OpenCL.framework/Versions/Current/Libraries" -framework OpenCL
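A side note on the -Wshift-negative-value warning in the log above: the expression `~(-1 << (8 - x_bits_used))` builds a mask of the low `8 - x_bits_used` bits by shifting a negative value, which is what gcc complains about. A minimal sketch of an unsigned equivalent is below. The function name `low_bits_mask` is hypothetical and not part of main.c, and the range 0..8 for `x_bits_used` is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low (8 - x_bits_used) bits set, computed on an unsigned
 * operand so that no negative value is shifted.
 * Assumption: 0 <= x_bits_used <= 8. */
uint32_t low_bits_mask(unsigned x_bits_used)
{
    assert(x_bits_used <= 8u);
    return (UINT32_C(1) << (8u - x_bits_used)) - 1u;
}
```

For example, `low_bits_mask(5)` leaves three low bits set. Whether the upstream code should be changed this way is a separate question; the warning does not stop the build.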

So now when I run

./sa-solver --nonces 1000

I get

Solving default all-zero 140-byte header
Building program
clCreateKernel (-46)
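For readers decoding the numeric statuses in this thread: in the Khronos `cl.h` header, -11 is `CL_BUILD_PROGRAM_FAILURE` and -46 is `CL_INVALID_KERNEL_NAME`. A small lookup sketch is below. `cl_status_name` and the `SKETCH_` constants are hypothetical names introduced for illustration; they are not part of sa-solver or of the OpenCL API.

```c
/* Status values copied from the Khronos cl.h header, spelled out here so
 * the sketch compiles without the OpenCL framework installed. */
enum {
    SKETCH_CL_SUCCESS               = 0,
    SKETCH_CL_BUILD_PROGRAM_FAILURE = -11,
    SKETCH_CL_INVALID_KERNEL_NAME   = -46
};

/* Map the statuses seen in this thread to their symbolic names. */
const char *cl_status_name(int status)
{
    switch (status) {
    case SKETCH_CL_SUCCESS:               return "CL_SUCCESS";
    case SKETCH_CL_BUILD_PROGRAM_FAILURE: return "CL_BUILD_PROGRAM_FAILURE";
    case SKETCH_CL_INVALID_KERNEL_NAME:   return "CL_INVALID_KERNEL_NAME";
    default:                              return "unknown OpenCL status";
    }
}
```

`clCreateKernel` reports `CL_INVALID_KERNEL_NAME` when the built program has no kernel with the requested name, so the -46 here may be related to how the `kernel_round` definitions were changed in the fork.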

justvanbloom avatar Nov 09 '16 11:11 justvanbloom

Hi!

I got it working on OS X (Hackintosh). At least the solver runs, but I am not sure whether the results are correct, and I am still seeing the "pipe closed by peer" issue. To get rid of the constant address space error, I removed the #define for KERNEL_ROUND(N) completely and just wrote the 7 method declarations by hand instead. Justvanbloom, you effectively removed the equihash_round calls, so that might be the issue.
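The workaround described here can be sketched in plain host C. The build log earlier in the thread shows `kernel_round ## 1(` reaching the OpenCL compiler with the `##` still in place, so the macro was not expanded as intended. In the sketch below, `equihash_round` is a stand-in stub (the real one lives in input.cl and takes hash-table arguments), and `get_last_round` is a hypothetical helper used only to observe the calls.

```c
#include <stdint.h>

/* Stand-in for equihash_round() from input.cl: it only records which
 * round was requested, so the wrappers can be checked on the host. */
static uint32_t last_round = 0;
static void equihash_round(uint32_t round) { last_round = round; }

/* Macro form: `##` pastes N onto the identifier, so KERNEL_ROUND(1)
 * defines a function named kernel_round1. */
#define KERNEL_ROUND(N) \
    void kernel_round##N(void) { equihash_round(N); }

KERNEL_ROUND(1)

/* Hand-written form: the same kind of wrapper with no token pasting,
 * which is what the workaround amounts to. */
void kernel_round2(void) { equihash_round(2); }

/* Hypothetical helper to inspect the recorded round. */
uint32_t get_last_round(void) { return last_round; }
```

With a standard-conforming preprocessor both forms produce equivalent wrappers; writing them by hand simply avoids depending on how the preprocessor treats `##`.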

./sa-solver --list
Devices on platform "Apple":
ID 0: Intel(R) Core(TM)2 Quad CPU Q9450 @ 2.66GHz
ID 1: GeForce GTX 970

./sa-solver --nonces 10 --use 1

Solving default all-zero 140-byte header
Building program
Hash tables will use 1208.0 MB
Running...
Nonce 0000000000000000000000000000000000000000000000000000000000000000: 2 sols
265 (probably invalid) solutions were dropped!
Nonce 0100000000000000000000000000000000000000000000000000000000000000: 3 sols
Nonce 0200000000000000000000000000000000000000000000000000000000000000: 1 sol
Nonce 0300000000000000000000000000000000000000000000000000000000000000: 0 sols
Nonce 0400000000000000000000000000000000000000000000000000000000000000: 3 sols
Nonce 0500000000000000000000000000000000000000000000000000000000000000: 1 sol
Nonce 0600000000000000000000000000000000000000000000000000000000000000: 4 sols
Nonce 0700000000000000000000000000000000000000000000000000000000000000: 2 sols
Nonce 0800000000000000000000000000000000000000000000000000000000000000: 2 sols
Nonce 0900000000000000000000000000000000000000000000000000000000000000: 2 sols
Total 20 solutions in 1060.5 ms (18.9 Sol/s)

mposch avatar Nov 09 '16 23:11 mposch

Woohoo! Yes, let us work on this. Can you make a pull request on my repo (mac branch)?

justvanbloom avatar Nov 10 '16 00:11 justvanbloom

Or post the funcs you mean here...

justvanbloom avatar Nov 10 '16 08:11 justvanbloom

Well it seems that the CPU works, but the GPU (tested on a GTX970 and ATI Radeon 6750) does not produce correct solutions at the moment.

mposch avatar Nov 10 '16 10:11 mposch

At first I thought it was OK, but then I also saw "Share above target" failures. When I run it with --instances=1 and --debug -v, though, everything seems OK.

justvanbloom avatar Nov 10 '16 11:11 justvanbloom

When testing, should the CPU and GPU devices produce the same result?
GPU:
hackpro:silentarmy matthiasposch$ ./sa-solver --nonce 1 -v -v --use 1 2>&1 | grep Soln:
Soln: 0x0: 635 33ddc....
CPU:
hackpro:silentarmy matthiasposch$ ./sa-solver --nonce 1 -v -v --use 0 2>&1 | grep Soln:
Soln: 0x0: 35c 12d31f a2216....

So at the moment it seems the CPU is okay and the GPU is wrong.

mposch avatar Nov 10 '16 13:11 mposch

jvb-MacBook-Pro:silentarmy oliverfolz$ ./sa-solver --nonce 1 -v -v --use 1 2>&1 | grep Soln:
Soln: 0x0: 35c 12d31f a2216 cbc99 ... 16fe42 1ae61b

jvb-MacBook-Pro:silentarmy oliverfolz$ ./sa-solver --nonce 1 -v -v --use 0 2>&1 | grep Soln:
Soln: 0x0: 35c 12d31f a2216 cbc99 ... 16fe42 1ae61b

Working?

justvanbloom avatar Nov 10 '16 13:11 justvanbloom

From one tester: So, now no interesting info:

Devices on platform "Apple":
ID 0: Intel(R) Core(TM) i5-3450 CPU @ 3.10GHz - Max rate - 7.78 H/s (or 15.5 sol/s)
ID 1: GeForce GTX 970 - Max rate - 1.13 H/s (or 5.2 sol/s)

Total 4.6 sol/s [dev1 5.2] 9 shares
Total 4.4 sol/s [dev1 4.9] 9 shares
Total 4.4 sol/s [dev1 4.5] 9 shares
Total 4.4 sol/s [dev1 4.1] 9 shares
Total 4.3 sol/s [dev1 4.3] 10 shares

It is very strange; a GTX 970 should provide something like 50 sol/s.

justvanbloom avatar Nov 10 '16 14:11 justvanbloom

No, using the GPU I get a different solution:

hackpro:silentarmy matthiasposch$ ./sa-solver --nonce 1 -v -v --use 1 2>&1 | grep Soln:
Soln: 0x0: 635 33ddc ...... 1411ec

./sa-solver --nonce 20 --use 1 (GTX 970)
Total 44 solutions in 2133.6 ms (20.6 Sol/s)
./sa-solver --nonce 20 --use 0
Total 42 solutions in 36185.6 ms (1.2 Sol/s)

So if 2 instances were running on the CPU, 40 Sol/s could be possible. But my version still does not calculate correctly. I'll have to look into it. It seems that your version provides correct results on the GPU?
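The Sol/s figures in the totals above are just the solution count divided by the elapsed time in seconds. A minimal sketch of that arithmetic is below; `sols_per_second` is a hypothetical helper name, not a function in sa-solver.

```c
#include <assert.h>

/* Solutions per second from a solution count and elapsed milliseconds. */
double sols_per_second(unsigned solutions, double elapsed_ms)
{
    assert(elapsed_ms > 0.0);
    return (double)solutions * 1000.0 / elapsed_ms;
}
```

With the numbers reported here, 44 solutions in 2133.6 ms comes to about 20.6 Sol/s and 42 solutions in 36185.6 ms comes to about 1.2 Sol/s, which matches the printed totals.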

mposch avatar Nov 10 '16 14:11 mposch

Yep. Mine is fine. I also made an implementation for xn-sub. I have uploaded my binaries to the beta1mac branch of my fork.

justvanbloom avatar Nov 10 '16 15:11 justvanbloom

Just clone this repo and run some tests, please.

https://github.com/justvanbloom/silentarmy/tree/beta1mac?files=1

justvanbloom avatar Nov 10 '16 15:11 justvanbloom

I see you guys are making progress, nice :) As long as you produce the same solutions as testing/sols-100 then you know your port is valid.
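One way to apply this check to the `Soln:` lines posted earlier in the thread is to compare them field by field while ignoring spacing differences. The sketch below assumes each line contains a `Soln:` marker followed by whitespace-separated fields, as in the grep output above; `same_solution` and `soln_payload` are hypothetical helper names, and the format of the testing/sols-100 file itself is not shown in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return the text after the "Soln:" marker, or NULL if it is absent. */
static const char *soln_payload(const char *line)
{
    const char *p = strstr(line, "Soln:");
    return p ? p + strlen("Soln:") : NULL;
}

/* Compare two "Soln:" lines field by field, ignoring how much
 * whitespace separates the fields. */
bool same_solution(const char *a, const char *b)
{
    static const char ws[] = " \t\r\n";
    assert(a != NULL && b != NULL);
    const char *pa = soln_payload(a);
    const char *pb = soln_payload(b);
    if (pa == NULL || pb == NULL)
        return false;
    for (;;) {
        pa += strspn(pa, ws);            /* skip leading whitespace */
        pb += strspn(pb, ws);
        size_t la = strcspn(pa, ws);     /* length of the next field */
        size_t lb = strcspn(pb, ws);
        if (la != lb || strncmp(pa, pb, la) != 0)
            return false;                /* fields differ */
        if (la == 0)
            return true;                 /* both lines exhausted */
        pa += la;
        pb += lb;
    }
}
```

Two lines that begin `0x0: 35c 12d31f a2216` compare equal, while a line that begins `0x0: 635 33ddc` does not match them.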

mbevand avatar Nov 10 '16 20:11 mbevand

Thanks! Yes. Verified and working. Mined over 1000 sol/s on different pools so far. All shares accepted. But I won't stop here. I have the feeling there is more room.

justvanbloom avatar Nov 10 '16 21:11 justvanbloom

No, I am still having the same issue: my GPU is not delivering correct results.

mposch avatar Nov 10 '16 21:11 mposch

Maybe I can port the OpenCL kernel to Metal.

justvanbloom avatar Nov 10 '16 22:11 justvanbloom

I am posting my -v -v -v -v output; maybe one of you can spot why the results are different (the numbers are a mystery to me for now :). Should this output be exactly the same on GPU and CPU?

CPU:
Solving nonce 0000000000000000000000000000000000000000000000000000000000000000
Round 0
row 0xab6b7:
00 06000000 5c030000 | aec71a52 3c273545 244a056d 4c34495a 16b68910 c61d1514
01 ________ d3160400 | ae8710c1 b413ee00 7b026443 39948ed7 d17f099b d8368900
02 ________ b8ba0600 | a2fb67db 8daed24d de6a7633 00ea0a98 635e1932 f38a472a
03 ________ f4530700 | a44c80c4 011d7b69 6940fc5c cf394783 3c1c93c1 47274a8a
04 ________ 54a90900 | adb8f7eb 63c99657 ddd5b038 333a034e 533db3a3 e21aea8d
*05 ________ 1fd31200 | a1249924 afd702c6 71bf5485 ca719a5e 590c5274 ab4c6600
Dropped: 0 (coll) 0 (stor)
Round 1

GPU:
Round 0
row 0xa9ea1:
00 04000000 0d3b0100 | a2c2ea03 74b5505d dd3db55c 7ba995ac 0ecb1580 4ab04600
01 ________ 9fd91100 | a6b2b69e adad167d e3d18a86 098d4a2f e8e7ec31 ed7afe00
02 ________ 92181600 | a11fe04e b3cd5611 6c5dc0cd fccb8120 859ef672 9c370e49
*03 ________ 1fd31200 | a174df2b 42e96062 13a82593 bc6ca4ad bcf2b3ef 9b981300
row 0xb63d1:

mposch avatar Nov 10 '16 22:11 mposch

The CPU in the Mac version (latest v5 silentarmy) is broken. The GPU works fine. E.g.:
AMD Radeon 970
Total 53.2 sol/s [dev1 54.1] 12 shares
Total 52.7 sol/s [dev1 52.8] 12 shares
AMD Radeon R9 390
Up to 125 sol/s :D
Thanks for being part of this. I can make a pull request with my latest source, @mbevand.

justvanbloom avatar Nov 12 '16 11:11 justvanbloom

What type of system do you use to mine? A real Mac? When I find time I will test the kernel under Linux.

mposch avatar Nov 12 '16 15:11 mposch

MacBook Pro, iMac and Hackintosh.

justvanbloom avatar Nov 12 '16 22:11 justvanbloom

Using Linux, everything works fine and I get 45 Sol/s on my GTX 970.

It seems that cpp (clang) does not like the ## concatenation within a #define. If you make sure that cpp-6 (the brew version) of the preprocessor is used during build, you could revert to the original input.cl kernel.

see https://github.com/mposch/silentarmy/blob/master/Makefile

mposch avatar Nov 13 '16 15:11 mposch

I see @justvanbloom said he might be able to port the OpenCL kernel to Metal. I would like to know: is it feasible? Thanks.

CoderYgs avatar Nov 22 '16 06:11 CoderYgs

@justvanbloom How did you solve the "make" error on Mac? And how do I find the path for LIBOPENCL? My whole Mac has no libOpenCL.so.

taomanwai avatar Jan 08 '18 03:01 taomanwai