Benchmark GPU

Open kpot87 opened this issue 4 years ago • 17 comments

Hi all! I propose that everyone post their results here, along with their GPU and parameters!

kpot87 avatar Apr 23 '20 17:04 kpot87

From Readme example

D:\Kangaroo>Kangaroo.exe -t 0 -gpu in.txt
Kangaroo v1.2
Start:49DCCFD96DC5DF56487436F5A1B18C4F5D34F65DDB48CB5E0000000000000000
Stop :49DCCFD96DC5DF56487436F5A1B18C4F5D34F65DDB48CB5EFFFFFFFFFFFFFFFF
Keys :2
Number of CPU thread: 0
Range width: 2^64
Number of random walk: 2^20.52 (Max DP=9)
DP size: 9 [0xFF80000000000000]
GPU: GPU #0 GeForce RTX 2080 (46x64 cores) Grid(92x128) (147.0 MB used)
SolveKeyGPU Thread GPU#0: creating kangaroos...
SolveKeyGPU Thread GPU#0: 2^20.52 kangaroos in 2865.3ms
[200.60 MKey/s][GPU 200.60 MKey/s][Count 2^34.41][Dead 9][01:38][3402.4MB]
Key# 0 Pub: 0x0259A3BFDAD718C9D3FAC7C187F1139F0815AC5D923910D516E186AFDA28B221DC
Priv: 0x49DCCFD96DC5DF56487436F5A1B18C4F5D34F65DDB48CB5EBB3EF3883C1866D4
[372.63 MKey/s][GPU 372.63 MKey/s][Count 2^32.41][Dead 1][20s][852.8MB]
Key# 1 Pub: 0x0335BB25364370D4DD14A9FC2B406D398C4B53C85BE58FCC7297BD34004602EBEC
Priv: 0x49DCCFD96DC5DF56487436F5A1B18C4F5D34F65DDB48CB5E00034C1010000123

Done: Total time 02:24
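
The header values in a log like this can be cross-checked by hand. The sketch below rests on two assumptions inferred from the numbers posted in this thread, not from the Kangaroo source: each GPU thread runs 128 kangaroos, and the DP mask sets the top `dp_bits` bits of a 64-bit word. The function names are made up for illustration.

```python
import math

# Assumption inferred from the logs in this thread: 128 kangaroos per GPU thread.
KANGAROOS_PER_THREAD = 128


def log2_kangaroos(grid_x: int, grid_y: int) -> float:
    """log2 of the number of random walks for a Grid(grid_x x grid_y) launch."""
    return math.log2(grid_x * grid_y * KANGAROOS_PER_THREAD)


def dp_mask(dp_bits: int) -> int:
    """64-bit mask with the top dp_bits bits set (assumed DP mask layout)."""
    if not 0 <= dp_bits <= 64:
        raise ValueError("dp_bits must be between 0 and 64")
    return ((1 << dp_bits) - 1) << (64 - dp_bits)


# Values from the RTX 2080 log: Grid(92x128), DP size 9.
print(f"Number of random walk: 2^{log2_kangaroos(92, 128):.2f}")
print(f"DP size: 9 [0x{dp_mask(9):016X}]")
```

With Grid(92x128) and DP size 9 this matches the "Number of random walk" and "DP size" lines of the log, which is why the two assumptions look plausible.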

kpot87 avatar Apr 23 '20 19:04 kpot87

Kangaroo v1.2
Start:49DCCFD96DC5DF56487436F5A1B18C4F5D34F65DDB48CB5E0000000000000000
Stop:49DCCFD96DC5DF56487436F5A1B18C4F5D34F65DDB48CB5EFFFFFFFFFFFFFFFF
Keys :35
Number of CPU thread: 0
Range width: 2^105
Number of random walk: 2^20.17 (Max DP=30)
DP size: 30 [0xFFFFFFFC00000000]
GPU: GPU #0 GeForce GTX 1060 3GB (9x128 cores) Grid(72x128) (117.0 MB used)
SolveKeyGPU Thread GPU#0: creating kangaroos...
SolveKeyGPU Thread GPU#0: 2^20.17 kangaroos in 7085.6ms
[128.92 MKey/s][GPU 128.92 MKey/s][Count 2^40.59][Dead 0][03:56:29][1.3MB]

I set -g to 72,128 because I wanted a grid closer to a square (for whatever reason, that seems right in my mind). Without manual setting, it defaults to 18x256. Speed still holds between 128 and 131 MKey/s.

No solves as yet because of the range size, but that's fine.

djarumlights avatar Apr 23 '20 21:04 djarumlights

Has anybody tried a Tesla V100?

PS. kpot87, for a benchmark it is better to run not just 2 tests but at least 16, and over a wider range than a 64-bit key. Your 2 results, 1:38 and 20 s, are very different. The same goes for speed: 200.6 MKey/s versus 372.63 MKey/s. So which is the benchmark speed, 200 or 372 (almost a 2x difference)? And what is the total time for a 64-bit key, 1:38 or 20 s (almost a 5x difference)? You created a good post but started with a poor example, so nobody followed you...
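
The spread in solve time is easier to judge against an expected operation count. The sketch below assumes the textbook estimate of about 2*sqrt(N) group operations for Pollard's kangaroo over a range of N keys. The exact constant for this implementation, and the overhead of distinguished points, are not stated in this thread, so treat the numbers as rough. The function names are illustrative.

```python
def expected_log2_ops(range_bits: float) -> float:
    """log2 of ~2*sqrt(N) for a range of N = 2^range_bits keys (textbook estimate)."""
    return 1 + range_bits / 2


def expected_seconds(range_bits: float, mkeys_per_s: float) -> float:
    """Expected solve time at a constant speed given in MKey/s."""
    return 2 ** expected_log2_ops(range_bits) / (mkeys_per_s * 1e6)


# (Count exponent, speed in MKey/s) for the two 64-bit solves quoted above.
for count_log2, speed in [(34.41, 200.60), (32.41, 372.63)]:
    ratio = 2 ** (count_log2 - expected_log2_ops(64))
    print(f"Count 2^{count_log2}: {ratio:.2f}x the expected 2^33 operations, "
          f"expected time at {speed} MKey/s: {expected_seconds(64, speed):.1f} s")
```

Under that assumption both solves land on either side of the expectation, which is the kind of run-to-run variance that averaging over many keys is meant to smooth out.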

MrFreeDragon avatar Apr 27 '20 13:04 MrFreeDragon

Hi, I understand, but I took the example from the readme. For the results to be comparable, everyone must run the same test. Please propose one and I will redo my results. For now I get 850 MK/s at difficulty 2^96. I would like to see GTX 1080 Ti results to compare with the RTX 2080.

kpot87 avatar Apr 27 '20 15:04 kpot87

The speed for the 2080 Ti was 1018 MKey/s, but with v1.3 it increased to 1331 MKey/s:

Kangaroo v1.3
Start:49DCCFD96DC5DF56487436F5A1B18C4F5D34F65DDB4800000000000000000000
Stop :49DCCFD96DC5DF56487436F5A1B18C4F5D34F65DDB48FFFFFFFFFFFFFFFFFFFF
Keys :16
Number of CPU thread: 0
Range width: 2^80
Number of random walk: 2^22.09 (Max DP=15)
DP size: 15 [0xFFFE000000000000]
GPU: GPU #0 GeForce RTX 2080 Ti (68x64 cores) Grid(136x256) (417.0 MB used)
SolveKeyGPU Thread GPU#0: creating kangaroos...
SolveKeyGPU Thread GPU#0: 2^22.09 kangaroos in 25051.2ms
[858.84 MKey/s][GPU 858.84 MKey/s][Count 2^42.15][Dead 8][01:15:13][11359.4MB]
Key# 0 Pub: 0x0259A3BFDAD718C9D3FAC7C187F1139F0815AC5D923910D516E186AFDA28B221DC
Priv: 0x49DCCFD96DC5DF56487436F5A1B18C4F5D34F65DDB48CB5EBB3EF3883C1866D4
[1331.49 MKey/s][GPU 1331.49 MKey/s][Count 2^38.14][Dead 0][04:43][709.3MB]

MrFreeDragon avatar Apr 27 '20 16:04 MrFreeDragon

Speed for 1080ti is 495MKey/sec (almost 500MKey/sec):

Kangaroo v1.3
Start:49DCCFD96DC5DF56487436F5A1B18C4F5D34F65DDB4800000000000000000000
Stop :49DCCFD96DC5DF56487436F5A1B18C4F5D34F65DDB48FFFFFFFFFFFFFFFFFFFF
Keys :16
Number of CPU thread: 0
Range width: 2^80
Number of random walk: 2^20.81 (Max DP=17)
DP size: 17 [0xffff800000000000]
GPU: GPU #0 GeForce GTX 1080 Ti (28x128 cores) Grid(56x256) (177.0 MB used)
SolveKeyGPU Thread GPU#0: creating kangaroos...
SolveKeyGPU Thread GPU#0: 2^20.81 kangaroos in 10229.1ms
[459.82 MKey/s][GPU 459.82 MKey/s][Count 2^42.05][Dead 2][03:00:56][2649.2MB]
Key# 0 Pub: 0x0259A3BFDAD718C9D3FAC7C187F1139F0815AC5D923910D516E186AFDA28B221DC
Priv: 0x49DCCFD96DC5DF56487436F5A1B18C4F5D34F65DDB48CB5EBB3EF3883C1866D4
[491.45 MKey/s][GPU 491.45 MKey/s][Count 2^41.13][Dead 1][01:36:21][1408.3MB]
Key# 1 Pub: 0x02A50FBBB20757CC0E9C41C49DD9DF261646EE7936272F3F68C740C9DA50D42BCD
Priv: 0x49DCCFD96DC5DF56487436F5A1B18C4F5D34F65DDB48CB5EB5ABC43BEBAD3207
[464.88 MKey/s][GPU 464.88 MKey/s][Count 2^40.00][Dead 1][43:48][644.8MB]
Key# 2 Pub: 0x0304A49211C0FE07C9F7C94695996F8826E09545375A3CF9677F2D780A3EB70DE3
Priv: 0x49DCCFD96DC5DF56487436F5A1B18C4F5D34F65DDB48CB5E5698AAAB6CAC52B3
[494.19 MKey/s][GPU 494.19 MKey/s][Count 2^41.41][Dead 1][01:56:08][1710.0MB]
Key# 3 Pub: 0x030B39E3F26AF294502A5BE708BB87AEDD9F895868011E60C1D2ABFCA202CD7A4D
Priv: 0x49DCCFD96DC5DF56487436F5A1B18C4F5D34F65DDB48CB5E59C839258C2AD7A0
[494.33 MKey/s][GPU 494.33 MKey/s][Count 2^42.67][Dead 9][04:29:05][4074.8MB]
Key# 4 Pub: 0x02837A31977A73A630C436E680915934A58B8C76EB9B57A42C3C717689BE8C0493
Priv: 0x49DCCFD96DC5DF56487436F5A1B18C4F5D34F65DDB48CB5E765FB411E63B92B9
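
A run with several solved keys can be reduced to one benchmark figure by averaging the final status line of each solve. The sketch below hard-codes the five status lines from this GTX 1080 Ti log. The regular expression is an assumption based only on the v1.2/v1.3 status format shown in this thread (later versions print `MK/s` and extra fields), and the helper names are illustrative.

```python
import re
import statistics

# Final status line printed before each solved key in the GTX 1080 Ti run.
STATUS_LINES = [
    "[459.82 MKey/s][GPU 459.82 MKey/s][Count 2^42.05][Dead 2][03:00:56][2649.2MB]",
    "[491.45 MKey/s][GPU 491.45 MKey/s][Count 2^41.13][Dead 1][01:36:21][1408.3MB]",
    "[464.88 MKey/s][GPU 464.88 MKey/s][Count 2^40.00][Dead 1][43:48][644.8MB]",
    "[494.19 MKey/s][GPU 494.19 MKey/s][Count 2^41.41][Dead 1][01:56:08][1710.0MB]",
    "[494.33 MKey/s][GPU 494.33 MKey/s][Count 2^42.67][Dead 9][04:29:05][4074.8MB]",
]

# Assumed layout: [speed][GPU speed][Count 2^x][Dead n][elapsed][memory]
STATUS_RE = re.compile(
    r"\[(?P<speed>[\d.]+) MKey/s\]\[GPU [\d.]+ MKey/s\]"
    r"\[Count 2\^(?P<count>[\d.]+)\]\[Dead \d+\]\[(?P<time>[\d:]+s?)\]"
)


def to_seconds(clock: str) -> int:
    """Convert '20s', 'MM:SS' or 'HH:MM:SS' to seconds."""
    total = 0
    for part in clock.rstrip("s").split(":"):
        total = total * 60 + int(part)
    return total


def parse_status(line: str):
    """Return (speed in MKey/s, log2 of count, elapsed seconds), or None if no match."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    return float(match["speed"]), float(match["count"]), to_seconds(match["time"])


runs = [r for r in map(parse_status, STATUS_LINES) if r is not None]
speeds = [r[0] for r in runs]
times = [r[2] for r in runs]

print(f"runs: {len(runs)}")
print(f"speed MKey/s: mean {statistics.mean(speeds):.2f}, stdev {statistics.stdev(speeds):.2f}")
print(f"solve time s: mean {statistics.mean(times):.0f}, min {min(times)}, max {max(times)}")
```

The speed is fairly stable from key to key while the solve time varies several-fold, so speed is the more repeatable number to compare across cards.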

MrFreeDragon avatar Apr 27 '20 16:04 MrFreeDragon

1300 MK/s at stock, without overclocking? Try a different -g parameter; it should go higher.

kpot87 avatar Apr 27 '20 19:04 kpot87

Has anybody tested the speed on an RTX 4000 or 6000?

kpot87 avatar Apr 29 '20 17:04 kpot87

Are those even accessible yet?!? I imagine they'll be pretty intense on this.

At any rate, with 1.4b, I've gone up to 160MK/s from 140-150 (this is using a 54x256 grid on a 1060 3gb), so the optimizations made in today's version are definitely noticeable! Still ain't found anything, but I expect that because I'm searching the higher-end bits of the puzzle.

Interesting side note: if I have my BetterDiscord client open on my 2nd monitor, speed falls off to 140 MK/s. Minimize it, and speed ramps back up. Facebook, LinkedIn, and other general tabs open in Chrome have little to no effect on the speed. I had no idea Discord chewed up that much GPU power!

djarumlights avatar Apr 30 '20 21:04 djarumlights

Here is small comparison table: https://bitcointalk.org/index.php?topic=5244940.msg54359894#msg54359894

MrFreeDragon avatar May 04 '20 13:05 MrFreeDragon

Zotac 1080 mini gets 400ish MK/s after it stabilizes. It gets up to 77+ degrees pretty quickly too. Downclocked by 125 and set PL to 80; now running 305 MK/s at 73 degrees.

Noticed something weird today.

So I decided it was time for dusting. Popped PC open, cleaned, then elected to throw said 1080 mini inside as well (moved my 1060 3gb down a slot).

Upon restart and firing kangaroo back up, I noticed that 1060's clock speed/temp ramped WAY up higher than when it was in there by itself. Sped up to about 190 MK/s too. I found it odd though that suddenly it was responding quite differently. Not sure if that's driver-based or something in Kangaroo that caused it to act differently.

djarumlights avatar May 05 '20 01:05 djarumlights

2x GeForce RTX 2080 SUPER (48x64 cores) (Cap 7.5) (7982.3 MB) -> 2x 993 MK/s => 1985 MK/s
2x GeForce GTX 1080 Ti (28x128 cores) (Cap 6.1) (11175.4 MB) -> 2x 533 MK/s => 1067 MK/s

PatatasFritas avatar May 28 '20 06:05 PatatasFritas

Why does my GTX 1060 6 GB only reach 7-8 MH/s? 2x GTX 1060 6 GB give 15 MH/s. How can I fix it?

Shadow145-cpu avatar Jun 11 '20 06:06 Shadow145-cpu

Hi, I don't know. Could you send the output of the program?

JeanLucPons avatar Jun 12 '20 08:06 JeanLucPons

@JeanLucPons I'm currently away from home. I think I know where the problem is: I compiled Kangaroo in debug mode. This evening I will check in release mode.

Problem solved, I compiled for release and it's OK

Shadow145-cpu avatar Jun 12 '20 09:06 Shadow145-cpu

Does anyone know what the speed of Nvidia Tesla K20x is?

Shadow145-cpu avatar Jun 12 '20 10:06 Shadow145-cpu

GPU: GPU #0 NVIDIA GeForce RTX 4060 Ti (34x0 cores) Grid(68x128) (92.0 MB used)
SolveKeyGPU Thread GPU#0: creating kangaroos...
SolveKeyGPU Thread GPU#0: 2^20.09 kangaroos [8.1s]
[625.56 MK/s][GPU 558.20 MK/s][Count 2^37.24][Dead 0][04:53 (Avg 1012.5y)][2.0/4.0MB]

GonzoTheDev avatar Jan 09 '24 09:01 GonzoTheDev

Benchmark running with 4 x NVIDIA RTX 3090 24GB

GPU: GPU #1 NVIDIA GeForce RTX 3090 (82x0 cores) Grid(164x128) (212.0 MB used)
SolveKeyGPU Thread GPU#1: creating kangaroos...
GPU: GPU #3 NVIDIA GeForce RTX 3090 (82x0 cores) Grid(164x128) (212.0 MB used)
SolveKeyGPU Thread GPU#3: creating kangaroos...
GPU: GPU #0 NVIDIA GeForce RTX 3090 (82x0 cores) Grid(164x128) (212.0 MB used)
SolveKeyGPU Thread GPU#0: creating kangaroos...
GPU: GPU #2 NVIDIA GeForce RTX 3090 (82x0 cores) Grid(164x128) (212.0 MB used)
SolveKeyGPU Thread GPU#2: creating kangaroos...
SolveKeyGPU Thread GPU#1: 2^21.36 kangaroos [15.2s]
SolveKeyGPU Thread GPU#3: 2^21.36 kangaroos [15.3s]
SolveKeyGPU Thread GPU#0: 2^21.36 kangaroos [15.3s]
SolveKeyGPU Thread GPU#2: 2^21.36 kangaroos [15.3s]
[8974.16 MK/s][GPU 8974.16 MK/s][Count 2^41.14][Dead 0][05:08 (Avg 198.385y)][2.0/4.0MB]

joelqc1 avatar Mar 13 '24 11:03 joelqc1