CudaMiner "now compiling for compute 1.0" does not give higher results on Fermi on Ubuntu 12.04

I have a GeForce GTX 560, Ubuntu 12.04 and CUDA 5.5. If I change the Makefile manually to use -arch=compute_20, the hashrate goes up from approx. 95 khash/s to about 115 khash/s. So maybe it is worth tuning this? Both versions run with -i 1 -H 1 -l F11x8 -C 1

Feb 11 '14 12:02 biozshock

Spot-on @biozshock! I compiled the latest commit with -arch=sm_21, which is equivalent to -arch=compute_21 -code=compute_21,sm_21, instead of the default -arch=compute_10, and my GTX 550 Ti is now displaying the best hashrates ever. Now it also autotunes well. I consider this a continuation of #84

Feb 13 '14 00:02 vxf

Maybe we can get the best of both worlds and compile for both architectures (compute_10 and sm_20/sm_21) in parallel, so legacy users can keep using this kernel.

Does choosing compute_21 over compute_20 make any difference in performance?
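
One way to build for both targets is a fat binary with repeated -gencode options. The following is only a sketch, not the project's actual build line: the GENCODE_FLAGS variable is made up for illustration, and it assumes a toolkit that still accepts the 1.x targets (the Makefile in this thread builds for compute 1.0 with CUDA 5.5). The command is printed rather than executed.

```shell
# Hypothetical variable: one -gencode entry per target.
# sm_10 machine code for legacy (compute 1.x) cards
GENCODE_FLAGS="-gencode arch=compute_10,code=sm_10"
# sm_20 and sm_21 machine code for Fermi, both built from compute_20
GENCODE_FLAGS="$GENCODE_FLAGS -gencode arch=compute_20,code=sm_20"
GENCODE_FLAGS="$GENCODE_FLAGS -gencode arch=compute_20,code=sm_21"

# Dry run: print the compile line; drop the echo to actually compile.
echo nvcc -O2 $GENCODE_FLAGS -c fermi_kernel.cu -o fermi_kernel.o
```

With such a binary a Fermi card would load the sm_2x code, so whether that beats the compute 1.0 build on a given setup would still need to be measured.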

Feb 13 '14 10:02 cbuchner1

Compiling now with -arch=sm_21, as there is no -arch=compute_21. As I understand it, nv_kernel.cu and nv_kernel2.cu are for Kepler and Titan, right?

Got an error: Too big maxrregcount value specified 64, will be ignored. As per the docs, there will be no max for these.

Will get back after it runs at least 30-60 minutes.

EDIT: It seems to do a bit better if I set -arch=sm_21 instead of -arch=compute_20. The hashrate barely went up, but it's much more stable. But that's probably because maxrregcount was ignored.
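
The message is consistent with the per-thread register limit on Fermi: devices of compute capability 2.x allow at most 63 registers per thread, so a request for 64 is over the limit. A sketch of a compile line with the cap lowered to 63 follows; the MAXREG variable is hypothetical, the other flags from the Makefile are left out for brevity, and whether 63 actually helps the hashrate is untested here.

```shell
# Hypothetical variable: highest register cap accepted for compute capability 2.x
MAXREG=63

# Dry run: print the compile line; drop the echo to actually compile.
echo nvcc -O2 -arch=sm_21 --maxrregcount=$MAXREG -c fermi_kernel.cu -o fermi_kernel.o
```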

Feb 13 '14 11:02 biozshock

I didn't pick -arch=sm_21 for any special reason; it just seemed the best fit for my card after checking the syntax in the CUDA docs. I may try other settings when I have the time, if that helps.

Feb 13 '14 19:02 vxf

Hm, -arch=compute_21 -code=compute_21,sm_21 gives an error here: nvcc fatal : Value 'compute_21' is not defined for option 'gpu-architecture'. What CUDA version do you use to get that?

Feb 13 '14 19:02 biozshock

@biozshock you are right, there is indeed no compute_21 defined. I guess -arch=sm_21 is shorthand for -arch=compute_20 -code=compute_20,sm_21 then. I never really went deep into CUDA programming anyway :P.

❯ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2013 NVIDIA Corporation
Built on Wed_Jul_17_18:36:13_PDT_2013
Cuda compilation tools, release 5.5, V5.5.0

Makefile:1019

# NOTE: now compiling for compute 1.0 again, as it's using less power and runs way faster on Linux
fermi_kernel.o: fermi_kernel.cu
    $(NVCC) -g -O2 -Xptxas "-abi=no -v" -arch=sm_21 --maxrregcount=64 $(JANSSON_INCLUDES) -o $@ -c $<
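
For reference, the nvcc documentation describes -arch=sm_XY as shorthand for -arch=compute_XY -code=sm_XY,compute_XY, and since there is no compute_21, sm_21 pairs with compute_20. As a sketch, the two invocations below should therefore produce the same result. The SHORT and LONG variable names are illustrative only, and the commands are printed rather than executed.

```shell
# Shorthand form, as used in the Makefile rule
SHORT="-arch=sm_21"
# Long form: compute_20 PTX plus sm_21 machine code
LONG="-arch=compute_20 -code=compute_20,sm_21"

# Dry run: print both compile lines; drop the echo to actually compile.
echo nvcc $SHORT -c fermi_kernel.cu -o fermi_kernel.o
echo nvcc $LONG -c fermi_kernel.cu -o fermi_kernel.o
```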

Feb 14 '14 13:02 vxf

I got a slight performance improvement with this on my GTX 570. It went from 240 KH/s to 244 KH/s. Maybe it makes a bigger difference with smaller launch configurations, but of course I will take the ~4 KH/s gain.

Thanks!

Feb 21 '14 06:02 ImmortalJ
