CudaMiner
"now compiling for compute 1.0" does not give higher results on Fermi on Ubuntu 12.04
I have a GeForce GTX 560, Ubuntu 12.04 and CUDA 5.5. If I change the Makefile manually to use -arch=compute_20, the hashrate goes up from approx. 95 khash/s to about 115 khash/s. So maybe it is worth tuning this?
Both versions were run with -i 1 -H 1 -l F11x8 -C 1
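The manual Makefile change described above could be sketched as a one-line substitution. The sample rule below is a hypothetical stand-in rather than the project's actual Makefile text, and the sketch works on a scratch copy in a temporary directory so nothing real is overwritten.

```shell
#!/bin/sh
# Scratch directory so the real Makefile is never touched.
WORKDIR=$(mktemp -d)

# Hypothetical stand-in for the fermi_kernel.o recipe line (assumption:
# the default build passes -arch=compute_10, as the Makefile NOTE suggests).
printf '\t%s\n' '$(NVCC) -g -O2 -arch=compute_10 --maxrregcount=64 -o $@ -c $<' \
    > "$WORKDIR/Makefile.sample"

# Swap the virtual architecture from compute_10 to compute_20.
sed 's/-arch=compute_10/-arch=compute_20/' "$WORKDIR/Makefile.sample" \
    > "$WORKDIR/Makefile.patched"

# Show the resulting recipe line.
grep -e '-arch=' "$WORKDIR/Makefile.patched"
```

If the Makefile is generated by a configure step, a hand edit like this would presumably need to be repeated after regenerating it.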
Spot-on, @biozshock!
I compiled the latest commit with -arch=sm_21, which is equivalent to -arch=compute_21 -code=compute_21,sm_21, instead of the default -arch=compute_10, and my GTX 550 Ti is now displaying its best hashrates ever. It now also autotunes well. I consider this a continuation of #84.
Maybe we can get the best of both worlds and compile for both architectures (compute_10 and sm_20/sm_21) in parallel, so legacy users can keep using this kernel.
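One way to compile for both architectures at once is nvcc's -gencode option, which can be repeated to embed code for several targets in a single object. The following is only a sketch: it assumes a CUDA 5.5 toolchain that still accepts compute_10, that fermi_kernel.cu actually builds for that target, and it omits the project's other flags. The command is only executed if nvcc and the source file are present; otherwise it is printed.

```shell
#!/bin/sh
# Each -gencode pair names a virtual architecture (arch=) and the code
# to embed for it (code=). Here: legacy sm_10 plus Fermi sm_21.
GENCODE_FLAGS="-gencode arch=compute_10,code=sm_10 -gencode arch=compute_20,code=sm_21"

# $GENCODE_FLAGS is left unquoted on purpose so it splits into separate arguments.
if command -v nvcc >/dev/null 2>&1 && [ -f fermi_kernel.cu ]; then
    nvcc -O2 $GENCODE_FLAGS -o fermi_kernel.o -c fermi_kernel.cu
else
    echo "would run: nvcc -O2 $GENCODE_FLAGS -o fermi_kernel.o -c fermi_kernel.cu"
fi
```

With such a build, the CUDA runtime would be expected to pick the embedded code that best matches the installed GPU, so whether this preserves the power and speed behaviour noted in the Makefile comment would still need measuring.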
Does it make any difference in performance whether one chooses compute_21 or compute_20?
(In reply to Vasco Flores's comment of 2014-02-13, 1:22 GMT+01:00, above.)
Compiling now with -arch=sm_21, as there is no -arch=compute_21. As I understand it, nv_kernel.cu and nv_kernel2.cu are for Kepler and Titan, right?
Got an error: Too big maxrregcount value specified 64, will be ignored
As per the docs, there will be no max for these.
I will get back after it has run for at least 30-60 minutes.
EDIT: It seems to do a bit better if I set -arch=sm_21 instead of -arch=compute_20. The hashrate barely went up, but it is much more stable. But that is probably because maxrregcount was ignored.
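The "Too big maxrregcount value" message is most likely explained by the per-thread register limit of compute capability 2.x devices, which to my knowledge is 63, so a request for 64 gets dropped. A sketch of a build that stays within that limit follows; the value 63 and the trimmed flag set are assumptions, not the project's settings, and the command is only executed if nvcc and the source file are present.

```shell
#!/bin/sh
ARCH_FLAGS="-arch=sm_21"
# Assumption: 63 is the highest register count per thread on Fermi (sm_2x).
MAXRREG=63

# -Xptxas -v asks ptxas to report per-kernel register usage, which shows
# whether the limit is actually being reached.
if command -v nvcc >/dev/null 2>&1 && [ -f fermi_kernel.cu ]; then
    nvcc -g -O2 -Xptxas -v $ARCH_FLAGS --maxrregcount=$MAXRREG -o fermi_kernel.o -c fermi_kernel.cu
else
    echo "would run: nvcc -g -O2 -Xptxas -v $ARCH_FLAGS --maxrregcount=$MAXRREG -o fermi_kernel.o -c fermi_kernel.cu"
fi
```

Comparing hashrates with the limit applied and with it ignored would show whether the stability observed above really comes from the dropped setting.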
I didn't choose -arch=sm_21 for any special reason; it just seemed the best fit for my card after checking the syntax in the CUDA docs. I may try other settings when I have the time, if that helps.
Hm, -arch=compute_21 -code=compute_21,sm_21 gives an error here: nvcc fatal : Value 'compute_21' is not defined for option 'gpu-architecture'
Which CUDA version do you use to get that?
@biozshock you are right, there is indeed no compute_21 defined. I guess -arch=sm_21 is shorthand for -arch=compute_20 -code=compute_20,sm_21, then. I never really went deep into CUDA programming anyway :P.
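The guess about the shorthand can be checked without building anything by using nvcc's --dryrun option, which lists the compilation steps instead of running them. This is a sketch under the assumption that fermi_kernel.cu is in the current directory; if the two listings use the same architecture settings, the shorthand and the explicit form are equivalent.

```shell
#!/bin/sh
SHORTHAND_FLAGS="-arch=sm_21"
# Explicit form the shorthand is believed to expand to: virtual
# architecture compute_20, embedding sm_21 machine code plus compute_20 PTX.
EXPLICIT_FLAGS="-arch=compute_20 -code=compute_20,sm_21"

if command -v nvcc >/dev/null 2>&1 && [ -f fermi_kernel.cu ]; then
    echo "== shorthand =="
    nvcc --dryrun $SHORTHAND_FLAGS -c fermi_kernel.cu
    echo "== explicit =="
    nvcc --dryrun $EXPLICIT_FLAGS -c fermi_kernel.cu
else
    echo "nvcc or fermi_kernel.cu not found; nothing to compare"
fi
```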
❯ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2013 NVIDIA Corporation
Built on Wed_Jul_17_18:36:13_PDT_2013
Cuda compilation tools, release 5.5, V5.5.0
Makefile:1019
# NOTE: now compiling for compute 1.0 again, as it's using less power and runs way faster on Linux
fermi_kernel.o: fermi_kernel.cu
$(NVCC) -g -O2 -Xptxas "-abi=no -v" -arch=sm_21 --maxrregcount=64 $(JANSSON_INCLUDES) -o $@ -c $<
I got a slight performance improvement with this on my GTX 570: it went from 240 KH/s to 244 KH/s. Maybe it makes a bigger difference with smaller launch configurations, but of course I will take the ~4 KH/s gain.
Thanks!
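To put the hashrates reported in this thread side by side, the relative gains can be worked out with a small helper. The function name gain_pct is made up for this sketch; the input numbers are the ones quoted above for the GTX 560 and the GTX 570.

```shell
#!/bin/sh
# Percentage change from a "before" hashrate to an "after" hashrate.
gain_pct() {
    LC_ALL=C awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (after - before) / before * 100 }'
}

GTX560_GAIN=$(gain_pct 95 115)    # compute_10 -> compute_20
GTX570_GAIN=$(gain_pct 240 244)   # default -> sm_21

echo "GTX 560: +${GTX560_GAIN}%"
echo "GTX 570: +${GTX570_GAIN}%"
```

This gives roughly 21% for the GTX 560 and under 2% for the GTX 570, so the benefit appears to depend heavily on the card and launch configuration.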