CudaMiner
Floating point exception
Updated cudaminer with latest commit this morning, and it broke:
[2014-01-12 10:32:07] 1 miner threads started, using 'scrypt' algorithm.
[2014-01-12 10:32:07] Starting Stratum on stratum+tcp://eac.us1.hackshard.com:3333
[2014-01-12 10:32:08] Stratum detected new block
[2014-01-12 10:32:08] GPU #0: GeForce GTX 675M with compute capability 2.1
[2014-01-12 10:32:08] GPU #0: interactive: 1, tex-cache: 1D, single-alloc: 1
[2014-01-12 10:32:08] GPU #0: Performing auto-tuning (Patience...)
[2014-01-12 10:32:08] GPU #0: maximum warps: 502
[2014-01-12 10:32:08] GPU #0: 0.00 khash/s with configuration F0x0
[2014-01-12 10:32:08] GPU #0: using launch configuration F0x0
[1] 29067 floating point exception (core dumped) cudaminer -H 2 -d 0 -i 1,0,0 -l auto,K4x16 -C 1 -o -O
I too have been getting this error since I updated. I've had to experiment with launch configurations manually to get it working.
Got a backtrace. WARPS_PER_BLOCK is 0 for some reason. I've been having this problem sporadically, too. Git commit e0c7371a1efeb1c2164eddef2430e07cbd3eeae8
$ gdb cudaminer
GNU gdb (GDB) 7.6.1 (Debian 7.6.1-1)
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/...
Reading symbols from /home/dan/projects/CudaMiner/cudaminer...done.
(gdb) r -c ~/.config/cudaminer
Starting program: /home/dan/projects/CudaMiner/cudaminer -c ~/.config/cudaminer
warning: Could not load shared library symbols for linux-vdso.so.1.
Do you need "set solib-search-path" or "set sysroot"?
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
*** CudaMiner for nVidia GPUs by Christian Buchner ***
This is version 2013-12-18 (beta)
based on pooler-cpuminer 2.3.2 (c) 2010 Jeff Garzik, 2012 pooler
Cuda additions Copyright 2013 Christian Buchner
My donation address: LKS1WDKGED647msBQfLBHV3Ls8sveGncnm
[New Thread 0x7ffff2651700 (LWP 4204)]
[New Thread 0x7ffff1e50700 (LWP 4205)]
[2014-01-12 00:53:11] Starting Stratum on stratum+tcp://doge.netcodepool.org:4095
[New Thread 0x7ffff164f700 (LWP 4206)]
[2014-01-12 00:53:11] 1 miner threads started, using 'scrypt' algorithm.
[New Thread 0x7ffff0e4e700 (LWP 4207)]
[Thread 0x7ffff0e4e700 (LWP 4207) exited]
[2014-01-12 00:53:11] Stratum detected new block
[New Thread 0x7ffff0e4e700 (LWP 4208)]
[2014-01-12 00:53:12] GPU #0: GeForce GTX 570 with compute capability 2.0
[2014-01-12 00:53:12] GPU #0: interactive: 1, tex-cache: 0 , single-alloc: 0
[2014-01-12 00:53:12] GPU #0: Performing auto-tuning (Patience...)
[2014-01-12 00:53:12] GPU #0: maximum warps: 267
[2014-01-12 00:53:12] GPU #0: 0.00 khash/s with configuration F0x0
[2014-01-12 00:53:12] GPU #0: using launch configuration F0x0

Program received signal SIGFPE, Arithmetic exception.
[Switching to Thread 0x7ffff164f700 (LWP 4206)]
0x000000000041fa68 in cuda_scrypt_core (thr_id=0, stream=0, N=1024)
    at salsa_kernel.cu:790
790         dim3 grid(WU_PER_LAUNCH/WU_PER_BLOCK, 1, 1);
(gdb) p WARPS_PER_BLOCK
$1 = 0
(gdb) p context_wpb[thr_id]
Could not find operator[].
(gdb) thread apply all bt

Thread 6 (Thread 0x7ffff0e4e700 (LWP 4208)):
#0  0x00007ffff6a1095d in poll () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007ffff2dc7e93 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so
#2  0x00007ffff283ea95 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so
#3  0x00007ffff2dc9d29 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so
#4  0x00007ffff7958e0e in start_thread (arg=0x7ffff0e4e700)
    at pthread_create.c:311
#5  0x00007ffff6a1c0fd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
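
Thread 4 (Thread 0x7ffff164f700 (LWP 4206)):
#0  0x000000000041fa68 in cuda_scrypt_core (thr_id=0, stream=0, N=1024)
    at salsa_kernel.cu:790
#1  0x00000000004163e2 in scanhash_scrypt (thr_id=thr_id@entry=0,
    pdata=pdata@entry=0x7ffff164eca0, ptarget=ptarget@entry=0x7ffff164ed20,
    max_nonce=max_nonce@entry=4095,
    hashes_done=hashes_done@entry=0x7ffff164ebe8) at scrypt.cpp:759
#2  0x0000000000407439 in miner_thread (userdata=<optimized out>)
    at cpu-miner.c:820
#3  0x00007ffff7958e0e in start_thread (arg=0x7ffff164f700)
    at pthread_create.c:311
#4  0x00007ffff6a1c0fd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 3 (Thread 0x7ffff1e50700 (LWP 4205)):
#0  0x00007ffff6a14f53 in select () at ../sysdeps/unix/syscall-template.S:81
#1  0x0000000000408818 in socket_full (sock=<optimized out>,
    timeout=<optimized out>) at util.c:631
#2  0x00000000004090b3 in stratum_socket_full (sctx=sctx@entry=0x798460,
    timeout=timeout@entry=120) at util.c:638
#3  0x0000000000407793 in stratum_thread (userdata=<optimized out>)
    at cpu-miner.c:1051
#4  0x00007ffff7958e0e in start_thread (arg=0x7ffff1e50700)
    at pthread_create.c:311
#5  0x00007ffff6a1c0fd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 2 (Thread 0x7ffff2651700 (LWP 4204)):
#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x000000000040b130 in tq_pop (tq=0x7d2c90, abstime=abstime@entry=0x0)
    at util.c:1295
#2  0x0000000000407d4b in workio_thread (userdata=0x7d2c38) at cpu-miner.c:570
#3  0x00007ffff7958e0e in start_thread (arg=0x7ffff2651700)
    at pthread_create.c:311
#4  0x00007ffff6a1c0fd in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 1 (Thread 0x7ffff7fba7c0 (LWP 4200)):
#0  0x00007ffff7959ff8 in pthread_join (threadid=140737251706624,
    thread_return=thread_return@entry=0x0) at pthread_join.c:92
#1  0x0000000000403bd1 in main (argc=<optimized out>, argv=<optimized out>)
    at cpu-miner.c:1638
(gdb) c
Continuing.
[Thread 0x7ffff0e4e700 (LWP 4208) exited]
[Thread 0x7ffff164f700 (LWP 4206) exited]
[Thread 0x7ffff1e50700 (LWP 4205) exited]
[Thread 0x7ffff2651700 (LWP 4204) exited]

Program terminated with signal SIGFPE, Arithmetic exception.
The program no longer exists.
(gdb) q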
I get that too, occasionally. The error state reported by cudaGetLastError() is probably not cleared before entering the autotune algorithm, which makes it terminate early.
Any idea how to fix this yet?
Is the problem gone now? I have made some code changes after running into this problem myself...
I pulled this morning and the problem is gone for me now. Thanks!
Still happening for me, although only on the Kepler GPU.
Edit: no, somehow it's working now.
Just cloned the latest commit and it is not working.
Still getting the floating point exceptions on 142261bb89144364873ab4c70772a96d16647966
Update: fixed for me with a more recent checkin.