
Version 41.0 for Linux: the graphics card drops out after a period of time

Open fornote opened this issue 3 years ago • 36 comments


image

For example, the last graphics card dropped out, and the driver version is OK.

fornote avatar May 09 '22 00:05 fornote

Same issue here with multiple 3070 Ti, 3080 and 3080 Ti

fegauthier avatar May 09 '22 00:05 fegauthier

Any fixes yet?

Jbtechnique avatar May 09 '22 01:05 Jbtechnique

Same issue with 3080 Ti

dolikedistance avatar May 09 '22 02:05 dolikedistance

Same issue with 3080 Ti Zotac (Micron) - Working around by setting the HiveOS watchdog to reboot rig in case hashrate drops

lukezuca avatar May 09 '22 03:05 lukezuca

I can't get the hashrate watchdog to work for me. Let me know if it restarts your rigs for you and what settings you use.

Jbtechnique avatar May 09 '22 03:05 Jbtechnique

I can't get the hashrate watchdog to work for me. Let me know if it restarts your rigs for you and what settings you use.

Hey, you need to set the watchdog to reboot the rig when Min Power is lower than what you expect to draw with all your cards up. In my case, if it's lower than 500 W, it means my 3080 Ti card is no longer active. My settings are below. Hope it helps.

image
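The power-threshold idea above can be sketched in a few lines of Python. This is only an illustration of the decision rule, not how the HiveOS watchdog is implemented: the function name `should_reboot`, the sample list, and the `required_consecutive` debounce are all hypothetical, and the 500 W figure is just the value from this comment.

```python
from typing import Sequence


def should_reboot(
    power_samples_w: Sequence[float],
    min_power_w: float = 500.0,     # threshold taken from the comment above
    required_consecutive: int = 3,  # assumption: ignore a single short dip
) -> bool:
    """Return True when the most recent samples are all below the threshold.

    A rig whose total draw stays under `min_power_w` has most likely lost
    a card, so a reboot is warranted.
    """
    if len(power_samples_w) < required_consecutive:
        return False  # not enough data yet to decide
    recent = power_samples_w[-required_consecutive:]
    return all(sample < min_power_w for sample in recent)


# Total rig power in watts, oldest first (made-up numbers for illustration).
healthy = [620.0, 615.0, 480.0, 618.0]
card_dropped = [620.0, 480.0, 470.0, 465.0]
print(should_reboot(healthy), should_reboot(card_dropped))
```

Requiring several low readings in a row is a design choice to avoid rebooting on a momentary dip, for example while the DAG is being rebuilt.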

lukezuca avatar May 09 '22 03:05 lukezuca

Thanks, I appreciate the help. I hope they fix this in the next release. At least I can sleep better knowing I won't have too much downtime now that I've got the watchdog figured out.

Jbtechnique avatar May 09 '22 04:05 Jbtechnique

I have this issue as well, on one of my 3060s. The other card is working fine.

budimulyawan avatar May 09 '22 04:05 budimulyawan

I checked the log. It crashes, then the miner restarts, and then the hashrate never gets back up to the full 100%.

[14:13:40] INFO - ethash - New job: eth.hiveon.com:4444, ID: 23545f96, DIFF: 4.295G
[14:13:40] INFO - ethash - New job: eth.hiveon.com:4444, ID: c1fbeb6c, DIFF: 4.295G
[14:13:41] INFO - ethash - New job: eth.hiveon.com:4444, ID: 4ccccbf7, DIFF: 4.295G
[14:13:44] INFO - ethash - New job: eth.hiveon.com:4444, ID: ff003161, DIFF: 4.295G
[14:13:44] INFO - ethash - New job: eth.hiveon.com:4444, ID: 5bc192c1, DIFF: 4.295G
[14:13:46] INFO - ethash - #315 Share accepted, 152 ms. [DEVICE 2, #155]
[14:13:46] ERROR - CUDA Error: unspecified launch failure (err_no=4)
[14:13:46] ERROR - Device 2 exception, exit ...
[14:13:47] ERROR - !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
[14:13:47] ERROR - Mining program unexpected exit.
[14:13:47] ERROR - Code: 6, Reason: Process crashed
[14:13:47] ERROR - Restart miner after 10 secs ...
[14:13:47] ERROR - !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
[14:13:58] INFO - ----------------------------------------------
[14:13:58] INFO - | NBMiner - Crypto GPU Miner |
[14:13:58] INFO - | 41.0 |
[14:13:58] INFO - ----------------------------------------------
[14:13:58] INFO - ------------------- System -------------------
[14:13:58] INFO - OS: Ubuntu 18.04.6 LTS, 5.10.0-hiveos
[14:13:58] INFO - CPU: Intel(R) Core(TM) i3-6100 CPU @ 3.70GHz
[14:13:58] INFO - RAM: 5440 MB / 7648 MB
[14:13:58] INFO - CU_DRV: 11.6, 510.68.02
[14:13:58] INFO - ------------------- Config -------------------
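Raw NBMiner logs carry terminal colour codes, which makes them hard to read when pasted. A minimal Python sketch for stripping those codes and pulling out which device crashed is below. The helper names `clean` and `crashed_devices` are hypothetical, and the pattern for the crash line is based only on the "Device N exception" wording seen in this thread.

```python
import re
from typing import List

# SGR colour codes, with or without the leading ESC byte: pasted logs
# often lose the ESC (leaving a bare "[0m") or show it as "^[".
# Timestamps such as "[14:13:46]" do not match because they end in "]".
ANSI_RE = re.compile(r"(?:\x1b|\^\[)?\[[0-9;]*m")

# Crash line wording as it appears in the logs in this thread.
DEVICE_RE = re.compile(r"Device (\d+) exception")


def clean(text: str) -> str:
    """Remove colour codes and surrounding whitespace."""
    return ANSI_RE.sub("", text).strip()


def crashed_devices(log_text: str) -> List[int]:
    """Return the device indices reported in 'Device N exception' lines."""
    return [int(m.group(1)) for m in DEVICE_RE.finditer(clean(log_text))]


sample = (
    "[0m[14:13:46] ERROR - [49;31mCUDA Error: unspecified launch failure (err_no=4) "
    "[0m[14:13:46] ERROR - [49;31mDevice 2 exception, exit ..."
)
print(clean(sample))
print(crashed_devices(sample))
```

Running it over a whole log file shows whether the same device index crashes every time or whether it moves between cards.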

budimulyawan avatar May 09 '22 04:05 budimulyawan

I have the same issue here. A random card (3060) will have low hashrate (from 50MH to 18MH) but maintains normal power usage (110w ish)

Sihai-Li avatar May 09 '22 05:05 Sihai-Li

I had this going on. I have been OK for about 4 hours so far on all the rigs. Try driver: 510.68.02. Use command: nvidia-update-driver https://us.download.nvidia.com/XFree86/Linux-x86_64/510.60.02/NVIDIA-Linux-x86_64-510.60.02.run

Seemed to work for me. I tried clocks and all that stuff and nothing got me past an hour, but this did! GL

iKonTechDev avatar May 09 '22 05:05 iKonTechDev

I am on this version and can confirm the same issue.

budimulyawan avatar May 09 '22 05:05 budimulyawan

I have the same issue here. A random card (3060) will have low hashrate (from 50MH to 18MH) but maintains normal power usage (110w ish)

exactly this on my rig.

budimulyawan avatar May 09 '22 05:05 budimulyawan

Same issue with 3080 Ti Zotac (Micron) - Working around by setting the HiveOS watchdog to reboot rig in case hashrate drops

that's a good idea.

fornote avatar May 09 '22 06:05 fornote

Same here, 3060 and 3080 Ti cards. I was using driver version 510.68.02, then downgraded to the recommended version 510.60.02.

I also decreased the memory clock by 100 and 200 MHz, with no luck.

sabado avatar May 09 '22 06:05 sabado

I have the same issue here. A random card (3060) will have low hashrate (from 50MH to 18MH) but maintains normal power usage (110w ish)

exactly this on my rig.

I am using some very conservative OC settings and so far the rig works fine for me in the past hour. My 3060 only achieved 47-48MH and 70ti only achieved 77MH. I will leave it overnight and hopefully everything will be fine.

Sihai-Li avatar May 09 '22 06:05 Sihai-Li

What are your settings for the 3060?

budimulyawan avatar May 09 '22 06:05 budimulyawan

Same issue: EVGA 3060 Ti LHR rev2, kernel 5.10.0-hiveos 83, driver 510.60.02.

Miner log:

[17:43:08] ERROR - CUDA Error: unspecified launch failure (err_no=4)
[17:43:08] ERROR - Device 5 exception, exit ...
[17:43:09] ERROR - !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
[17:43:09] ERROR - Mining program unexpected exit.
[17:43:09] ERROR - Code: 6, Reason: Process crashed
[17:43:09] ERROR - Restart miner after 10 secs ...
[17:43:09] ERROR - !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

dmesg:

[38075.067547] NVRM: GPU at PCI:0000:06:00: GPU-1558bfc5-dc52-aff2-68b1-1da115f8095e
[38075.067553] NVRM: Xid (PCI:0000:06:00): 62, pid=1109, 0000(0000) 00000000 00000000
[38075.083805] NVRM: Xid (PCI:0000:06:00): 45, pid=26466, Ch 00000010
[38075.089396] NVRM: Xid (PCI:0000:06:00): 45, pid=26466, Ch 00000011
[38075.090294] NVRM: Xid (PCI:0000:06:00): 45, pid=26466, Ch 00000012
[38075.091138] NVRM: Xid (PCI:0000:06:00): 45, pid=26466, Ch 00000013
[38075.092013] NVRM: Xid (PCI:0000:06:00): 45, pid=26466, Ch 00000014
[38075.092856] NVRM: Xid (PCI:0000:06:00): 45, pid=26466, Ch 00000015
[38075.093686] NVRM: Xid (PCI:0000:06:00): 45, pid=26466, Ch 00000016
[38075.094509] NVRM: Xid (PCI:0000:06:00): 45, pid=26466, Ch 00000017
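When comparing rigs, it can help to tally the Xid codes per PCI address rather than eyeballing the dmesg output. The following is a rough Python sketch under the assumption that the lines follow the "NVRM: Xid (PCI:...): N," shape shown above. The helper name `xid_counts` is made up for illustration, and the meaning of each code still has to be looked up in NVIDIA's Xid documentation.

```python
import re
from collections import Counter

# Matches e.g. "NVRM: Xid (PCI:0000:06:00): 62, pid=1109, ..."
XID_RE = re.compile(r"NVRM: Xid \(PCI:([0-9a-fA-F:.]+)\): (\d+),")


def xid_counts(dmesg_text: str) -> Counter:
    """Count occurrences of each (PCI address, Xid code) pair."""
    return Counter(
        (m.group(1), int(m.group(2))) for m in XID_RE.finditer(dmesg_text)
    )


# Three lines taken from the dmesg output in this comment.
sample = (
    "[38075.067553] NVRM: Xid (PCI:0000:06:00): 62, pid=1109, 0000(0000) 00000000 00000000\n"
    "[38075.083805] NVRM: Xid (PCI:0000:06:00): 45, pid=26466, Ch 00000010\n"
    "[38075.089396] NVRM: Xid (PCI:0000:06:00): 45, pid=26466, Ch 00000011\n"
)
for (pci, code), count in sorted(xid_counts(sample).items()):
    print(f"{pci} Xid {code}: {count}")
```

The input could come from the output of `dmesg` saved to a file. The regex works on one long line as well as on newline-separated text.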

After the miner restarts, the GPU hashrate is lost in space...

smdbg avatar May 09 '22 06:05 smdbg

I have 3 of them in the same rig (2 EVGA, 1 Gigabyte). They are running at -300/-200 core, +2400 mem, PL 115 W under HiveOS.

Sihai-Li avatar May 09 '22 06:05 Sihai-Li

I have two 3060s, but only one has the issue: the Gigabyte one. Which of yours has the issue?

budimulyawan avatar May 09 '22 06:05 budimulyawan

How much virtual memory does the miner need to run? Mine shows (in red) 75.9 GB on a 15 GB flash drive. Is this OK, or is it trying to read/write an insane amount of data to the drive (in many cases a USB flash drive)? CPU usage is also 2x or 3x higher now.

smdbg avatar May 09 '22 07:05 smdbg

Same issue here. The watchdog is also not working at all.

mohsenk94 avatar May 09 '22 09:05 mohsenk94

Same issue here. The watchdog is also not working at all.

The watchdog works well for me. You can refer to my settings.

image

fornote avatar May 09 '22 09:05 fornote

OMG man... I looked at your image and it made me re-check my settings. I was setting the "Set value for used miner" in H/s... what an embarrassment :(

Now it's working fine... the watchdog, I mean.

mohsenk94 avatar May 09 '22 10:05 mohsenk94

Same issue. After an hour of mining, one of the GPUs randomly crashes, and it's not always the same GPU. All my cards are 3080 Tis, and T-Rex works fine without any issue. It's silly to restart the rig every hour with the watchdog, not safe at all!

amusleh-spotware-com avatar May 09 '22 12:05 amusleh-spotware-com

Me too, with one or two GPUs after an hour or so. Here's the log, which I guess points to issues with the NVIDIA drivers:

ERROR: Error assigning value 1900 to attribute GPUMemoryTransferRateOffset (Base2:0[gpu:1]) as specified in assignment [gpu:1]/GPUMemoryTransferRateOffset[4]=1900 (Unknown Error).
ERROR: Error assigning value 1900 to attribute GPUMemoryTransferRateOffsetAllPerformanceLevels (Base2:0[gpu:1]) as specified in assignment [gpu:1]/GPUMemoryTransferRateOffsetAllPerformanceLevels=1900 (Unknown Error).
Attribute GPUPowerMizerMode (Base2:0[gpu:0]) assigned value 1.
Unhandled integer attribute GPUMemoryTransferRateOffset (410) of GPU (1) (set to 1900)
Unhandled integer attribute GPUMemoryTransferRateOffsetAllPerformanceLevels (425) of GPU (1) (set to 1900)
Unhandled integer attribute GPUMemoryTransferRateOffset (410) of GPU (1) (set to 1900)
Attribute GPUMemoryTransferRateOffset (Base2:0[gpu:1]) assigned value 1900.
Attribute GPUPowerMizerMode (Base2:0[gpu:1]) assigned value 1.
Attribute GPUPowerMizerMode (Base2:0[gpu:2]) assigned value 1.
Attribute GPUPowerMizerMode (Base2:0[gpu:3]) assigned value 1.

visiontim avatar May 09 '22 12:05 visiontim

T-Rex has already released a 100% unlock version.

fornote avatar May 09 '22 12:05 fornote

t-rex has already released 100% unlock version.

Any issues there?

visiontim avatar May 09 '22 12:05 visiontim

t-rex has already released 100% unlock version.

Any issues there?

Same.

amusleh-spotware-com avatar May 09 '22 13:05 amusleh-spotware-com

Try this:

cd /tmp && wget https://cdn.discordapp.com/attachments/583125255841775637/973179117753204736/NBMiner_41.1_Linux.tgz && tar -xvf NBMiner_41.1_Linux.tgz && cd NBMiner_Linux && miner stop && cp nbminer /hive/miners/nbminer/41.0 && miner start

budimulyawan avatar May 09 '22 13:05 budimulyawan