My NVIDIA GPU is not recognized - deviceCount is 0

Open bjenkinsgit opened this issue 5 years ago • 8 comments

A month or so ago, the Autolykos miner compiled and ran. Now it no longer works because the .cu code does not recognize any installed NVIDIA GPU. But when I build and install all of the CUDA examples, those run just fine. I am running Ubuntu 18.04 with a GeForce TITAN X. For example, the utility "deviceQuery" returns the following:

./deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GTX TITAN X"
  CUDA Driver Version / Runtime Version  10.1 / 10.1
  CUDA Capability Major/Minor version number:  5.2
  Total amount of global memory:  12210 MBytes (12802785280 bytes)
  (24) Multiprocessors, (128) CUDA Cores/MP:  3072 CUDA Cores
  GPU Max Clock rate:  1076 MHz (1.08 GHz)
  Memory Clock rate:  3505 Mhz
  Memory Bus Width:  384-bit
  L2 Cache Size:  3145728 bytes
  Maximum Texture Dimension Size (x,y,z)  1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:  65536 bytes
  Total amount of shared memory per block:  49152 bytes
  Total number of registers available per block:  65536
  Warp size:  32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:  1024
  Max dimension size of a thread block (x,y,z):  (1024, 1024, 64)
  Max dimension size of a grid size (x,y,z):  (2147483647, 65535, 65535)
  Maximum memory pitch:  2147483647 bytes
  Texture alignment:  512 bytes
  Concurrent copy and kernel execution:  Yes with 2 copy engine(s)
  Run time limit on kernels:  Yes
  Integrated GPU sharing Host Memory:  No
  Support host page-locked memory mapping:  Yes
  Alignment requirement for Surfaces:  Yes
  Device has ECC support:  Disabled
  Device supports Unified Addressing (UVA):  Yes
  Device supports Compute Preemption:  No
  Supports Cooperative Kernel Launch:  No
  Supports MultiDevice Co-op Kernel Launch:  No
  Device PCI Domain ID / Bus ID / location ID:  0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.1, CUDA Runtime Version = 10.1, NumDevs = 1
Result = PASS

bjenkinsgit avatar Oct 03 '19 17:10 bjenkinsgit

Hey, what error exactly does the miner give you? Are you using the latest version?

rsmmnt avatar Oct 09 '19 04:10 rsmmnt

Also, do you have the CUDA_VISIBLE_DEVICES environment variable set?
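A minimal sketch of how that variable could be checked before starting the miner, assuming the usual CUDA behaviour: when the variable is unset every GPU is visible, and when it is set to an empty string no GPU is visible, which would produce a device count of 0. The function name is hypothetical.

```python
import os


def describe_cuda_visible_devices(environ=None):
    """Summarize what CUDA_VISIBLE_DEVICES implies for device enumeration."""
    env = os.environ if environ is None else environ
    value = env.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        # Variable not set: CUDA enumerates every installed GPU.
        return "unset: all GPUs are visible to CUDA"
    if value.strip() == "":
        # Set but empty: CUDA applications see no devices at all.
        return "set but empty: no GPUs are visible to CUDA"
    # Otherwise only the listed indices or UUIDs are exposed.
    return f"set to {value!r}: only the listed devices are visible"


if __name__ == "__main__":
    print(describe_cuda_visible_devices())
```

Running it in the same shell that launches the miner shows the value the miner process would inherit.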

rsmmnt avatar Oct 09 '19 04:10 rsmmnt

OK, I’m a bit puzzled. It is now working. The miner binary now recognizes that I have 1 GPU and starts running. I have rebuilt the CUDA toolkit and rebooted a few times since then. When I last left it, the miner was not running; a few days later, it seems to be working again. I thought it might be related to the config.json key for the mnemonic phrase: some parts of the documentation call it “mnemonic”, while other docs call it “seed”. But either entry allows the miner to run. (Which is the correct key, “mnemonic” or “seed”?)

Now, from the miner, after a minute or so of processing, I am getting a lot of “ERROR: 500, REASON: INTERNAL ERROR” errors and the console for my ergo client (ergo-3.10.jar) is throwing lots of errors in the form of:

WARN [ctor.default-dispatcher-xxx] o.e.local.ErgoMiner - Failed to produce candidate block with a java error of: java.lang.Error: Trying to generate proof for empty transaction sequence….

Any ideas on what is happening now?

Thanks

Bart

bjenkinsgit avatar Oct 09 '19 15:10 bjenkinsgit

I do not. The only CUDA related env variables I have set must be ones from making and installing the cuda libs and examples, specifically:

CUDA_BIN=/usr/local/cuda-10.1/bin
CUDA_HOME=/usr/local/cuda-10.1
CUDA_NSIGHT=/usr/local/cuda-10.1/NsightCompute-2019.1

But, the miner binary is recognizing my GPU now. I’m getting errors from mining, but that is a different problem...

bjenkinsgit avatar Oct 09 '19 15:10 bjenkinsgit

I forgot to add that, in the miner error output, the DETAIL info on the 500 error says “requirement failed: Incorrect points”

bjenkinsgit avatar Oct 09 '19 15:10 bjenkinsgit

I’ll bet what happened is this:

I play games on this linux box and I might have left a game up and running,

…OR...

When exiting the video game I was playing, it did not release the GPU for some reason

I’ll try to reproduce this by running a game, leaving it running and then trying to start the miner…I’ll report back if that is the problem.

Is there some CUDA command (for the .cu file) to check for a GPU being “in-use” rather than just saying NO DEVICES FOUND? That would be a more meaningful error, no?

Thanks
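On the question of a more meaningful error: `cudaGetDeviceCount` returns a status code, and `cudaGetErrorString` turns that code into text, so a caller can report why no device was found rather than only that the count is 0. Below is a minimal sketch in Python via ctypes, assuming the CUDA runtime shared library (`libcudart`) is on the loader path. The helper names are hypothetical and this is not the miner's own code. Whether a GPU held by another process would show up as an error here is not established in this thread.

```python
import ctypes
import ctypes.util


def load_cudart():
    """Try to load the CUDA runtime library; return None if it is not found."""
    name = ctypes.util.find_library("cudart") or "libcudart.so"
    try:
        return ctypes.CDLL(name)
    except OSError:
        return None


def query_device_count(cudart):
    """Return (count, message), using the runtime's own status text on failure."""
    count = ctypes.c_int(0)
    # cudaGetDeviceCount(int*) returns a cudaError_t status code (0 = success).
    status = cudart.cudaGetDeviceCount(ctypes.pointer(count))
    if status == 0:
        return count.value, "ok"
    # cudaGetErrorString(cudaError_t) returns a const char* description.
    cudart.cudaGetErrorString.restype = ctypes.c_char_p
    text = cudart.cudaGetErrorString(status)
    return 0, f"CUDA error {status}: {text.decode()}"


if __name__ == "__main__":
    lib = load_cudart()
    if lib is None:
        print("CUDA runtime library not found")
    else:
        print(query_device_count(lib))
```

Reporting the status text alongside the count separates "no device present" from problems such as a driver and runtime mismatch, which a bare count of 0 hides.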

bjenkinsgit avatar Oct 09 '19 15:10 bjenkinsgit

Another update:

In the miner’s ‘config.json’ file, I changed the “keepPrehash” value from “true” to “false” and the miner seems to be stable now and no more errors in the ergo client console window.

So, although I technically have a GPU that has 12 Gbytes of memory, and therefore SHOULD be able to set “keepPrehash” to “true”, I think that either:

a. that feature requires a more modern GPU, or

b. my little Xeon 3 CPU can’t keep up with the data coming from the GPU when it uses the full 12 Gbytes (a total guess here; that is, if I could suddenly drop in a more powerful CPU, keeping all else the same, would the miner and the ergo client be able to stay in sync?)

thanks for responding….I had given up hope there...
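The config change described above can be sketched as a small script. This assumes “keepPrehash” is stored as a JSON boolean in config.json, which the thread does not confirm, and the function name is hypothetical.

```python
import json


def set_keep_prehash(config_text, enabled):
    """Return the config JSON text with keepPrehash set to the given value."""
    config = json.loads(config_text)
    # Assumption: the miner reads this key as a JSON boolean.
    config["keepPrehash"] = bool(enabled)
    return json.dumps(config, indent=4)


if __name__ == "__main__":
    # Hypothetical stand-in for the contents of the miner's config.json.
    sample = '{"keepPrehash": true}'
    print(set_keep_prehash(sample, False))
```

All other keys in the file pass through unchanged, so the same helper can switch the option back on later.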

bjenkinsgit avatar Oct 09 '19 15:10 bjenkinsgit

  1. The miner log is not your node log; it is written separately. Please copy it here.

the DETAIL info on the 500 error says “requirement failed: Incorrect points”

This means that the miner found a solution for the wrong block data. Check whether your node is in sync (compare the node's HTTP info output with a block explorer).

WARN [ctor.default-dispatcher-xxx] o.e.local.ErgoMiner - Failed to produce candidate block with a java error of: java.lang.Error: Trying to generate proof for empty transaction sequence….

This also signals a node problem, not a miner one.
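The sync check suggested above could be scripted roughly as follows. This is a sketch under assumptions: that the node serves its info at `http://127.0.0.1:9053/info`, and that the JSON response has a `fullHeight` field. Neither is stated in this thread, so adjust them to your node. The explorer height is read by hand from a block explorer and passed in.

```python
import json
import urllib.request

# Assumed local node API address; change it to match your node's settings.
NODE_INFO_URL = "http://127.0.0.1:9053/info"


def fetch_node_info(url=NODE_INFO_URL, timeout=5):
    """Fetch the node's info document as a dict."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def blocks_behind(node_info, explorer_height):
    """Return how many blocks the node lags the explorer, or None if unknown."""
    full_height = node_info.get("fullHeight")  # assumed field name
    if full_height is None:
        # The node has not reported a full-block height yet.
        return None
    return max(0, explorer_height - full_height)
```

A result of 0 suggests the node is caught up, while a large number or `None` points to the node still syncing.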

  1. You can check the GPU memory load with the nvidia-smi tool.
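For scripting that check, nvidia-smi can print selected fields as CSV. A minimal sketch, assuming nvidia-smi is on the PATH and reports numeric memory values (some setups print a placeholder such as "[N/A]", which this parser does not handle). The function names are hypothetical.

```python
import subprocess

# Ask nvidia-smi for one CSV row per GPU, without header or units.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Parse nvidia-smi CSV rows into dicts with memory values in MiB."""
    rows = []
    for line in csv_text.strip().splitlines():
        index, name, used, total = [field.strip() for field in line.split(",")]
        rows.append({
            "index": int(index),
            "name": name,
            "used_mib": int(used),
            "total_mib": int(total),
        })
    return rows


def read_gpu_memory():
    """Run nvidia-smi and return the parsed per-GPU memory usage."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_memory(result.stdout)
```

High `used_mib` before the miner starts would indicate that another process, such as a game left running, is still holding GPU memory.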

rsmmnt avatar Oct 09 '19 20:10 rsmmnt