pyguppyclient

Error while running Guppy 4.4.1 in GPU mode

Open sanaek opened this issue 4 years ago • 9 comments

Hi, I'm currently trying to run Guppy on a cluster in GPU mode and I get the following error:

[guppy/error] main: CuBLAS error at /builds/ofan/ont_core_cpp/ont_core/basecall_core/nn_caller_cuda.cpp:1138: 1
[guppy/warning] main: An error has occurred. Aborting

I'm using the following command:

guppy_basecaller --input_path /fast5_9a --save_path fastq -r -x "auto" --compress_fastq -c dna_r9.4.1_450bps_fast.cfg

Can anyone help me with this?

Thank you

sanaek avatar Jan 15 '21 12:01 sanaek

Same error here, from the command line.

noncodo avatar Jan 22 '21 20:01 noncodo

I was running CUDA version 10.2. When updating via the ONT Debian repo, this warning pops up:

The following packages have unmet dependencies:
 ont-guppy : Depends: libcuda-11.1-1 or libcuda1 (>= 455)

Looks like I need to update CUDA :'(

noncodo avatar Jan 22 '21 20:01 noncodo

Yes, but you can install CUDA 11.1 alongside CUDA 10.2.

iiSeymour avatar Jan 24 '21 23:01 iiSeymour

Hi, I have the same problem in GPU mode with these versions:

guppy 4.4.1 (working well alone)
megalodon 2.2.9
ont_pyguppy_client_lib 0.0.9

The HPC admins here won't have libCUDA11 installed before March. Is that really what is needed to solve the problem? @noncodo, did it work for you?

Thanks in advance

ginolhac avatar Jan 29 '21 12:01 ginolhac

Not feeling brave enough to install a local version of CUDA on the HPC... will also wait for the admin to do the upgrade...

TristanLefebure avatar Feb 10 '21 08:02 TristanLefebure

Hi, has anyone succeeded in installing CUDA v11.1 without updating the drivers? Our sysadmins updated CUDA (the new version is loaded via module load):

$ nvcc --version
Build cuda_11.1.TC455_06.29190527_0

but guppy 4.4+ still fails with an error:

[guppy/error] main: CuBLAS error at /builds/ofan/ont_core_cpp/ont_core/basecall_core/nn_caller_cuda.cpp:1138: 1
[guppy/warning] main: An error occurred initialising the basecaller. Aborting.

nvidia-smi still shows the old CUDA version...

$ nvidia-smi
| NVIDIA-SMI 440.82       Driver Version: 440.82       CUDA Version: 10.2     |

Does the new CUDA have to be bound to the drivers somehow? Or is this related to module load?

lpryszcz avatar Mar 22 '21 13:03 lpryszcz

We were seeing the same issue with guppy Version 4.5.2+bcc53d3. We updated the NVIDIA driver (nvidia-smi now reports NVIDIA-SMI 460.67, Driver Version: 460.67, CUDA Version: 11.2) and all seems well. You'll still need cuda_11 through your module load.

tobydarling avatar Mar 29 '21 12:03 tobydarling

nvidia-smi shows the highest CUDA version that the installed driver is capable of running. The CUDA version actually installed is the one reported by the compiler (nvcc), the nvidia-cuda-toolkit package, etc.

If you wish to use the new driver with CUDA 11.* support:

apt install nvidia-driver-460

For the CUDA toolkit itself (for example, in case your package names vary):

apt search cuda-toolkit
apt install nvidia-cuda-toolkit cuda-toolkit-11-2

$ nvidia-smi | head -3
Mon Apr 12 15:54:29 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.56 Driver Version: 460.56 CUDA Version: 11.2 |

To make it clearer: you can run CUDA 10 on an NVIDIA 460.56 driver (which supports CUDA versions up to 11.2).

markwdalton avatar Apr 12 '21 20:04 markwdalton
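The driver-versus-toolkit distinction described above can be checked with a short script. The following is only a minimal sketch: the function names are hypothetical, and the minimum of 455 is taken from the apt dependency message quoted earlier in this thread (`libcuda1 (>= 455)`), not from any official Guppy requirement table. It parses the `nvidia-smi` banner rather than relying on `nvcc`, because `nvcc --version` only reports the toolkit loaded in your environment.

```shell
#!/bin/sh
# Assumption: minimum driver major version, taken from the apt message
# quoted in this thread ("libcuda1 (>= 455)").
MIN_DRIVER_MAJOR=455

# Read nvidia-smi banner text on stdin, print the driver version (e.g. 440.82).
driver_version() {
    sed -n 's/.*Driver Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Read nvidia-smi banner text on stdin, print the highest CUDA version
# the driver supports (e.g. 10.2).
driver_cuda_version() {
    sed -n 's/.*CUDA Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Succeed if the major part of the given driver version is >= MIN_DRIVER_MAJOR.
driver_is_new_enough() {
    major=${1%%.*}
    [ -n "$major" ] && [ "$major" -ge "$MIN_DRIVER_MAJOR" ]
}

# Only query the GPU when nvidia-smi is actually on the PATH.
if command -v nvidia-smi >/dev/null 2>&1; then
    banner=$(nvidia-smi)
    drv=$(printf '%s\n' "$banner" | driver_version)
    cuda=$(printf '%s\n' "$banner" | driver_cuda_version)
    echo "driver $drv supports CUDA up to $cuda"
    if driver_is_new_enough "$drv"; then
        echo "driver meets the >= $MIN_DRIVER_MAJOR requirement"
    else
        echo "driver is older than $MIN_DRIVER_MAJOR; a driver update is probably needed"
    fi
fi
```

Run it on the GPU node itself (for example inside an interactive job), since login nodes on a cluster often have no driver installed.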

Has this worked for anyone? @markwdalton I tried updating the drivers to 460.67 and CUDA 11.2, and I am still getting the same error. If you are using Docker, can you please share your Dockerfile? I have no trouble running Guppy outside of Docker, but inside Docker it throws this error:

CuBLAS error at /builds/ofan/ont_core_cpp/ont_core/basecall_core/nn_caller_cuda.cpp:163: 13

Dtamiev avatar Apr 16 '21 14:04 Dtamiev
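The two error lines quoted in this thread end in different numbers (`1` and `13`). A possible way to read them, sketched below, assumes that the trailing number is the raw `cublasStatus_t` value returned by cuBLAS; the thread does not confirm this, so treat it as an assumption. The helper names are hypothetical; the status values are those listed in the cuBLAS documentation. Under that assumption, `1` would be `CUBLAS_STATUS_NOT_INITIALIZED` and `13` would be `CUBLAS_STATUS_EXECUTION_FAILED`, which suggests the Docker failure may have a different cause than the original report.

```shell
#!/bin/sh
# Map a cublasStatus_t integer to its symbolic name.
cublas_status_name() {
    case "$1" in
        0)  echo CUBLAS_STATUS_SUCCESS ;;
        1)  echo CUBLAS_STATUS_NOT_INITIALIZED ;;
        3)  echo CUBLAS_STATUS_ALLOC_FAILED ;;
        7)  echo CUBLAS_STATUS_INVALID_VALUE ;;
        8)  echo CUBLAS_STATUS_ARCH_MISMATCH ;;
        11) echo CUBLAS_STATUS_MAPPING_ERROR ;;
        13) echo CUBLAS_STATUS_EXECUTION_FAILED ;;
        14) echo CUBLAS_STATUS_INTERNAL_ERROR ;;
        15) echo CUBLAS_STATUS_NOT_SUPPORTED ;;
        16) echo CUBLAS_STATUS_LICENSE_ERROR ;;
        *)  echo UNKNOWN ;;
    esac
}

# Read Guppy log text on stdin and print the number that follows
# "CuBLAS error at <file>:<line>:" on the first matching line.
# Assumption: that number is the cublasStatus_t value.
cublas_code_from_log() {
    sed -n 's/.*CuBLAS error at .*:[0-9][0-9]*: *\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Example: decode the error line reported from inside Docker.
code=$(echo 'CuBLAS error at /builds/ofan/ont_core_cpp/ont_core/basecall_core/nn_caller_cuda.cpp:163: 13' | cublas_code_from_log)
echo "status $code = $(cublas_status_name "$code")"
```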