PowerInfer

no CUDA-capable device is detected

Open jasonmhead opened this issue 1 year ago • 4 comments

Tried to run inference on WSL with

./build/bin/main -m ./ReluFalcon-40B-PowerInfer-GGUF/falcon-40b-relu.q4.powerinfer.gguf -n 128 -t 8 -p "Once upon a time"

and got:

no CUDA-capable device is detected
current device: 231936

Any suggestions on how to fix this?

I have two NVIDIA cards in the computer: a GeForce RTX 2070 and a Tesla M40 24GB.

jasonmhead, Dec 21 '23 18:12

Same error for me, using the 13B model on WSL with one RTX 3080 Ti (12 GB).

dixan51, Dec 22 '23 00:12

This is not necessarily a PowerInfer issue. Can you check whether nvidia-smi is working properly and detecting the GPU card(s)?
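If it helps, here is a minimal Python sketch of that check. It assumes nvidia-smi is on the PATH and relies on its -L flag, which prints one line per detected GPU. The function name list_gpus is made up for this example and is not part of PowerInfer.

```python
import shutil
import subprocess


def list_gpus(smi="nvidia-smi"):
    """Return the GPU lines printed by `nvidia-smi -L`, or [] if unavailable."""
    exe = shutil.which(smi)
    if exe is None:
        # nvidia-smi is not installed or not on the PATH.
        return []
    try:
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        # nvidia-smi ran but could not talk to the driver.
        return []
    # Keep only the per-GPU lines, which start with "GPU <index>:".
    return [
        line.strip()
        for line in result.stdout.splitlines()
        if line.strip().startswith("GPU ")
    ]


if __name__ == "__main__":
    gpus = list_gpus()
    if gpus:
        print("\n".join(gpus))
    else:
        print("nvidia-smi found no GPUs (or is not available)")
```

An empty list points to a driver or WSL setup problem rather than anything in PowerInfer. A non-empty list only shows that the driver sees the card, not that CUDA applications can use it.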

samikrc, Dec 24 '23 15:12

I am experiencing this same issue on WSL with an RTX 2070 SUPER. This is how I am building and running the project, with the error at the end:

tjf801@DESKTOP:~/PowerInfer$ cmake -S . -B build -DLLAMA_CUBLAS=ON
-- cuBLAS found
-- Using CUDA architectures: 52;61;70
GNU ld (GNU Binutils for Ubuntu) 2.38
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- x86 detected
-- Configuring done
-- Generating done
-- Build files have been written to: /home/tjf801/PowerInfer/build
tjf801@DESKTOP:~/PowerInfer$ cmake --build build --config Release
[output omitted for sake of brevity]
tjf801@DESKTOP:~/PowerInfer$ ./build/bin/main -m ../ReluLLaMA-7B-PowerInfer-GGUF/llama-7b-relu.powerinfer.gguf -n 64 -t 12 -p "Once upon a time, "
Log start
main: build = 1556 (74c5c58)
main: built with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu
main: seed  = 1704065009
llama_model_loader: loaded meta data with 18 key-value pairs and 355 tensors from ../ReluLLaMA-7B-PowerInfer-GGUF/llama-7b-relu.powerinfer.gguf (version GGUF V3 (latest))
llama_model_loader: [output omitted]
llama_model_load: PowerInfer model loaded. Sparse inference will be used.
llm_load_vocab: special tokens definition check successful ( 259/32000 ).
llm_load_print_meta: [output omitted]
llm_load_print_meta: sparse_pred_threshold = 0.00
llama_model_load: sparse inference - vram budget = -1.00 GB
llm_load_sparse_model_tensors: ggml ctx size =    0.13 MB

CUDA error 100 at /home/tjf801/PowerInfer/ggml-cuda.cu:9340: no CUDA-capable device is detected
current device: 136320

nvidia-smi is working properly and does detect the card. It produces the following output:

tjf801@DESKTOP:~/PowerInfer$ nvidia-smi
Sun Dec 31 18:21:04 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.36                 Driver Version: 546.33       CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 2070 ...    On  | 00000000:01:00.0  On |                  N/A |
| 25%   29C    P8              14W / 215W |    799MiB /  8192MiB |      2%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A        32      G   /Xwayland                                 N/A      |
+---------------------------------------------------------------------------------------+
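Since nvidia-smi sees the card but the program still fails with CUDA error 100 (cudaErrorNoDevice), it may be worth testing the CUDA driver library directly. nvidia-smi goes through NVML, while CUDA programs load libcuda, so one can work while the other does not. Below is a rough Python sketch that calls the driver API functions cuInit and cuDeviceGetCount via ctypes. The helper name cuda_driver_device_count and its status strings are invented for this example.

```python
import ctypes


def cuda_driver_device_count(lib_names=("libcuda.so.1", "libcuda.so")):
    """Return (status, count) from the CUDA driver API; count is None on failure."""
    lib = None
    for name in lib_names:
        try:
            lib = ctypes.CDLL(name)
            break
        except OSError:
            continue
    if lib is None:
        # The dynamic loader could not find any of the candidate libraries.
        return ("libcuda not loadable", None)

    # cuInit(0) must succeed before any other driver API call.
    rc = lib.cuInit(0)
    if rc != 0:
        # 100 is CUDA_ERROR_NO_DEVICE, the driver-side twin of runtime error 100.
        return (f"cuInit failed with CUresult {rc}", None)

    count = ctypes.c_int(0)
    rc = lib.cuDeviceGetCount(ctypes.byref(count))
    if rc != 0:
        return (f"cuDeviceGetCount failed with CUresult {rc}", None)
    return ("ok", count.value)


if __name__ == "__main__":
    print(cuda_driver_device_count())
```

If this reports a failure while nvidia-smi lists the GPU, the problem is most likely in which libcuda gets loaded inside WSL, not in the PowerInfer build.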

tjf801, Dec 31 '23 23:12

I solved it. I just needed to reinstall CUDA.
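Before or after reinstalling, it can be useful to see which copies of libcuda exist on the system. The Python sketch below lists them. The directory list is an assumption: /usr/lib/wsl/lib is where WSL exposes the Windows driver's libcuda, and the other two are common locations on Ubuntu that may or may not exist on a given machine. Whether a conflicting copy was the cause in this thread is not confirmed. The function name find_libcuda is made up for this example.

```python
import glob
import os

# Assumed search locations; adjust for your distribution and CUDA install.
DEFAULT_DIRS = (
    "/usr/lib/wsl/lib",              # driver library provided by WSL
    "/usr/lib/x86_64-linux-gnu",     # where a Linux-side driver package would land
    "/usr/local/cuda/lib64/stubs",   # link-time stub shipped with the toolkit
)


def find_libcuda(search_dirs=DEFAULT_DIRS):
    """Return every libcuda.so* path found in the given directories."""
    found = []
    for directory in search_dirs:
        pattern = os.path.join(directory, "libcuda.so*")
        found.extend(sorted(glob.glob(pattern)))
    return found


if __name__ == "__main__":
    for path in find_libcuda():
        print(path)
```

NVIDIA's WSL guidance is to install only the CUDA toolkit inside WSL and not a Linux display driver, so a libcuda outside /usr/lib/wsl/lib (other than the toolkit stub) is worth a closer look.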

NerounCstate, Mar 03 '24 11:03