PowerInfer
no CUDA-capable device is detected
Tried to run inference on WSL with ./build/bin/main -m ./ReluFalcon-40B-PowerInfer-GGUF/falcon-40b-relu.q4.powerinfer.gguf -n 128 -t 8 -p "Once upon a time" and got "no CUDA-capable device is detected, current device: 231936".
Any suggestions on how to fix this?
I have two NVIDIA cards in the computer: a GeForce RTX 2070 and a Tesla M40 24GB.
Same error for me, with a 13B model on WSL and a single RTX 3080 Ti (12 GB).
This is not necessarily a PowerInfer issue. Can you check whether nvidia-smi is working properly and detecting the GPU card(s)?
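Beyond nvidia-smi, it can also help to check whether the CUDA driver library itself is loadable from inside WSL. Below is a minimal diagnostic sketch of my own (not part of PowerInfer) that uses Python's ctypes to call the standard driver-API entry points cuInit and cuDeviceGetCount; if libcuda.so.1 cannot even be found, the problem is the WSL driver setup rather than PowerInfer:

```python
import ctypes

def cuda_device_count():
    """Return the number of CUDA devices, -1 if cuInit fails,
    or None if the driver library is not visible at all."""
    try:
        # Under WSL this library comes from the Windows driver,
        # typically exposed via /usr/lib/wsl/lib.
        libcuda = ctypes.CDLL("libcuda.so.1")
    except OSError:
        return None  # driver library not found: a WSL/driver setup issue
    if libcuda.cuInit(0) != 0:  # 0 == CUDA_SUCCESS
        return -1  # library present, but initialization failed
    count = ctypes.c_int(0)
    libcuda.cuDeviceGetCount(ctypes.byref(count))
    return count.value

print(cuda_device_count())
```

If this prints None, rebuilding PowerInfer won't help; the driver library visibility has to be fixed first.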
I am experiencing this same issue on WSL with an RTX 2070 SUPER. This is how I am building and running the project, with the error at the end:
tjf801@DESKTOP:~/PowerInfer$ cmake -S . -B build -DLLAMA_CUBLAT=ON
-- cuBLAS found
-- Using CUDA architectures: 52;61;70
GNU ld (GNU Binutils for Ubuntu) 2.38
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- x86 detected
-- Configuring done
-- Generating done
-- Build files have been written to: /home/tjf801/PowerInfer/build
tjf801@DESKTOP:~/PowerInfer$ cmake --build build --config Release
[output omitted for sake of brevity]
tjf801@DESKTOP:~/PowerInfer$ ./build/bin/main -m ../ReluLLaMA-7B-PowerInfer-GGUF/llama-7b-relu.powerinfer.gguf -n 64 -t 12 -p "Once upon a time, "
Log start
main: build = 1556 (74c5c58)
main: built with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu
main: seed = 1704065009
llama_model_loader: loaded meta data with 18 key-value pairs and 355 tensors from ../ReluLLaMA-7B-PowerInfer-GGUF/llama-7b-relu.powerinfer.gguf (version GGUF V3 (latest))
llama_model_loader: [output omitted]
llama_model_load: PowerInfer model loaded. Sparse inference will be used.
llm_load_vocab: special tokens definition check successful ( 259/32000 ).
llm_load_print_meta: [output omitted]
llm_load_print_meta: sparse_pred_threshold = 0.00
llama_model_load: sparse inference - vram budget = -1.00 GB
llm_load_sparse_model_tensors: ggml ctx size = 0.13 MB
CUDA error 100 at /home/tjf801/PowerInfer/ggml-cuda.cu:9340: no CUDA-capable device is detected
current device: 136320
nvidia-smi does detect the card and appears to be working properly; it produces the following output:
tjf801@DESKTOP:~/PowerInfer$ nvidia-smi
Sun Dec 31 18:21:04 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.36 Driver Version: 546.33 CUDA Version: 12.3 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce RTX 2070 ... On | 00000000:01:00.0 On | N/A |
| 25% 29C P8 14W / 215W | 799MiB / 8192MiB | 2% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| 0 N/A N/A 32 G /Xwayland N/A |
+---------------------------------------------------------------------------------------+
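Since nvidia-smi works but the CUDA runtime still reports no device, one thing worth checking (a suggestion based on common WSL setups, not something diagnosed in this thread) is whether libcuda is actually visible to the dynamic linker; under WSL the driver library is normally exposed through /usr/lib/wsl/lib:

```shell
# Look for the CUDA driver library in the linker cache; under WSL it is
# normally provided by the Windows driver via /usr/lib/wsl/lib.
ldconfig -p | grep libcuda || echo "libcuda not in linker cache"
ls /usr/lib/wsl/lib 2>/dev/null || echo "/usr/lib/wsl/lib not present"
```

If libcuda is missing from the cache, refreshing it with sudo ldconfig or reinstalling CUDA are the usual remedies.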
I solved it. Simply reinstalling CUDA fixed it.