LLamaSharp
[BUG]: WSL2 has problems running LLamaSharp with CUDA 11
Description
When running the examples, I got the following error:
ggml_cuda_compute_forward: RMS_NORM failed
CUDA error: the provided PTX was compiled with an unsupported toolchain.
current device: 0, in function ggml_cuda_compute_forward at /home/runner/work/LLamaSharp/LLamaSharp/ggml-cuda.cu:2305
err
GGML_ASSERT: /home/runner/work/LLamaSharp/LLamaSharp/ggml-cuda.cu:61: !"CUDA error"
[New LWP 32766]
[New LWP 32767]
[New LWP 32768]
[New LWP 32769]
[New LWP 32770]
[New LWP 32772]
[New LWP 32773]
[New LWP 32775]
[New LWP 33664]
[New LWP 33665]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007f3add384c7f in __GI___wait4 (pid=33743, stat_loc=0x0, options=0, usage=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:27
27 ../sysdeps/unix/sysv/linux/wait4.c: No such file or directory.
#0 0x00007f3add384c7f in __GI___wait4 (pid=33743, stat_loc=0x0, options=0, usage=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:27
27 in ../sysdeps/unix/sysv/linux/wait4.c
#1 0x00007ef9aa28cf8b in ggml_print_backtrace () from /home/rinne/code/csharp/LLamaSharp/LLama.Examples/bin/Debug/net7.0/runtimes/linux-x64/native/cuda12/libllama.so
#2 0x00007ef9aa392892 in ggml_cuda_error(char const*, char const*, char const*, int, char const*) () from /home/rinne/code/csharp/LLamaSharp/LLama.Examples/bin/Debug/net7.0/runtimes/linux-x64/native/cuda12/libllama.so
#3 0x00007ef9aa39d7e0 in ggml_backend_cuda_graph_compute(ggml_backend*, ggml_cgraph*) () from /home/rinne/code/csharp/LLamaSharp/LLama.Examples/bin/Debug/net7.0/runtimes/linux-x64/native/cuda12/libllama.so
#4 0x00007ef9aa2e167b in ggml_backend_sched_graph_compute_async () from /home/rinne/code/csharp/LLamaSharp/LLama.Examples/bin/Debug/net7.0/runtimes/linux-x64/native/cuda12/libllama.so
#5 0x00007ef9aa1e7ca0 in llama_decode () from /home/rinne/code/csharp/LLamaSharp/LLama.Examples/bin/Debug/net7.0/runtimes/linux-x64/native/cuda12/libllama.so
#6 0x00007f3a5f4b518e in ?? ()
#7 0x000000000000007b in ?? ()
#8 0x00007efa4a95a900 in ?? ()
#9 0x0000000000000000 in ?? ()
[Inferior 1 (process 32765) detached]
Reproduction Steps
Run the examples from the master branch in WSL2 (Ubuntu 22.04) with CUDA 11.2 installed.
Environment & Configuration
- Operating system: WSL2 (Ubuntu 22.04)
- .NET runtime version: 7.0
- LLamaSharp version: master
- CUDA version (if you are using cuda backend): 11.2
- CPU & GPU device: Intel i7-12700 + NVIDIA RTX 3060 Ti
Known Workarounds
No response
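One detail from the backtrace that may help triage: frame #1 shows libllama.so being loaded from the cuda12 runtime folder, while the environment above lists CUDA 11.2 installed. That is the kind of mismatch that typically produces "the provided PTX was compiled with an unsupported toolchain", since PTX emitted by a CUDA 12 toolchain may not be accepted by an older CUDA 11.2 installation. A minimal sketch of that observation (the path is copied from frame #1 of the backtrace; this is not a general detection mechanism):

```shell
# Which CUDA backend did the loader actually pick? The folder name in the
# loaded library's path indicates the CUDA version it was built against.
lib_path="/home/rinne/code/csharp/LLamaSharp/LLama.Examples/bin/Debug/net7.0/runtimes/linux-x64/native/cuda12/libllama.so"
case "$lib_path" in
  */cuda12/*) echo "cuda12 backend loaded" ;;
  */cuda11/*) echo "cuda11 backend loaded" ;;
  *)          echo "cpu or unknown backend" ;;
esac
```

If the cuda12 backend is being resolved on a CUDA 11.2 machine, forcing the cuda11 native runtime (or installing a matching CUDA 12 toolkit) would be the first thing to try.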