LLamaSharp
[BUG]: WSL2 has problems running LLamaSharp with CUDA 11
Description
When running the examples, I got the following error:
ggml_cuda_compute_forward: RMS_NORM failed
CUDA error: the provided PTX was compiled with an unsupported toolchain.
current device: 0, in function ggml_cuda_compute_forward at /home/runner/work/LLamaSharp/LLamaSharp/ggml-cuda.cu:2305
err
GGML_ASSERT: /home/runner/work/LLamaSharp/LLamaSharp/ggml-cuda.cu:61: !"CUDA error"
[New LWP 32766]
[New LWP 32767]
[New LWP 32768]
[New LWP 32769]
[New LWP 32770]
[New LWP 32772]
[New LWP 32773]
[New LWP 32775]
[New LWP 33664]
[New LWP 33665]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007f3add384c7f in __GI___wait4 (pid=33743, stat_loc=0x0, options=0, usage=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:27
27 ../sysdeps/unix/sysv/linux/wait4.c: No such file or directory.
#0 0x00007f3add384c7f in __GI___wait4 (pid=33743, stat_loc=0x0, options=0, usage=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:27
27 in ../sysdeps/unix/sysv/linux/wait4.c
#1 0x00007ef9aa28cf8b in ggml_print_backtrace () from /home/rinne/code/csharp/LLamaSharp/LLama.Examples/bin/Debug/net7.0/runtimes/linux-x64/native/cuda12/libllama.so
#2 0x00007ef9aa392892 in ggml_cuda_error(char const*, char const*, char const*, int, char const*) () from /home/rinne/code/csharp/LLamaSharp/LLama.Examples/bin/Debug/net7.0/runtimes/linux-x64/native/cuda12/libllama.so
#3 0x00007ef9aa39d7e0 in ggml_backend_cuda_graph_compute(ggml_backend*, ggml_cgraph*) () from /home/rinne/code/csharp/LLamaSharp/LLama.Examples/bin/Debug/net7.0/runtimes/linux-x64/native/cuda12/libllama.so
#4 0x00007ef9aa2e167b in ggml_backend_sched_graph_compute_async () from /home/rinne/code/csharp/LLamaSharp/LLama.Examples/bin/Debug/net7.0/runtimes/linux-x64/native/cuda12/libllama.so
#5 0x00007ef9aa1e7ca0 in llama_decode () from /home/rinne/code/csharp/LLamaSharp/LLama.Examples/bin/Debug/net7.0/runtimes/linux-x64/native/cuda12/libllama.so
#6 0x00007f3a5f4b518e in ?? ()
#7 0x000000000000007b in ?? ()
#8 0x00007efa4a95a900 in ?? ()
#9 0x0000000000000000 in ?? ()
[Inferior 1 (process 32765) detached]
Reproduction Steps
Run the examples from the master branch in WSL2 (Ubuntu 22.04) with CUDA 11.2 installed.
Environment & Configuration
- Operating system: WSL2 (Ubuntu 22.04)
- .NET runtime version: 7.0
- LLamaSharp version: master
- CUDA version (if you are using cuda backend): 11.2
- CPU & GPU device: Intel i7-12700 + NVIDIA RTX 3060 Ti
Known Workarounds
No response
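One detail from the backtrace that may help triage: frame #1 shows libllama.so being loaded from the cuda12 runtime folder, while the environment above lists CUDA 11.2 installed. That is the kind of mismatch that typically produces "the provided PTX was compiled with an unsupported toolchain", since PTX emitted by a CUDA 12 toolchain may not be accepted by an older CUDA 11.2 installation. A minimal sketch of that observation (the path is copied from frame #1 of the backtrace; this is not a general detection mechanism):

```shell
# Which CUDA backend did the loader actually pick? The folder name in the
# loaded library's path indicates the CUDA version it was built against.
lib_path="/home/rinne/code/csharp/LLamaSharp/LLama.Examples/bin/Debug/net7.0/runtimes/linux-x64/native/cuda12/libllama.so"
case "$lib_path" in
  */cuda12/*) echo "cuda12 backend loaded" ;;
  */cuda11/*) echo "cuda11 backend loaded" ;;
  *)          echo "cpu or unknown backend" ;;
esac
```

If the cuda12 backend is being resolved on a CUDA 11.2 machine, forcing the cuda11 native runtime (or installing a matching CUDA 12 toolkit) would be the first thing to try.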