llamafile
Error when using -ngl > 0
Hi, I'm able to run on the CPU, but when I try to run on the GPU using:
$ ./llamafile.exe -m Meta-Llama-3-70B-Instruct.Q4_0.llamafile -ngl 999 --port 7777
I get the following crash and error message:
initializing gpu module...
dynamically linking /C/Users/BoylJos/AppData/Local/Temp/6/.llamafile/ggml-rocm.dll
library not found: failed to load library
dynamically linking /C/Users/BoylJos/AppData/Local/Temp/6/.llamafile/ggml-cuda.dll
GPU support successfully linked and loaded
error: Uncaught SIGSEGV (SEGV_-1073741784) at 0x7ffc1766660d on XB-EHSAI-1 pid 19752 tid 23984
/E/ai/llm/llamafile.exe
No such file or directory
Windows Cosmopolitan 3.2.4 MODE=x86_64 XB-EHSAI-1 10.0-20348
RAX 0000000000000000 RBX 00007000007d8301 RDI 0000000000000000
RCX 0000000000000000 RDX 0000000000000000 RSI 00007000007d83d0
RBP 00007000007d81d0 RSP 00007000007d7bd0 RIP 00007ffc1766660d
R8 0000000000000000 R9 0000000000020000 R10 00007000007d6e60
R11 00007000007d6e60 R12 00007ffc0e4c5e96 R13 0000000000000000
R14 00007000007d8940 R15 0000000000000000
TLS 0000000000719640
XMM0 00000028000000280000000000000000 XMM8 00000000000000000000000000000000
XMM1 00000000000000000000000000000001 XMM9 00000000000000000000000000000000
XMM2 00000000000000000000000000000000 XMM10 00000000000000000000000000000000
XMM3 00007000007d70300000000000800000 XMM11 00000000000000000000000000000000
XMM4 00000000000000000000000000000000 XMM12 00000000000000000000000000000000
XMM5 00007000000000280000000000000000 XMM13 00000000000000000000000000000000
XMM6 0073006f004a006c0079006f0042005c XMM14 00000000000000000000000000000000
XMM7 00730072006500730055005c003a0043 XMM15 00000000000000000000000000000000
cosmoaddr2line /E/ai/llm/llamafile.exe 7ffc1766660d 7ffc1758cb71
0000007ee5e0 7ffc1766660d NULL+0
7000007d81d0 7ffc1758cb71 NULL+0
I'm using v0.6 of the project, if that helps.
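The crash report prints the signal code as a signed decimal (`SEGV_-1073741784`). Reinterpreting that number as an unsigned 32-bit value gives the Windows exception code in its usual hex form, which is easier to look up. A minimal Python sketch; the helper name `ntstatus_hex` is made up for illustration and is not part of llamafile or Cosmopolitan.

```python
def ntstatus_hex(si_code: int) -> str:
    """Reinterpret a signed 32-bit code as unsigned and format it as hex."""
    return f"0x{si_code & 0xFFFFFFFF:08X}"


# Value taken from the "Uncaught SIGSEGV (SEGV_-1073741784)" line above.
print(ntstatus_hex(-1073741784))  # prints 0xC0000028
```

Microsoft's NTSTATUS table lists 0xC0000028 as STATUS_BAD_STACK (an invalid or unaligned stack encountered during an unwind operation), so the fault may come from stack unwinding rather than from an ordinary bad pointer dereference. That is only a hint, not a diagnosis.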
Please run:
./llamafile.exe -m Meta-Llama-3-70B-Instruct.Q4_0.llamafile -ngl 999 --port 7777 --strace
And copy / paste me the last 20 or so lines that happen before it crashes.
Next, run this command:
./llamafile.exe -m Meta-Llama-3-70B-Instruct.Q4_0.llamafile -ngl 999 --port 7777 --ftrace --strace
And copy / paste the last 20 or so lines it prints into this issue, and ideally attach the full log too.
Thanks! Sorry you're encountering this issue :'(
Also, I just have to ask, do you really have a graphics card on Windows with 50 GB of VRAM?
Hi @jart - I'm on the same team as @JosephSBoyle, so I'll be replying as well when they aren't available.
We are running a Quadro RTX 5000 with 16GB of dedicated memory, plus 128GB of RAM, and yes, on Windows.
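To put the VRAM question in rough numbers, here is a back-of-the-envelope Python sketch of how many layers of a Q4_0 70B model might fit in 16GB. Everything in it is an assumption for illustration: about 70.6 billion parameters, 80 transformer layers, Q4_0 stored as 18 bytes per block of 32 weights, weights spread evenly across layers, and an arbitrary 2 GiB reserve for the KV cache and scratch buffers. The function names are hypothetical, and real GGUF files differ somewhat because not every tensor uses the same quantization.

```python
import math


def q4_0_gib(n_params: float) -> float:
    """Approximate size of Q4_0 weights in GiB."""
    # One block = 32 weights: 16 bytes of 4-bit values + a 2-byte fp16 scale.
    bytes_per_weight = 18 / 32
    return n_params * bytes_per_weight / 2**30


def layers_that_fit(vram_gib: float, model_gib: float, n_layers: int,
                    reserve_gib: float = 2.0) -> int:
    """Estimate how many equally sized layers fit after reserving some VRAM."""
    per_layer = model_gib / n_layers
    fit = math.floor((vram_gib - reserve_gib) / per_layer)
    return max(0, min(n_layers, fit))


model_gib = q4_0_gib(70.6e9)  # assumed parameter count
print(f"weights: ~{model_gib:.0f} GiB")
print(f"layers that fit in 16 GiB: ~{layers_that_fit(16, model_gib, 80)}")
```

Under these assumptions the weights alone are far larger than 16GB, so only part of the model could be offloaded. This does not explain the crash, but a much smaller `-ngl` value may be worth trying as a cross-check.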
Here is the --strace output:
dynamically linking C:\Users\ClarBen2/.llamafile/ggml-cuda.dll
SYS 25728 25044 1'877'073'168 write(2, u"dynamically linking C:\\Users\\ClarBen2"..., 68) → 68 ENOENT
SYS 25728 25044 1'879'818'000 dlopen("C:\\Users\\ClarBen2/.llamafile/ggml-cuda.dll", 1) → 0x7ffb9e440000 ENOENT
SYS 25728 25044 1'879'927'005 dlsym(0x7ffb9e440000, "ggml_cuda_link") → 0x7ffb9e465d30
SYS 25728 25044 1'879'959'935 dlsym(0x7ffb9e440000, "ggml_init_cublas") → 0x7ffb9e4666d0
SYS 25728 25044 1'879'987'069 dlsym(0x7ffb9e440000, "ggml_cublas_loaded") → 0x7ffb9e464fa0
SYS 25728 25044 1'880'013'122 dlsym(0x7ffb9e440000, "ggml_cuda_host_free") → 0x7ffb9e465b50
SYS 25728 25044 1'880'039'062 dlsym(0x7ffb9e440000, "ggml_cuda_host_malloc") → 0x7ffb9e465c50
SYS 25728 25044 1'880'065'434 dlsym(0x7ffb9e440000, "ggml_cuda_can_mul_mat") → 0x7ffb9e4650f0
SYS 25728 25044 1'880'091'412 dlsym(0x7ffb9e440000, "ggml_cuda_set_tensor_split") → 0x7ffb9e465e80
SYS 25728 25044 1'880'117'834 dlsym(0x7ffb9e440000, "ggml_cuda_transform_tensor") → 0x7ffb9e4660b0
SYS 25728 25044 1'880'144'584 dlsym(0x7ffb9e440000, "ggml_cuda_free_data") → 0x7ffb9e465690
SYS 25728 25044 1'880'171'465 dlsym(0x7ffb9e440000, "ggml_cuda_assign_buffers") → 0x7ffb9e464fb0
SYS 25728 25044 1'880'197'696 dlsym(0x7ffb9e440000, "ggml_cuda_assign_buffers_no_scratch") → 0x7ffb9e464fe0
SYS 25728 25044 1'880'224'400 dlsym(0x7ffb9e440000, "ggml_cuda_assign_buffers_force_inplace") → 0x7ffb9e464fc0
SYS 25728 25044 1'880'251'886 dlsym(0x7ffb9e440000, "ggml_cuda_assign_buffers_no_alloc") → 0x7ffb9e464fd0
SYS 25728 25044 1'880'279'032 dlsym(0x7ffb9e440000, "ggml_cuda_assign_scratch_offset") → 0x7ffb9e464ff0
SYS 25728 25044 1'880'305'586 dlsym(0x7ffb9e440000, "ggml_cuda_copy_to_device") → 0x7ffb9e465540
SYS 25728 25044 1'880'331'526 dlsym(0x7ffb9e440000, "ggml_cuda_set_main_device") → 0x7ffb9e465d40
SYS 25728 25044 1'880'358'330 dlsym(0x7ffb9e440000, "ggml_cuda_set_scratch_size") → 0x7ffb9e465e40
SYS 25728 25044 1'880'385'050 dlsym(0x7ffb9e440000, "ggml_cuda_free_scratch") → 0x7ffb9e4658b0
SYS 25728 25044 1'880'411'198 dlsym(0x7ffb9e440000, "ggml_cuda_compute_forward") → 0x7ffb9e465190
SYS 25728 25044 1'880'437'776 dlsym(0x7ffb9e440000, "ggml_cuda_get_device_count") → 0x7ffb9e4659d0
SYS 25728 25044 1'880'464'342 dlsym(0x7ffb9e440000, "ggml_cuda_get_device_description") → 0x7ffb9e4659f0
SYS 25728 25044 1'880'491'582 dlsym(0x7ffb9e440000, "ggml_backend_reg_cuda_init") → 0x7ffb9e464f90
SYS 25728 25044 1'880'518'204 dlsym(0x7ffb9e440000, "ggml_backend_cuda_host_buffer_type") → 0x7ffb9e464dc0
SYS 25728 25044 1'880'544'752 dlsym(0x7ffb9e440000, "ggml_backend_cuda_buffer_type") → 0x7ffb9e464ae0
SYS 25728 25044 1'880'571'894 dlsym(0x7ffb9e440000, "ggml_backend_cuda_init") → 0x7ffb9e464e80
GPU support successfully linked and loaded
SYS 25728 25044 1'880'635'518 write(2, u"GPU support successfully linked and load"..., 43) → 43 ENOENT
SYS 25728 25044 1'895'807'158 win32 vectored exception 0xC0000028u raising SIGSEGV cosmoaddr2line /e/ai/llm/llamafile.exe 7ffc1766660d 7ffc1758cb71
SYS 25728 25044 1'895'927'856 resetting SIGSEGV handler
SYS 25728 25044 1'895'998'713 gethostname(["****"], 64) → 0 ENOENT
SYS 25728 25044 1'896'058'841 uname([{"Windows", "****", "10.0-20348", "Cosmopolitan 3.2.4 MODE=x86_64", "x86_64", "****"}]) → 0 ENOENT
error: Uncaught SIGSEGV (SEGV_-1073741784) at 0x7ffc1766660d on **** pid 25728 tid 25044
llamafile.exe
No such file or directory
Windows Cosmopolitan 3.2.4 MODE=x86_64 **** 10.0-20348
RAX 0000000000000000 RBX 00007000007d8301 RDI 0000000000000000
RCX 0000000000000000 RDX 0000000000000000 RSI 00007000007d83d0
RBP 00007000007d81d0 RSP 00007000007d7bd0 RIP 00007ffc1766660d
R8 0000000000000000 R9 0000000000020000 R10 00007000007d6e60
R11 00007000007d6e60 R12 00007ffc0e4c5e96 R13 0000000000000000
R14 00007000007d8940 R15 0000000000000000
TLS 0000000000719640
XMM0 00000028000000280000000000000000 XMM8 00000000000000000000000000000000
XMM1 00000000000000000000000000000001 XMM9 00000000000000000000000000000000
XMM2 00000000000000000000000000000000 XMM10 00000000000000000000000000000000
XMM3 00007000007d70300000000000800000 XMM11 00000000000000000000000000000000
XMM4 00000000000000000000000000000000 XMM12 00000000000000000000000000000000
XMM5 00007000000000280000000000000000 XMM13 00000000000000000000000000000000
XMM6 006e0065004200720061006c0043005c XMM14 00000000000000000000000000000000
XMM7 00730072006500730055005c003a0043 XMM15 00000000000000000000000000000000
cosmoaddr2line /e/ai/llm/llamafile.exe 7ffc1766660d 7ffc1758cb71
And the --ftrace output:
SYS 27568 22732 6'065'652'658 write(2, u"GPU support successfully linked and load"..., 43) → 43 ENOENT
FUN 27568 22732 6'065'680'558 14'296 &pthread_setcancelstate
FUN 27568 22732 6'081'232'534 416 &__sig_unmaskable
FUN 27568 22732 6'097'196'570 1'632 &FindDebugBinary
FUN 27568 22732 6'097'247'524 1'648 &cosmo_once
SYS 27568 22732 6'097'274'342 win32 vectored exception 0xC0000028u raising SIGSEGV cosmoaddr2line /e/ai/llm/llamafile.exe 7ffc1766660d 7ffc1758cb71
SYS 27568 22732 6'097'305'890 resetting SIGSEGV handler
FUN 27568 22732 6'097'331'046 1'632 &_ntcontext2linux
FUN 27568 22732 6'097'358'251 1'632 &__oncrash
FUN 27568 22732 6'097'383'663 1'680 &__errno_location
FUN 27568 22732 6'097'409'130 1'680 &IsDebuggerPresent
FUN 27568 22732 6'097'435'668 1'680 &__restore_tty
FUN 27568 22732 6'097'461'618 1'680 &ShowCrashReport
FUN 27568 22732 6'097'491'572 10'960 &gethostname
FUN 27568 22732 6'097'516'844 11'008 &gethostname_nt
FUN 27568 22732 6'097'545'528 11'824 &GetComputerNameEx
FUN 27568 22732 6'097'615'336 11'824 &tprecode16to8
FUN 27568 22732 6'097'641'945 11'904 &_tpenc
FUN 27568 22732 6'097'666'851 11'824 &memccpy
SYS 27568 22732 6'097'692'090 gethostname(["****"], 64) → 0 ENOENT
FUN 27568 22732 6'097'720'818 10'960 &uname
FUN 27568 22732 6'097'746'406 11'536 &__FormatUint32
FUN 27568 22732 6'097'771'712 11'536 &GetComputerNameEx
FUN 27568 22732 6'097'813'014 11'536 &tprecode16to8
FUN 27568 22732 6'097'838'758 11'616 &_tpenc
FUN 27568 22732 6'097'863'384 11'536 &GetComputerNameEx
FUN 27568 22732 6'097'907'488 11'536 &tprecode16to8
FUN 27568 22732 6'097'932'873 11'616 &_tpenc
FUN 27568 22732 6'097'957'875 11'536 &strlcat
FUN 27568 22732 6'097'983'017 11'536 &strlcpy
SYS 27568 22732 6'098'007'884 uname([{"Windows", "****", "10.0-20348", "Cosmopolitan 3.2.4 MODE=x86_64", "x86_64", "****"}]) → 0 ENOENT
FUN 27568 22732 6'098'038'743 10'960 &__errno_location
FUN 27568 22732 6'098'147'101 10'960 &strerror
FUN 27568 22732 6'098'193'570 10'976 &strerror_r
FUN 27568 22732 6'098'220'979 11'056 &GetLastError
FUN 27568 22732 6'098'248'082 10'960 &gettid
FUN 27568 22732 6'098'273'601 10'960 &getpid
FUN 27568 22732 6'098'299'684 10'960 &__is_stack_overflow
FUN 27568 22732 6'098'326'616 10'992 &getauxval
FUN 27568 22732 6'098'351'991 11'024 &__getauxval
FUN 27568 22732 6'098'378'499 10'992 &DescribeSiCode
FUN 27568 22732 6'098'404'772 11'040 &stpcpy
FUN 27568 22732 6'098'430'309 11'040 &__FormatInt32
FUN 27568 22732 6'098'455'885 11'056 &__FormatUint32
FUN 27568 22732 6'098'481'734 11'040 &stpcpy
FUN 27568 22732 6'098'507'145 11'040 &__FormatInt32
FUN 27568 22732 6'098'532'882 11'056 &__FormatUint32
FUN 27568 22732 6'098'561'946 10'992 &stpcpy
FUN 27568 22732 6'098'588'569 10'992 &DescribeCpuFlags.isra.0
FUN 27568 22732 6'098'739'741 11'024 &AddFlag
FUN 27568 22732 6'098'889'288 10'992 &ShowSseRegisters
error: Uncaught SIGSEGV (SEGV_-1073741784) at 0x7ffc1766660d on **** pid 27568 tid 22732
llamafile.exe
No such file or directory
Windows Cosmopolitan 3.2.4 MODE=x86_64 **** 10.0-20348
RAX 0000000000000000 RBX 00007000007d8301 RDI 0000000000000000
RCX 0000000000000000 RDX 0000000000000000 RSI 00007000007d83d0
RBP 00007000007d81d0 RSP 00007000007d7bd0 RIP 00007ffc1766660d
R8 0000000000000000 R9 0000000000020000 R10 00007000007d6e60
R11 00007000007d6e60 R12 00007ffc0e4c5e96 R13 0000000000000000
R14 00007000007d8940 R15 0000000000000000
TLS 0000000000719640
XMM0 00000028000000280000000000000000 XMM8 00000000000000000000000000000000
XMM1 00000000000000000000000000000001 XMM9 00000000000000000000000000000000
XMM2 00000000000000000000000000000000 XMM10 00000000000000000000000000000000
XMM3 00007000007d70300000000000800000 XMM11 00000000000000000000000000000000
XMM4 00000000000000000000000000000000 XMM12 00000000000000000000000000000000
XMM5 00007000000000280000000000000000 XMM13 00000000000000000000000000000000
XMM6 006e0065004200720061006c0043005c XMM14 00000000000000000000000000000000
XMM7 00730072006500730055005c003a0043 XMM15 00000000000000000000000000000000
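One detail in the register dumps may be worth a look: XMM6 and XMM7 contain what appears to be text rather than numbers. The Python sketch below decodes them, assuming each register is printed as a single 128-bit hex number with the most significant byte first and holds little-endian UTF-16 characters. The function name `decode_xmm_utf16` is hypothetical.

```python
def decode_xmm_utf16(hex_value: str) -> str:
    """Decode a 128-bit register dump as little-endian UTF-16 text."""
    # The dump shows the highest byte first; reverse to get memory order.
    raw = bytes.fromhex(hex_value)[::-1]
    return raw.decode("utf-16-le")


# Values copied from the crash reports above.
xmm7 = "00730072006500730055005c003a0043"
xmm6 = "006e0065004200720061006c0043005c"
print(decode_xmm_utf16(xmm7) + decode_xmm_utf16(xmm6))  # prints C:\Users\ClarBen
```

The decoded string matches the start of the `.llamafile` directory path in the log, and XMM6 in the first crash report decodes to `\BoylJos` in the same way. This suggests the thread had recently handled a wide-character copy of the user profile path before the fault, though it does not by itself identify the faulting code.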