Core dumping
```
(gdb) run
Starting program: /opt/gpt4all 0.1.0/bin/chat
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
QML debugging is enabled. Only use this in a safe environment.
qt.qpa.plugin: Could not find the Qt platform plugin "wayland" in ""
[New Thread 0x7ffff3fba640 (LWP 1341004)]
[New Thread 0x7ffff37b9640 (LWP 1341005)]
[New Thread 0x7ffff2fb8640 (LWP 1341006)]
[New Thread 0x7ffff1d73640 (LWP 1341007)]
gptj_model_load: loading model from 'ggml-gpt4all-j.bin' - please wait ...
gptj_model_load: n_vocab = 50400
gptj_model_load: n_ctx = 2048
gptj_model_load: n_embd = 4096
gptj_model_load: n_head = 16
gptj_model_load: n_layer = 28
gptj_model_load: n_rot = 64
gptj_model_load: f16 = 2
Thread 5 "llm thread" received signal SIGILL, Illegal instruction.
[Switching to Thread 0x7ffff1d73640 (LWP 1341007)]
0x0000555555db735d in ggml_type_sizef (type=GGML_TYPE_F32) at /home/atreat/dev/large_language_models/gpt4all-chat/ggml/src/ggml.c:2646
2646 /home/atreat/dev/large_language_models/gpt4all-chat/ggml/src/ggml.c: No such file or directory.
(gdb) disassemble
Dump of assembler code for function ggml_type_sizef:
0x0000555555db7350 <+0>: endbr64
0x0000555555db7354 <+4>: mov %edi,%edi
0x0000555555db7356 <+6>: lea 0x12f4723(%rip),%rax # 0x5555570aba80 <GGML_TYPE_SIZE>
=> 0x0000555555db735d <+13>: vxorps %xmm1,%xmm1,%xmm1
0x0000555555db7361 <+17>: mov (%rax,%rdi,8),%rax
0x0000555555db7365 <+21>: test %rax,%rax
0x0000555555db7368 <+24>: js 0x555555db7380 <ggml_type_sizef+48>
0x0000555555db736a <+26>: vcvtsi2ss %rax,%xmm1,%xmm0
0x0000555555db736f <+31>: lea 0x12f474a(%rip),%rax # 0x5555570abac0 <GGML_BLCK_SIZE>
0x0000555555db7376 <+38>: vcvtsi2ssl (%rax,%rdi,4),%xmm1,%xmm1
0x0000555555db737b <+43>: vdivss %xmm1,%xmm0,%xmm0
0x0000555555db737f <+47>: ret
0x0000555555db7380 <+48>: mov %rax,%rdx
0x0000555555db7383 <+51>: and $0x1,%eax
0x0000555555db7386 <+54>: shr %rdx
0x0000555555db7389 <+57>: or %rax,%rdx
0x0000555555db738c <+60>: vcvtsi2ss %rdx,%xmm1,%xmm0
0x0000555555db7391 <+65>: vaddss %xmm0,%xmm0,%xmm0
0x0000555555db7395 <+69>: jmp 0x555555db736f <ggml_type_sizef+31>
End of assembler dump.
(gdb) bt
#0 0x0000555555db735d in ggml_type_sizef (type=GGML_TYPE_F32) at /home/atreat/dev/large_language_models/gpt4all-chat/ggml/src/ggml.c:2646
#1 0x00005555557ecdc7 in gptj_model_load(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::istream&, gptj_model&, gpt_vocab&) (fname="ggml-gpt4all-j.bin", fin=..., model=..., vocab=<optimized out>)
at /home/atreat/dev/large_language_models/gpt4all-chat/gptj.cpp:160
#2 0x00005555557f0b7d in GPTJ::loadModel(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::istream&)
(this=this@entry=0x5555588aac20, modelPath="ggml-gpt4all-j.bin", fin=...) at /home/atreat/dev/large_language_models/gpt4all-chat/gptj.cpp:652
#3 0x00005555557f2d3d in GPTJObject::loadModel() (this=0x5555588aab30) at /home/atreat/dev/large_language_models/gpt4all-chat/llm.cpp:48
#4 0x0000555555898bd9 in void doActivate<false>(QObject*, int, void**) ()
#5 0x000055555592681e in QThread::started(QThread::QPrivateSignal) ()
#6 0x000055555597b69f in QThreadPrivate::start(void*) ()
#7 0x00007ffff675bb43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#8 0x00007ffff67eda00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
(gdb) q
```
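For reference, the faulting `vxorps` at `<+13>` is a VEX-encoded (AVX) instruction, as are the `vcvtsi2ss`/`vdivss` that follow: the compiler chose AVX encodings for an ordinary float conversion and divide, so the binary traps on the first such instruction when the CPU has no AVX. The function itself is trivial; here is a minimal sketch of what it computes, reconstructed from the disassembly above rather than copied from upstream `ggml.c` (the table values are stand-ins):

```c
#include <stddef.h>

/* Stand-ins for ggml's per-type lookup tables, which the disassembly
   indexes at <GGML_TYPE_SIZE> and <GGML_BLCK_SIZE>; the values here
   are illustrative, not the real tables. */
enum ggml_type { GGML_TYPE_F32 = 0, GGML_TYPE_F16 = 1 };
static const size_t GGML_TYPE_SIZE[] = { sizeof(float), 2 };
static const int    GGML_BLCK_SIZE[] = { 1, 1 };

/* Bytes per element divided by block size, as a float. The SIGILL is
   not a logic bug in this code: the compiler emitted AVX instructions
   for the conversion and divide, and the CPU below doesn't have AVX. */
float ggml_type_sizef(enum ggml_type type) {
    return (float) GGML_TYPE_SIZE[type] / (float) GGML_BLCK_SIZE[type];
}
```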
Looks like my CPU is missing the AVX instruction set. The grep below comes back empty, and the `lscpu` flags confirm there's no `avx` anywhere:
```
username@computer:/opt/gpt4all 0.1.0/bin$ grep -E 'avx|avx2' /proc/cpuinfo
username@computer:/opt/gpt4all 0.1.0/bin$ sudo lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 39 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Vendor ID: GenuineIntel
Model name: Intel(R) Pentium(R) CPU 5405U @ 2.30GHz
CPU family: 6
Model: 142
Thread(s) per core: 2
Core(s) per socket: 2
Socket(s): 1
Stepping: 11
CPU max MHz: 2300.0000
CPU min MHz: 400.0000
BogoMIPS: 4599.93
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust smep erms invpcid mpx rdseed smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d arch_capabilities
Virtualization features:
Virtualization: VT-x
Caches (sum of all):
L1d: 64 KiB (2 instances)
L1i: 64 KiB (2 instances)
L2: 512 KiB (2 instances)
L3: 2 MiB (1 instance)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-3
Vulnerabilities:
Itlb multihit: KVM: Mitigation: VMX disabled
L1tf: Not affected
Mds: Mitigation; Clear CPU buffers; SMT vulnerable
Meltdown: Not affected
Mmio stale data: Mitigation; Clear CPU buffers; SMT vulnerable
Retbleed: Mitigation; IBRS
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS Not affected
Srbds: Mitigation; Microcode
Tsx async abort: Not affected
```
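As far as I can tell the released binary doesn't check for this up front, so the first AVX instruction just kills the process. A startup guard built on GCC/Clang's `__builtin_cpu_supports` could fail with a readable error instead; a hypothetical sketch (not actual gpt4all code):

```c
/* check_avx.c -- hypothetical startup guard, not part of gpt4all.
   Compile this translation unit WITHOUT -mavx so the guard itself
   can't trap before the check runs. */
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    __builtin_cpu_init();  /* populate GCC/Clang's CPU feature cache */
    if (!__builtin_cpu_supports("avx")) {
        fprintf(stderr, "error: this build requires AVX, "
                        "but this CPU does not support it\n");
        return EXIT_FAILURE;
    }
    printf("AVX: yes, AVX2: %s\n",
           __builtin_cpu_supports("avx2") ? "yes" : "no");
    return EXIT_SUCCESS;
}
```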
It would be epic (and probably epically slow) to have a version of this for lower-end budget CPUs.
I was able to solve this by recompiling from source, following the directions here: https://github.com/zanussbaum/gpt4all.cpp, then starting the binary with a `-m` param to choose the model. Slow as dirt on my old machine, but it does seem to work now!
Definitely works in some unexpected ways...
```
username@computer:~/Projects/aixcelus/gpt4all-build/gpt4all.cpp$ ./chat -m ../../gpt4all/gpt4all-lora-unfiltered-quantized.bin
main: seed = 1681707857
llama_model_load: loading model from '../../gpt4all/gpt4all-lora-unfiltered-quantized.bin' - please wait ...
llama_model_load: ggml ctx size = 6065.35 MB
llama_model_load: memory_size = 2048.00 MB, n_mem = 65536
llama_model_load: loading model part 1/1 from '../../gpt4all/gpt4all-lora-unfiltered-quantized.bin'
llama_model_load: .................................... done
llama_model_load: model size = 4017.27 MB / num tensors = 291
system_info: n_threads = 4 / 4 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 0 | ARM_FMA = 0 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 |
main: interactive mode on.
sampling parameters: temp = 0.100000, top_k = 40, top_p = 0.950000, repeat_last_n = 64, repeat_penalty = 1.300000
== Running in chat mode. ==
- Press Ctrl+C to interject at any time.
- Press Return to return control to LLaMA.
- If you want to submit another line, end your input in '\'.
> Tell me about Alpacas
```

The model's reply was a Python snippet:
```python
import os
os.system('ls') # List contents of current directory
```
Stale, please open a new, updated issue if this is still relevant to you.