Can't run qwen2.5-coder:14b-base-q2_K with Ollama

Open WizardlyBump17 opened this issue 7 months ago • 3 comments

Describe the bug

When I run ./ollama run qwen2.5-coder:14b-base-q2_K with Ollama from ipex_llm[cpp]==2.3.0b20250802 I get the following error:

Error: llama runner process has terminated: GGML_ASSERT(offset + size <= ggml_nbytes(tensor) && "tensor write out of bounds") failed
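
For readers unfamiliar with the assertion, the condition it names is a plain bounds check: a write of `size` bytes starting at `offset` must end at or before the tensor's total byte size. The sketch below only models that condition in Python. It is not ggml code, the function name `write_fits` and the sizes are hypothetical, and it says nothing about why the check fails for this model.

```python
def write_fits(offset: int, size: int, tensor_nbytes: int) -> bool:
    """Return True when writing `size` bytes at `offset` stays inside the tensor."""
    # Same inequality as in the assertion text: offset + size <= ggml_nbytes(tensor)
    return offset + size <= tensor_nbytes


# Hypothetical sizes, chosen only to show both outcomes.
tensor_nbytes = 4096
in_bounds = write_fits(offset=0, size=4096, tensor_nbytes=tensor_nbytes)
out_of_bounds = write_fits(offset=1024, size=4096, tensor_nbytes=tensor_nbytes)
print(in_bounds, out_of_bounds)
```

The second call corresponds to the failing case: the requested write would run past the end of the destination tensor.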

How to reproduce

Steps to reproduce the error:

  1. pip install ipex_llm[cpp]==2.3.0b20250802
  2. ./ollama run qwen2.5-coder:14b-base-q2_K

Environment information

(ipex-llm) davi@davi:~/AI/ollama-ipex/latest$ bash env-check.sh 
-----------------------------------------------------------------
PYTHON_VERSION=3.11.13
-----------------------------------------------------------------
transformers=4.44.2
-----------------------------------------------------------------
torch=2.2.0+cu121
-----------------------------------------------------------------
ipex-llm Version: 2.3.0b20250802
-----------------------------------------------------------------
IPEX is not installed. 
-----------------------------------------------------------------
CPU Information: 
Architecture:                         x86_64
CPU op-mode(s):                       32-bit, 64-bit
Address sizes:                        48 bits physical, 48 bits virtual
Byte Order:                           Little Endian
CPU(s):                               16
On-line CPU(s) list:                  0-15
Vendor ID:                            AuthenticAMD
Model name:                           AMD Ryzen 7 5700X3D 8-Core Processor
CPU family:                           25
Model:                                33
Thread(s) per core:                   2
Core(s) per socket:                   8
Socket(s):                            1
Stepping:                             2
Frequency boost:                      enabled
CPU(s) scaling MHz:                   61%
CPU max MHz:                          4151.0000
-----------------------------------------------------------------
Total CPU Memory: 22.8791 GB
-----------------------------------------------------------------
Operating System: 
Ubuntu 25.04 \n \l

-----------------------------------------------------------------
Linux davi 6.14.0-27-generic #27-Ubuntu SMP PREEMPT_DYNAMIC Tue Jul 22 17:01:58 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
-----------------------------------------------------------------
CLI:
    Version: 1.2.41.20250513
    Build ID: 00000000

Service:
    Version: 1.2.41.20250513
    Build ID: 00000000
    Level Zero Version: 1.21.9
-----------------------------------------------------------------
  Driver UUID                                     32303235-2e32-302e-362e-302e30345f32
  Driver Version                                  2025.20.6.0.04_224945
  Driver UUID                                     32352e32-322e-3333-3934-340000000000
  Driver Version                                  25.22.33944
-----------------------------------------------------------------
Driver related package version:
-----------------------------------------------------------------
igpu not detected
-----------------------------------------------------------------
xpu-smi is properly installed. 
-----------------------------------------------------------------
No device discovered
GPU0 Memory size=16G
-----------------------------------------------------------------
0b:00.0 VGA compatible controller: Intel Corporation Battlemage G21 [Arc B580] (prog-if 00 [VGA controller])
	Subsystem: Intel Corporation Device 1100
	Flags: bus master, fast devsel, latency 0, IRQ 74, IOMMU group 20
	Memory at f5000000 (64-bit, non-prefetchable) [size=16M]
	Memory at 7800000000 (64-bit, prefetchable) [size=16G]
	Expansion ROM at f6000000 [disabled] [size=2M]
	Capabilities: <access denied>
	Kernel driver in use: xe
	Kernel modules: xe
-----------------------------------------------------------------

Additional context

Ubuntu 25.04, kernel 6.14.0-27-generic, Arc B580, 24 GB RAM, intel-oneapi-base-toolkit-2025.0
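
As a rough sanity check on the model file itself, the figures the loader prints in the full log appear to be mutually consistent: the per-type tensor counts add up to the 579 tensors reported, and the file size divided by the parameter count gives the reported bits per weight. The sketch below redoes that arithmetic with values copied from the log; the helper name `bits_per_weight` is hypothetical, and the inputs are the rounded values the log shows, so the result is approximate.

```python
GIB = 1024 ** 3  # bytes per GiB


def bits_per_weight(file_size_gib: float, n_params: float) -> float:
    """Average number of bits stored per model parameter."""
    return file_size_gib * GIB * 8 / n_params


# Values as printed by llama_model_loader / print_info in the log.
tensor_counts = {"f32": 241, "q2_K": 193, "q3_K": 96, "q4_K": 48, "q6_K": 1}
file_size_gib = 5.37   # "file size = 5.37 GiB"
n_params = 14.77e9     # "model params = 14.77 B"

total_tensors = sum(tensor_counts.values())
bpw = bits_per_weight(file_size_gib, n_params)
print(f"{total_tensors} tensors, {bpw:.2f} BPW")
```

Agreement with the log (579 tensors, 3.12 BPW) suggests the metadata is being read consistently, but it does not rule out a problem with the file or with how individual tensors are sized on the SYCL backend.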

Full log:

time=2025-08-02T17:02:53.138-03:00 level=INFO source=server.go:135 msg="system memory" total="22.9 GiB" free="15.6 GiB" free_swap="15.2 GiB"
time=2025-08-02T17:02:53.138-03:00 level=INFO source=server.go:187 msg=offload library=cpu layers.requested=-1 layers.model=49 layers.offload=0 layers.split="" memory.available="[15.6 GiB]" memory.gpu_overhead="0 B" memory.required.full="6.3 GiB" memory.required.partial="0 B" memory.required.kv="768.0 MiB" memory.required.allocations="[6.3 GiB]" memory.weights.total="5.1 GiB" memory.weights.repeating="4.5 GiB" memory.weights.nonrepeating="609.1 MiB" memory.graph.full="348.0 MiB" memory.graph.partial="916.1 MiB"
llama_model_loader: loaded meta data with 33 key-value pairs and 579 tensors from /root/.ollama/models/blobs/sha256-3d56bdc5fb9286615ef0f4ab59e1471fbe47d43d67fa7be46efb725cc9650add (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = qwen2
llama_model_loader: - kv   1:                               general.type str              = model
llama_model_loader: - kv   2:                               general.name str              = Qwen2.5 Coder 14B
llama_model_loader: - kv   3:                           general.basename str              = Qwen2.5-Coder
llama_model_loader: - kv   4:                         general.size_label str              = 14B
llama_model_loader: - kv   5:                            general.license str              = apache-2.0
llama_model_loader: - kv   6:                       general.license.link str              = https://huggingface.co/Qwen/Qwen2.5-C...
llama_model_loader: - kv   7:                   general.base_model.count u32              = 1
llama_model_loader: - kv   8:                  general.base_model.0.name str              = Qwen2.5 14B
llama_model_loader: - kv   9:          general.base_model.0.organization str              = Qwen
llama_model_loader: - kv  10:              general.base_model.0.repo_url str              = https://huggingface.co/Qwen/Qwen2.5-14B
llama_model_loader: - kv  11:                               general.tags arr[str,5]       = ["code", "qwen", "qwen-coder", "codeq...
llama_model_loader: - kv  12:                          general.languages arr[str,1]       = ["en"]
llama_model_loader: - kv  13:                          qwen2.block_count u32              = 48
llama_model_loader: - kv  14:                       qwen2.context_length u32              = 32768
llama_model_loader: - kv  15:                     qwen2.embedding_length u32              = 5120
llama_model_loader: - kv  16:                  qwen2.feed_forward_length u32              = 13824
llama_model_loader: - kv  17:                 qwen2.attention.head_count u32              = 40
llama_model_loader: - kv  18:              qwen2.attention.head_count_kv u32              = 8
llama_model_loader: - kv  19:                       qwen2.rope.freq_base f32              = 1000000.000000
llama_model_loader: - kv  20:     qwen2.attention.layer_norm_rms_epsilon f32              = 0.000010
llama_model_loader: - kv  21:                          general.file_type u32              = 10
llama_model_loader: - kv  22:                       tokenizer.ggml.model str              = gpt2
llama_model_loader: - kv  23:                         tokenizer.ggml.pre str              = qwen2
llama_model_loader: - kv  24:                      tokenizer.ggml.tokens arr[str,152064]  = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv  25:                  tokenizer.ggml.token_type arr[i32,152064]  = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv  26:                      tokenizer.ggml.merges arr[str,151387]  = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
llama_model_loader: - kv  27:                tokenizer.ggml.eos_token_id u32              = 151645
llama_model_loader: - kv  28:            tokenizer.ggml.padding_token_id u32              = 151643
llama_model_loader: - kv  29:                tokenizer.ggml.bos_token_id u32              = 151643
llama_model_loader: - kv  30:               tokenizer.ggml.add_bos_token bool             = false
llama_model_loader: - kv  31:                    tokenizer.chat_template str              = {%- if tools %}\n    {{- '<|im_start|>...
llama_model_loader: - kv  32:               general.quantization_version u32              = 2
llama_model_loader: - type  f32:  241 tensors
llama_model_loader: - type q2_K:  193 tensors
llama_model_loader: - type q3_K:   96 tensors
llama_model_loader: - type q4_K:   48 tensors
llama_model_loader: - type q6_K:    1 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type   = Q2_K - Medium
print_info: file size   = 5.37 GiB (3.12 BPW) 
load: special tokens cache size = 22
load: token to piece cache size = 0.9310 MB
print_info: arch             = qwen2
print_info: vocab_only       = 1
print_info: model type       = ?B
print_info: model params     = 14.77 B
print_info: general.name     = Qwen2.5 Coder 14B
print_info: vocab type       = BPE
print_info: n_vocab          = 152064
print_info: n_merges         = 151387
print_info: BOS token        = 151643 '<|endoftext|>'
print_info: EOS token        = 151645 '<|im_end|>'
print_info: EOT token        = 151645 '<|im_end|>'
print_info: PAD token        = 151643 '<|endoftext|>'
print_info: LF token         = 198 'Ċ'
print_info: FIM PRE token    = 151659 '<|fim_prefix|>'
print_info: FIM SUF token    = 151661 '<|fim_suffix|>'
print_info: FIM MID token    = 151660 '<|fim_middle|>'
print_info: FIM PAD token    = 151662 '<|fim_pad|>'
print_info: FIM REP token    = 151663 '<|repo_name|>'
print_info: FIM SEP token    = 151664 '<|file_sep|>'
print_info: EOG token        = 151643 '<|endoftext|>'
print_info: EOG token        = 151645 '<|im_end|>'
print_info: EOG token        = 151662 '<|fim_pad|>'
print_info: EOG token        = 151663 '<|repo_name|>'
print_info: EOG token        = 151664 '<|file_sep|>'
print_info: max token length = 256
llama_model_load: vocab only - skipping tensors
time=2025-08-02T17:02:53.318-03:00 level=INFO source=server.go:458 msg="starting llama server" cmd="/home/davi/anaconda3/envs/ipex-llm/lib/python3.11/site-packages/bigdl/cpp/libs/ollama/ollama-lib runner --model /root/.ollama/models/blobs/sha256-3d56bdc5fb9286615ef0f4ab59e1471fbe47d43d67fa7be46efb725cc9650add --ctx-size 4096 --batch-size 512 --n-gpu-layers 999 --threads 8 --no-mmap --parallel 2 --port 42995"
time=2025-08-02T17:02:53.318-03:00 level=INFO source=sched.go:483 msg="loaded runners" count=1
time=2025-08-02T17:02:53.318-03:00 level=INFO source=server.go:618 msg="waiting for llama runner to start responding"
time=2025-08-02T17:02:53.319-03:00 level=INFO source=server.go:652 msg="waiting for server to become available" status="llm server not responding"
using override patterns: []
time=2025-08-02T17:02:53.364-03:00 level=INFO source=runner.go:851 msg="starting go runner"
load_backend: loaded SYCL backend from /home/davi/anaconda3/envs/ipex-llm/lib/python3.11/site-packages/bigdl/cpp/libs/ollama/libggml-sycl.so
load_backend: loaded CPU backend from /home/davi/anaconda3/envs/ipex-llm/lib/python3.11/site-packages/bigdl/cpp/libs/ollama/libggml-cpu-haswell.so
time=2025-08-02T17:02:53.425-03:00 level=INFO source=ggml.go:104 msg=system CPU.0.LLAMAFILE=1 CPU.0.OPENMP=1 CPU.0.AARCH64_REPACK=1 CPU.1.LLAMAFILE=1 compiler=cgo(gcc)
time=2025-08-02T17:02:53.425-03:00 level=INFO source=runner.go:911 msg="Server listening on 127.0.0.1:42995"
llama_model_load_from_file_impl: using device SYCL0 (Intel(R) Arc(TM) B580 Graphics) - 11345 MiB free
llama_model_loader: loaded meta data with 33 key-value pairs and 579 tensors from /root/.ollama/models/blobs/sha256-3d56bdc5fb9286615ef0f4ab59e1471fbe47d43d67fa7be46efb725cc9650add (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = qwen2
llama_model_loader: - kv   1:                               general.type str              = model
llama_model_loader: - kv   2:                               general.name str              = Qwen2.5 Coder 14B
llama_model_loader: - kv   3:                           general.basename str              = Qwen2.5-Coder
llama_model_loader: - kv   4:                         general.size_label str              = 14B
llama_model_loader: - kv   5:                            general.license str              = apache-2.0
llama_model_loader: - kv   6:                       general.license.link str              = https://huggingface.co/Qwen/Qwen2.5-C...
llama_model_loader: - kv   7:                   general.base_model.count u32              = 1
llama_model_loader: - kv   8:                  general.base_model.0.name str              = Qwen2.5 14B
llama_model_loader: - kv   9:          general.base_model.0.organization str              = Qwen
llama_model_loader: - kv  10:              general.base_model.0.repo_url str              = https://huggingface.co/Qwen/Qwen2.5-14B
llama_model_loader: - kv  11:                               general.tags arr[str,5]       = ["code", "qwen", "qwen-coder", "codeq...
llama_model_loader: - kv  12:                          general.languages arr[str,1]       = ["en"]
llama_model_loader: - kv  13:                          qwen2.block_count u32              = 48
llama_model_loader: - kv  14:                       qwen2.context_length u32              = 32768
llama_model_loader: - kv  15:                     qwen2.embedding_length u32              = 5120
llama_model_loader: - kv  16:                  qwen2.feed_forward_length u32              = 13824
llama_model_loader: - kv  17:                 qwen2.attention.head_count u32              = 40
llama_model_loader: - kv  18:              qwen2.attention.head_count_kv u32              = 8
llama_model_loader: - kv  19:                       qwen2.rope.freq_base f32              = 1000000.000000
llama_model_loader: - kv  20:     qwen2.attention.layer_norm_rms_epsilon f32              = 0.000010
llama_model_loader: - kv  21:                          general.file_type u32              = 10
llama_model_loader: - kv  22:                       tokenizer.ggml.model str              = gpt2
llama_model_loader: - kv  23:                         tokenizer.ggml.pre str              = qwen2
llama_model_loader: - kv  24:                      tokenizer.ggml.tokens arr[str,152064]  = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv  25:                  tokenizer.ggml.token_type arr[i32,152064]  = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv  26:                      tokenizer.ggml.merges arr[str,151387]  = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
llama_model_loader: - kv  27:                tokenizer.ggml.eos_token_id u32              = 151645
llama_model_loader: - kv  28:            tokenizer.ggml.padding_token_id u32              = 151643
llama_model_loader: - kv  29:                tokenizer.ggml.bos_token_id u32              = 151643
llama_model_loader: - kv  30:               tokenizer.ggml.add_bos_token bool             = false
llama_model_loader: - kv  31:                    tokenizer.chat_template str              = {%- if tools %}\n    {{- '<|im_start|>...
llama_model_loader: - kv  32:               general.quantization_version u32              = 2
llama_model_loader: - type  f32:  241 tensors
llama_model_loader: - type q2_K:  193 tensors
llama_model_loader: - type q3_K:   96 tensors
llama_model_loader: - type q4_K:   48 tensors
llama_model_loader: - type q6_K:    1 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type   = Q2_K - Medium
print_info: file size   = 5.37 GiB (3.12 BPW) 
time=2025-08-02T17:02:53.570-03:00 level=INFO source=server.go:652 msg="waiting for server to become available" status="llm server loading model"
load: special tokens cache size = 22
load: token to piece cache size = 0.9310 MB
print_info: arch             = qwen2
print_info: vocab_only       = 0
print_info: n_ctx_train      = 32768
print_info: n_embd           = 5120
print_info: n_layer          = 48
print_info: n_head           = 40
print_info: n_head_kv        = 8
print_info: n_rot            = 128
print_info: n_swa            = 0
print_info: n_swa_pattern    = 1
print_info: n_embd_head_k    = 128
print_info: n_embd_head_v    = 128
print_info: n_gqa            = 5
print_info: n_embd_k_gqa     = 1024
print_info: n_embd_v_gqa     = 1024
print_info: f_norm_eps       = 0.0e+00
print_info: f_norm_rms_eps   = 1.0e-05
print_info: f_clamp_kqv      = 0.0e+00
print_info: f_max_alibi_bias = 0.0e+00
print_info: f_logit_scale    = 0.0e+00
print_info: f_attn_scale     = 0.0e+00
print_info: n_ff             = 13824
print_info: n_expert         = 0
print_info: n_expert_used    = 0
print_info: causal attn      = 1
print_info: pooling type     = -1
print_info: rope type        = 2
print_info: rope scaling     = linear
print_info: freq_base_train  = 1000000.0
print_info: freq_scale_train = 1
print_info: n_ctx_orig_yarn  = 32768
print_info: rope_finetuned   = unknown
print_info: ssm_d_conv       = 0
print_info: ssm_d_inner      = 0
print_info: ssm_d_state      = 0
print_info: ssm_dt_rank      = 0
print_info: ssm_dt_b_c_rms   = 0
print_info: model type       = 14B
print_info: model params     = 14.77 B
print_info: general.name     = Qwen2.5 Coder 14B
print_info: vocab type       = BPE
print_info: n_vocab          = 152064
print_info: n_merges         = 151387
print_info: BOS token        = 151643 '<|endoftext|>'
print_info: EOS token        = 151645 '<|im_end|>'
print_info: EOT token        = 151645 '<|im_end|>'
print_info: PAD token        = 151643 '<|endoftext|>'
print_info: LF token         = 198 'Ċ'
print_info: FIM PRE token    = 151659 '<|fim_prefix|>'
print_info: FIM SUF token    = 151661 '<|fim_suffix|>'
print_info: FIM MID token    = 151660 '<|fim_middle|>'
print_info: FIM PAD token    = 151662 '<|fim_pad|>'
print_info: FIM REP token    = 151663 '<|repo_name|>'
print_info: FIM SEP token    = 151664 '<|file_sep|>'
print_info: EOG token        = 151643 '<|endoftext|>'
print_info: EOG token        = 151645 '<|im_end|>'
print_info: EOG token        = 151662 '<|fim_pad|>'
print_info: EOG token        = 151663 '<|repo_name|>'
print_info: EOG token        = 151664 '<|file_sep|>'
print_info: max token length = 256
load_tensors: loading model tensors, this can take a while... (mmap = false)
load_tensors: offloading 48 repeating layers to GPU
load_tensors: offloading output layer to GPU
load_tensors: offloaded 49/49 layers to GPU
load_tensors:          CPU model buffer size =   243.63 MiB
load_tensors:        SYCL0 model buffer size =  5288.54 MiB
ggml-backend.cpp:265: GGML_ASSERT(offset + size <= ggml_nbytes(tensor) && "tensor write out of bounds") failed
[New LWP 22089]
[New LWP 22088]
[New LWP 22080]
[New LWP 22079]
[New LWP 22078]
[New LWP 22077]
[New LWP 22076]
[New LWP 22075]
[New LWP 22074]
[New LWP 22073]
[New LWP 22072]
[New LWP 22071]
warning: Missing auto-load script at offset 0 in section .debug_gdb_scripts
of file /home/davi/anaconda3/envs/ipex-llm/lib/python3.11/site-packages/bigdl/cpp/libs/ollama/ollama-lib.
Use `info auto-load python-scripts [REGEXP]' to list them.
Function(s) ^std::(move|forward|as_const|(__)?addressof) will be skipped when stepping.
Function(s) ^std::(shared|unique)_ptr<.*>::(get|operator) will be skipped when stepping.
Function(s) ^std::(basic_string|vector|array|deque|(forward_)?list|(unordered_|flat_)?(multi)?(map|set)|span)<.*>::(c?r?(begin|end)|front|back|data|size|empty) will be skipped when stepping.
Function(s) ^std::(basic_string|vector|array|deque|span)<.*>::operator.] will be skipped when stepping.
time=2025-08-02T17:03:02.039-03:00 level=INFO source=server.go:652 msg="waiting for server to become available" status="llm server not responding"
warning: File "/opt/intel/oneapi/compiler/2025.2/lib/libsycl.so.8.0.0-gdb.py" auto-loading has been declined by your `auto-load safe-path' set to "$debugdir:$datadir/auto-load".
To enable execution of this file add
	add-auto-load-safe-path /opt/intel/oneapi/compiler/2025.2/lib/libsycl.so.8.0.0-gdb.py
line to your configuration file "/root/.config/gdb/gdbinit".
To completely disable this security protection add
	set auto-load safe-path /
line to your configuration file "/root/.config/gdb/gdbinit".
For more information about this security protection see the
"Auto-loading safe path" section in the GDB manual.  E.g., run from the shell:
	info "(gdb)Auto-loading safe path"
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
runtime.futex () at /root/go/pkg/mod/golang.org/[email protected]/src/runtime/sys_linux_amd64.s:558
warning: 558	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/sys_linux_amd64.s: No such file or directory
#0  runtime.futex () at /root/go/pkg/mod/golang.org/[email protected]/src/runtime/sys_linux_amd64.s:558
558	in /root/go/pkg/mod/golang.org/[email protected]/src/runtime/sys_linux_amd64.s
#1  0x0000000000460670 in runtime.futexsleep (addr=0xfffffffffffffe00, val=0, ns=4866275) at /root/go/pkg/mod/golang.org/[email protected]/src/runtime/os_linux.go:75
warning: 75	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/os_linux.go: No such file or directory
#2  0x000000000043c707 in runtime.notesleep (n=0x20e58a0 <runtime.m0+320>) at /root/go/pkg/mod/golang.org/[email protected]/src/runtime/lock_futex.go:47
warning: 47	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/lock_futex.go: No such file or directory
#3  0x000000000046be2c in runtime.mPark () at /root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:1887
warning: 1887	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go: No such file or directory
#4  runtime.stopm () at /root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:2907
2907	in /root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go
#5  0x000000000046d8fc in runtime.findRunnable (gp=<optimized out>, inheritTime=<optimized out>, tryWakeP=<optimized out>) at /root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:3644
3644	in /root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go
#6  0x000000000046e9f1 in runtime.schedule () at /root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:4017
4017	in /root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go
#7  0x000000000046eea5 in runtime.park_m (gp=0xc000103dc0) at /root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:4141
4141	in /root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go
#8  0x00000000004a02ae in runtime.mcall () at /root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:459
warning: 459	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s: No such file or directory
#9  0x00007ffe75607f68 in ?? ()
#10 0x00000000004a4cbf in runtime.newproc (fn=0x4a01af <runtime.rt0_go+303>) at <autogenerated>:1
warning: 1	<autogenerated>: No such file or directory
#11 0x00000000004a0225 in runtime.mstart () at /root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:395
warning: 395	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s: No such file or directory
#12 0x00000000004a01af in runtime.rt0_go () at /root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:358
358	in /root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s
#13 0x0000000000000011 in ?? ()
#14 0x00007ffe756080c8 in ?? ()
#15 0x0000000000000009 in ?? ()
#16 0x0000000000000011 in ?? ()
#17 0x00007ffe756080c8 in ?? ()
#18 0x00007804e6e2a578 in __libc_start_call_main (main=0x0, argc=0, argv=0x0) at ../sysdeps/nptl/libc_start_call_main.h:58
warning: 58	../sysdeps/nptl/libc_start_call_main.h: No such file or directory
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
[Inferior 1 (process 22070) detached]
SIGABRT: abort
PC=0x7804e6ea49bc m=9 sigcode=18446744073709551610
signal arrived during cgo execution

goroutine 50 gp=0xc000102fc0 m=9 mp=0xc000580008 [syscall]:
runtime.cgocall(0x1159a80, 0xc0003b7890)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/cgocall.go:167 +0x4b fp=0xc0003b7868 sp=0xc0003b7830 pc=0x49780b
github.com/ollama/ollama/llama._Cfunc_llama_model_load_from_file(0x780460000b70, {0x0, 0x0, 0x3e7, 0x1, 0x0, 0x0, 0x1159230, 0xc000591028, 0x0, ...})
	_cgo_gotypes.go:876 +0x47 fp=0xc0003b7890 sp=0xc0003b7868 pc=0x846047
github.com/ollama/ollama/llama.LoadModelFromFile.func4(...)
	/home/runner/_work/llm.cpp/llm.cpp/ollama-internal/llama/llama.go:296
github.com/ollama/ollama/llama.LoadModelFromFile({0x7ffe7560909c, 0x62}, {0x3e7, 0x0, 0x0, {0x0, 0x0, 0x0}, 0xc0005961d0, 0x0, ...})
	/home/runner/_work/llm.cpp/llm.cpp/ollama-internal/llama/llama.go:296 +0x4d7 fp=0xc0003b7d80 sp=0xc0003b7890 pc=0x848277
github.com/ollama/ollama/runner/llamarunner.(*Server).loadModel(0xc000326000, {0x3e7, 0x0, 0x0, {0x0, 0x0, 0x0}, 0xc0005961d0, 0x0, {0x0, ...}}, ...)
	/home/runner/_work/llm.cpp/llm.cpp/ollama-internal/runner/llamarunner/runner.go:749 +0x9e fp=0xc0003b7ee8 sp=0xc0003b7d80 pc=0x90503e
github.com/ollama/ollama/runner/llamarunner.Execute.gowrap1()
	/home/runner/_work/llm.cpp/llm.cpp/ollama-internal/runner/llamarunner/runner.go:885 +0x115 fp=0xc0003b7fe0 sp=0xc0003b7ee8 pc=0x906b15
runtime.goexit({})
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0003b7fe8 sp=0xc0003b7fe0 pc=0x4a22e1
created by github.com/ollama/ollama/runner/llamarunner.Execute in goroutine 1
	/home/runner/_work/llm.cpp/llm.cpp/ollama-internal/runner/llamarunner/runner.go:885 +0xd2a

goroutine 1 gp=0xc000002380 m=nil [IO wait]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc00005d5b8 sp=0xc00005d598 pc=0x49ac8e
runtime.netpollblock(0xc00005d608?, 0x433da6?, 0x0?)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/netpoll.go:575 +0xf7 fp=0xc00005d5f0 sp=0xc00005d5b8 pc=0x45f957
internal/poll.runtime_pollWait(0x7804e7cc8de0, 0x72)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/netpoll.go:351 +0x85 fp=0xc00005d610 sp=0xc00005d5f0 pc=0x499ea5
internal/poll.(*pollDesc).wait(0xc0005be180?, 0x900000036?, 0x0)
	/root/go/pkg/mod/golang.org/[email protected]/src/internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc00005d638 sp=0xc00005d610 pc=0x5211c7
internal/poll.(*pollDesc).waitRead(...)
	/root/go/pkg/mod/golang.org/[email protected]/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Accept(0xc0005be180)
	/root/go/pkg/mod/golang.org/[email protected]/src/internal/poll/fd_unix.go:620 +0x295 fp=0xc00005d6e0 sp=0xc00005d638 pc=0x526595
net.(*netFD).accept(0xc0005be180)
	/root/go/pkg/mod/golang.org/[email protected]/src/net/fd_unix.go:172 +0x29 fp=0xc00005d798 sp=0xc00005d6e0 pc=0x598aa9
net.(*TCPListener).accept(0xc00059a180)
	/root/go/pkg/mod/golang.org/[email protected]/src/net/tcpsock_posix.go:159 +0x1b fp=0xc00005d7e8 sp=0xc00005d798 pc=0x5ae41b
net.(*TCPListener).Accept(0xc00059a180)
	/root/go/pkg/mod/golang.org/[email protected]/src/net/tcpsock.go:380 +0x30 fp=0xc00005d818 sp=0xc00005d7e8 pc=0x5ad2d0
net/http.(*onceCloseListener).Accept(0xc00017acf0?)
	<autogenerated>:1 +0x24 fp=0xc00005d830 sp=0xc00005d818 pc=0x7c4964
net/http.(*Server).Serve(0xc0001f6600, {0x15f1be8, 0xc00059a180})
	/root/go/pkg/mod/golang.org/[email protected]/src/net/http/server.go:3424 +0x30c fp=0xc00005d960 sp=0xc00005d830 pc=0x79c22c
github.com/ollama/ollama/runner/llamarunner.Execute({0xc000134020, 0xf, 0x10})
	/home/runner/_work/llm.cpp/llm.cpp/ollama-internal/runner/llamarunner/runner.go:912 +0x11e9 fp=0xc00005dd08 sp=0xc00005d960 pc=0x906709
github.com/ollama/ollama/runner.Execute({0xc000134010?, 0x0?, 0x0?})
	/home/runner/_work/llm.cpp/llm.cpp/ollama-internal/runner/runner.go:22 +0xd4 fp=0xc00005dd30 sp=0xc00005dd08 pc=0x98b474
github.com/ollama/ollama/cmd.NewCLI.func2(0xc0000d0f00?, {0x141a6a2?, 0x4?, 0x141a6a6?})
	/home/runner/_work/llm.cpp/llm.cpp/ollama-internal/cmd/cmd.go:1529 +0x45 fp=0xc00005dd58 sp=0xc00005dd30 pc=0x10e7c05
github.com/spf13/cobra.(*Command).execute(0xc000230f08, {0xc0000d4ff0, 0xf, 0xf})
	/root/go/pkg/mod/github.com/spf13/[email protected]/command.go:940 +0x85c fp=0xc00005de78 sp=0xc00005dd58 pc=0x6120bc
github.com/spf13/cobra.(*Command).ExecuteC(0xc0000fe908)
	/root/go/pkg/mod/github.com/spf13/[email protected]/command.go:1068 +0x3a5 fp=0xc00005df30 sp=0xc00005de78 pc=0x612905
github.com/spf13/cobra.(*Command).Execute(...)
	/root/go/pkg/mod/github.com/spf13/[email protected]/command.go:992
github.com/spf13/cobra.(*Command).ExecuteContext(...)
	/root/go/pkg/mod/github.com/spf13/[email protected]/command.go:985
main.main()
	/home/runner/_work/llm.cpp/llm.cpp/ollama-internal/main.go:12 +0x4d fp=0xc00005df50 sp=0xc00005df30 pc=0x10e868d
runtime.main()
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:283 +0x28b fp=0xc00005dfe0 sp=0xc00005df50 pc=0x466f6b
runtime.goexit({})
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00005dfe8 sp=0xc00005dfe0 pc=0x4a22e1

goroutine 2 gp=0xc000002e00 m=nil [force gc (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc000094fa8 sp=0xc000094f88 pc=0x49ac8e
runtime.goparkunlock(...)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:441
runtime.forcegchelper()
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:348 +0xb3 fp=0xc000094fe0 sp=0xc000094fa8 pc=0x4672b3
runtime.goexit({})
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000094fe8 sp=0xc000094fe0 pc=0x4a22e1
created by runtime.init.7 in goroutine 1
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:336 +0x1a

goroutine 3 gp=0xc000003340 m=nil [GC sweep wait]:
runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc000095780 sp=0xc000095760 pc=0x49ac8e
runtime.goparkunlock(...)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:441
runtime.bgsweep(0xc0000c0000)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgcsweep.go:316 +0xdf fp=0xc0000957c8 sp=0xc000095780 pc=0x451adf
runtime.gcenable.gowrap1()
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:204 +0x25 fp=0xc0000957e0 sp=0xc0000957c8 pc=0x445f45
runtime.goexit({})
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0000957e8 sp=0xc0000957e0 pc=0x4a22e1
created by runtime.gcenable in goroutine 1
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:204 +0x66

goroutine 4 gp=0xc000003500 m=nil [GC scavenge wait]:
runtime.gopark(0x10000?, 0x15df218?, 0x0?, 0x0?, 0x0?)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc000095f78 sp=0xc000095f58 pc=0x49ac8e
runtime.goparkunlock(...)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:441
runtime.(*scavengerState).park(0x20e2940)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgcscavenge.go:425 +0x49 fp=0xc000095fa8 sp=0xc000095f78 pc=0x44f529
runtime.bgscavenge(0xc0000c0000)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgcscavenge.go:658 +0x59 fp=0xc000095fc8 sp=0xc000095fa8 pc=0x44fab9
runtime.gcenable.gowrap2()
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:205 +0x25 fp=0xc000095fe0 sp=0xc000095fc8 pc=0x445ee5
runtime.goexit({})
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000095fe8 sp=0xc000095fe0 pc=0x4a22e1
created by runtime.gcenable in goroutine 1
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:205 +0xa5

goroutine 18 gp=0xc000102700 m=nil [finalizer wait]:
runtime.gopark(0x1b8?, 0xc000002380?, 0x1?, 0x23?, 0xc000094688?)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc000094630 sp=0xc000094610 pc=0x49ac8e
runtime.runfinq()
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mfinal.go:196 +0x107 fp=0xc0000947e0 sp=0xc000094630 pc=0x444f07
runtime.goexit({})
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0000947e8 sp=0xc0000947e0 pc=0x4a22e1
created by runtime.createfing in goroutine 1
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mfinal.go:166 +0x3d

goroutine 19 gp=0xc000103180 m=nil [chan receive]:
runtime.gopark(0xc00022d4a0?, 0xc000490048?, 0x60?, 0x7?, 0x57f7e8?)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc000090718 sp=0xc0000906f8 pc=0x49ac8e
runtime.chanrecv(0xc000110380, 0x0, 0x1)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/chan.go:664 +0x445 fp=0xc000090790 sp=0xc000090718 pc=0x436925
runtime.chanrecv1(0x0?, 0x0?)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/chan.go:506 +0x12 fp=0xc0000907b8 sp=0xc000090790 pc=0x4364b2
runtime.unique_runtime_registerUniqueMapCleanup.func2(...)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1796
runtime.unique_runtime_registerUniqueMapCleanup.gowrap1()
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1799 +0x2f fp=0xc0000907e0 sp=0xc0000907b8 pc=0x44908f
runtime.goexit({})
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0000907e8 sp=0xc0000907e0 pc=0x4a22e1
created by unique.runtime_registerUniqueMapCleanup in goroutine 1
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1794 +0x79

goroutine 20 gp=0xc000103500 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc000090f38 sp=0xc000090f18 pc=0x49ac8e
runtime.gcBgMarkWorker(0xc0001115e0)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1423 +0xe9 fp=0xc000090fc8 sp=0xc000090f38 pc=0x4483a9
runtime.gcBgMarkStartWorkers.gowrap1()
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x25 fp=0xc000090fe0 sp=0xc000090fc8 pc=0x448285
runtime.goexit({})
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000090fe8 sp=0xc000090fe0 pc=0x4a22e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x105

goroutine 34 gp=0xc000484000 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc00048a738 sp=0xc00048a718 pc=0x49ac8e
runtime.gcBgMarkWorker(0xc0001115e0)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1423 +0xe9 fp=0xc00048a7c8 sp=0xc00048a738 pc=0x4483a9
runtime.gcBgMarkStartWorkers.gowrap1()
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x25 fp=0xc00048a7e0 sp=0xc00048a7c8 pc=0x448285
runtime.goexit({})
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00048a7e8 sp=0xc00048a7e0 pc=0x4a22e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x105

goroutine 5 gp=0xc000003a40 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc000096738 sp=0xc000096718 pc=0x49ac8e
runtime.gcBgMarkWorker(0xc0001115e0)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1423 +0xe9 fp=0xc0000967c8 sp=0xc000096738 pc=0x4483a9
runtime.gcBgMarkStartWorkers.gowrap1()
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x25 fp=0xc0000967e0 sp=0xc0000967c8 pc=0x448285
runtime.goexit({})
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0000967e8 sp=0xc0000967e0 pc=0x4a22e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x105

goroutine 6 gp=0xc000003c00 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc000096f38 sp=0xc000096f18 pc=0x49ac8e
runtime.gcBgMarkWorker(0xc0001115e0)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1423 +0xe9 fp=0xc000096fc8 sp=0xc000096f38 pc=0x4483a9
runtime.gcBgMarkStartWorkers.gowrap1()
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x25 fp=0xc000096fe0 sp=0xc000096fc8 pc=0x448285
runtime.goexit({})
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000096fe8 sp=0xc000096fe0 pc=0x4a22e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x105

goroutine 21 gp=0xc0001036c0 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc000091738 sp=0xc000091718 pc=0x49ac8e
runtime.gcBgMarkWorker(0xc0001115e0)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1423 +0xe9 fp=0xc0000917c8 sp=0xc000091738 pc=0x4483a9
runtime.gcBgMarkStartWorkers.gowrap1()
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x25 fp=0xc0000917e0 sp=0xc0000917c8 pc=0x448285
runtime.goexit({})
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0000917e8 sp=0xc0000917e0 pc=0x4a22e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x105

goroutine 35 gp=0xc0004841c0 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc00048af38 sp=0xc00048af18 pc=0x49ac8e
runtime.gcBgMarkWorker(0xc0001115e0)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1423 +0xe9 fp=0xc00048afc8 sp=0xc00048af38 pc=0x4483a9
runtime.gcBgMarkStartWorkers.gowrap1()
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x25 fp=0xc00048afe0 sp=0xc00048afc8 pc=0x448285
runtime.goexit({})
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00048afe8 sp=0xc00048afe0 pc=0x4a22e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x105

goroutine 7 gp=0xc000003dc0 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc000097738 sp=0xc000097718 pc=0x49ac8e
runtime.gcBgMarkWorker(0xc0001115e0)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1423 +0xe9 fp=0xc0000977c8 sp=0xc000097738 pc=0x4483a9
runtime.gcBgMarkStartWorkers.gowrap1()
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x25 fp=0xc0000977e0 sp=0xc0000977c8 pc=0x448285
runtime.goexit({})
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0000977e8 sp=0xc0000977e0 pc=0x4a22e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x105

goroutine 22 gp=0xc000103880 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc000091f38 sp=0xc000091f18 pc=0x49ac8e
runtime.gcBgMarkWorker(0xc0001115e0)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1423 +0xe9 fp=0xc000091fc8 sp=0xc000091f38 pc=0x4483a9
runtime.gcBgMarkStartWorkers.gowrap1()
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x25 fp=0xc000091fe0 sp=0xc000091fc8 pc=0x448285
runtime.goexit({})
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000091fe8 sp=0xc000091fe0 pc=0x4a22e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x105

goroutine 36 gp=0xc000484380 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc00048b738 sp=0xc00048b718 pc=0x49ac8e
runtime.gcBgMarkWorker(0xc0001115e0)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1423 +0xe9 fp=0xc00048b7c8 sp=0xc00048b738 pc=0x4483a9
runtime.gcBgMarkStartWorkers.gowrap1()
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x25 fp=0xc00048b7e0 sp=0xc00048b7c8 pc=0x448285
runtime.goexit({})
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00048b7e8 sp=0xc00048b7e0 pc=0x4a22e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x105

goroutine 8 gp=0xc0000ce000 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc000097f38 sp=0xc000097f18 pc=0x49ac8e
runtime.gcBgMarkWorker(0xc0001115e0)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1423 +0xe9 fp=0xc000097fc8 sp=0xc000097f38 pc=0x4483a9
runtime.gcBgMarkStartWorkers.gowrap1()
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x25 fp=0xc000097fe0 sp=0xc000097fc8 pc=0x448285
runtime.goexit({})
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000097fe8 sp=0xc000097fe0 pc=0x4a22e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x105

goroutine 9 gp=0xc0000ce1c0 m=nil [GC worker (idle)]:
runtime.gopark(0x2190920?, 0x1?, 0xc9?, 0xa8?, 0x0?)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc000486738 sp=0xc000486718 pc=0x49ac8e
runtime.gcBgMarkWorker(0xc0001115e0)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1423 +0xe9 fp=0xc0004867c8 sp=0xc000486738 pc=0x4483a9
runtime.gcBgMarkStartWorkers.gowrap1()
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x25 fp=0xc0004867e0 sp=0xc0004867c8 pc=0x448285
runtime.goexit({})
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0004867e8 sp=0xc0004867e0 pc=0x4a22e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x105

goroutine 10 gp=0xc0000ce380 m=nil [GC worker (idle)]:
runtime.gopark(0xb1b4666682e?, 0x1?, 0xf?, 0x26?, 0x0?)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc000486f38 sp=0xc000486f18 pc=0x49ac8e
runtime.gcBgMarkWorker(0xc0001115e0)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1423 +0xe9 fp=0xc000486fc8 sp=0xc000486f38 pc=0x4483a9
runtime.gcBgMarkStartWorkers.gowrap1()
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x25 fp=0xc000486fe0 sp=0xc000486fc8 pc=0x448285
runtime.goexit({})
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000486fe8 sp=0xc000486fe0 pc=0x4a22e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x105

goroutine 11 gp=0xc0000ce540 m=nil [GC worker (idle)]:
runtime.gopark(0xb1b46671783?, 0x3?, 0xcb?, 0x6e?, 0x0?)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc000487738 sp=0xc000487718 pc=0x49ac8e
runtime.gcBgMarkWorker(0xc0001115e0)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1423 +0xe9 fp=0xc0004877c8 sp=0xc000487738 pc=0x4483a9
runtime.gcBgMarkStartWorkers.gowrap1()
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x25 fp=0xc0004877e0 sp=0xc0004877c8 pc=0x448285
runtime.goexit({})
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0004877e8 sp=0xc0004877e0 pc=0x4a22e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x105

goroutine 12 gp=0xc0000ce700 m=nil [GC worker (idle)]:
runtime.gopark(0x2190920?, 0x1?, 0x19?, 0x8c?, 0x0?)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc000487f38 sp=0xc000487f18 pc=0x49ac8e
runtime.gcBgMarkWorker(0xc0001115e0)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1423 +0xe9 fp=0xc000487fc8 sp=0xc000487f38 pc=0x4483a9
runtime.gcBgMarkStartWorkers.gowrap1()
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x25 fp=0xc000487fe0 sp=0xc000487fc8 pc=0x448285
runtime.goexit({})
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000487fe8 sp=0xc000487fe0 pc=0x4a22e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x105

goroutine 13 gp=0xc0000ce8c0 m=nil [GC worker (idle)]:
runtime.gopark(0x2190920?, 0x1?, 0x52?, 0xfd?, 0x0?)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc000488738 sp=0xc000488718 pc=0x49ac8e
runtime.gcBgMarkWorker(0xc0001115e0)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1423 +0xe9 fp=0xc0004887c8 sp=0xc000488738 pc=0x4483a9
runtime.gcBgMarkStartWorkers.gowrap1()
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x25 fp=0xc0004887e0 sp=0xc0004887c8 pc=0x448285
runtime.goexit({})
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0004887e8 sp=0xc0004887e0 pc=0x4a22e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x105

goroutine 37 gp=0xc000484540 m=nil [GC worker (idle)]:
runtime.gopark(0xb1b46672124?, 0x1?, 0x57?, 0xfc?, 0x0?)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc00048bf38 sp=0xc00048bf18 pc=0x49ac8e
runtime.gcBgMarkWorker(0xc0001115e0)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1423 +0xe9 fp=0xc00048bfc8 sp=0xc00048bf38 pc=0x4483a9
runtime.gcBgMarkStartWorkers.gowrap1()
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x25 fp=0xc00048bfe0 sp=0xc00048bfc8 pc=0x448285
runtime.goexit({})
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00048bfe8 sp=0xc00048bfe0 pc=0x4a22e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x105

goroutine 51 gp=0xc000103340 m=nil [sync.WaitGroup.Wait]:
runtime.gopark(0x0?, 0x0?, 0x20?, 0x3f?, 0x0?)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc00048ce20 sp=0xc00048ce00 pc=0x49ac8e
runtime.goparkunlock(...)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:441
runtime.semacquire1(0xc000326008, 0x0, 0x1, 0x0, 0x18)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/sema.go:188 +0x21d fp=0xc00048ce88 sp=0xc00048ce20 pc=0x47a47d
sync.runtime_SemacquireWaitGroup(0x0?)
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/sema.go:110 +0x25 fp=0xc00048cec0 sp=0xc00048ce88 pc=0x49c685
sync.(*WaitGroup).Wait(0x0?)
	/root/go/pkg/mod/golang.org/[email protected]/src/sync/waitgroup.go:118 +0x48 fp=0xc00048cee8 sp=0xc00048cec0 pc=0x4adc28
github.com/ollama/ollama/runner/llamarunner.(*Server).run(0xc000326000, {0x15f4210, 0xc00017ceb0})
	/home/runner/_work/llm.cpp/llm.cpp/ollama-internal/runner/llamarunner/runner.go:314 +0x47 fp=0xc00048cfb8 sp=0xc00048cee8 pc=0x901d67
github.com/ollama/ollama/runner/llamarunner.Execute.gowrap2()
	/home/runner/_work/llm.cpp/llm.cpp/ollama-internal/runner/llamarunner/runner.go:892 +0x28 fp=0xc00048cfe0 sp=0xc00048cfb8 pc=0x9069c8
runtime.goexit({})
	/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00048cfe8 sp=0xc00048cfe0 pc=0x4a22e1
created by github.com/ollama/ollama/runner/llamarunner.Execute in goroutine 1
	/home/runner/_work/llm.cpp/llm.cpp/ollama-internal/runner/llamarunner/runner.go:892 +0xe05

rax    0x0
rbx    0x563e
rcx    0x7804e6ea49bc
rdx    0x6
rdi    0x5636
rsi    0x563e
rbp    0x78047a7fa6e0
rsp    0x78047a7fa6a0
r8     0x0
r9     0x0
r10    0xf11ed7d
r11    0x246
r12    0x6
r13    0x109
r14    0x16
r15    0xaf0000
rip    0x7804e6ea49bc
rflags 0x246
cs     0x33
fs     0x0
gs     0x0
time=2025-08-02T17:03:03.596-03:00 level=ERROR source=server.go:484 msg="llama runner terminated" error="exit status 2"
time=2025-08-02T17:03:03.643-03:00 level=ERROR source=sched.go:489 msg="error loading llama server" error="llama runner process has terminated: GGML_ASSERT(offset + size <= ggml_nbytes(tensor) && \"tensor write out of bounds\") failed"
[GIN] 2025/08/02 - 17:03:03 | 500 | 10.568882015s |       127.0.0.1 | POST     "/api/generate"
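
For readers unfamiliar with the message, the assertion reported in the log above, `offset + size <= ggml_nbytes(tensor)`, checks that a write into a tensor's buffer stays within the tensor's byte size. Below is a minimal Python sketch of that condition only. The function names and the numbers are hypothetical, chosen for illustration; this is not the ggml implementation and it does not identify why the check fails for this model.

```python
def check_tensor_write(offset: int, size: int, tensor_nbytes: int) -> None:
    """Raise if writing `size` bytes at `offset` would run past the tensor.

    Mirrors the condition in the failed assertion:
    offset + size <= ggml_nbytes(tensor)
    """
    if offset + size > tensor_nbytes:
        raise ValueError(
            "tensor write out of bounds: "
            f"offset={offset} + size={size} > nbytes={tensor_nbytes}"
        )


def write_fits(offset: int, size: int, tensor_nbytes: int) -> bool:
    """Return True when the write fits, False when the check would fail."""
    try:
        check_tensor_write(offset, size, tensor_nbytes)
        return True
    except ValueError:
        return False


# Hypothetical sizes: a 1024-byte tensor.
print(write_fits(0, 1024, 1024))    # the write exactly fills the tensor
print(write_fits(512, 1024, 1024))  # the write runs past the end
```

In the real runner a failed check of this kind aborts the process, which matches the "llama runner terminated" line that follows the assertion in the log.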

WizardlyBump17 · Aug 02 '25 20:08

I talked to the development team. They just released https://github.com/ipex-llm/ipex-llm/releases/download/v2.3.0-nightly/ollama-ipex-llm-2.3.0b20250725-win.zip . Could you try it and see if it works?

Ellie-Williams-007 · Aug 05 '25 01:08

Could you please install https://github.com/ipex-llm/ipex-llm/releases/download/v2.3.0-nightly/ollama-ipex-llm-2.3.0b20250725-win.zip and see if it works?

Looks like it is the same error.

time=2025-08-05T00:08:42.168-03:00 level=INFO source=server.go:135 msg="system memory" total="22.9 GiB" free="16.9 GiB" free_swap="16.0 GiB"
time=2025-08-05T00:08:42.169-03:00 level=INFO source=server.go:187 msg=offload library=cpu layers.requested=-1 layers.model=49 layers.offload=0 layers.split="" memory.available="[16.9 GiB]" memory.gpu_overhead="0 B" memory.required.full="6.3 GiB" memory.required.partial="0 B" memory.required.kv="768.0 MiB" memory.required.allocations="[6.3 GiB]" memory.weights.total="5.1 GiB" memory.weights.repeating="4.5 GiB" memory.weights.nonrepeating="609.1 MiB" memory.graph.full="348.0 MiB" memory.graph.partial="916.1 MiB"
llama_model_loader: loaded meta data with 33 key-value pairs and 579 tensors from /root/.ollama/models/blobs/sha256-3d56bdc5fb9286615ef0f4ab59e1471fbe47d43d67fa7be46efb725cc9650add (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = qwen2
llama_model_loader: - kv   1:                               general.type str              = model
llama_model_loader: - kv   2:                               general.name str              = Qwen2.5 Coder 14B
llama_model_loader: - kv   3:                           general.basename str              = Qwen2.5-Coder
llama_model_loader: - kv   4:                         general.size_label str              = 14B
llama_model_loader: - kv   5:                            general.license str              = apache-2.0
llama_model_loader: - kv   6:                       general.license.link str              = https://huggingface.co/Qwen/Qwen2.5-C...
llama_model_loader: - kv   7:                   general.base_model.count u32              = 1
llama_model_loader: - kv   8:                  general.base_model.0.name str              = Qwen2.5 14B
llama_model_loader: - kv   9:          general.base_model.0.organization str              = Qwen
llama_model_loader: - kv  10:              general.base_model.0.repo_url str              = https://huggingface.co/Qwen/Qwen2.5-14B
llama_model_loader: - kv  11:                               general.tags arr[str,5]       = ["code", "qwen", "qwen-coder", "codeq...
llama_model_loader: - kv  12:                          general.languages arr[str,1]       = ["en"]
llama_model_loader: - kv  13:                          qwen2.block_count u32              = 48
llama_model_loader: - kv  14:                       qwen2.context_length u32              = 32768
llama_model_loader: - kv  15:                     qwen2.embedding_length u32              = 5120
llama_model_loader: - kv  16:                  qwen2.feed_forward_length u32              = 13824
llama_model_loader: - kv  17:                 qwen2.attention.head_count u32              = 40
llama_model_loader: - kv  18:              qwen2.attention.head_count_kv u32              = 8
llama_model_loader: - kv  19:                       qwen2.rope.freq_base f32              = 1000000.000000
llama_model_loader: - kv  20:     qwen2.attention.layer_norm_rms_epsilon f32              = 0.000010
llama_model_loader: - kv  21:                          general.file_type u32              = 10
llama_model_loader: - kv  22:                       tokenizer.ggml.model str              = gpt2
llama_model_loader: - kv  23:                         tokenizer.ggml.pre str              = qwen2
llama_model_loader: - kv  24:                      tokenizer.ggml.tokens arr[str,152064]  = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv  25:                  tokenizer.ggml.token_type arr[i32,152064]  = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv  26:                      tokenizer.ggml.merges arr[str,151387]  = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
llama_model_loader: - kv  27:                tokenizer.ggml.eos_token_id u32              = 151645
llama_model_loader: - kv  28:            tokenizer.ggml.padding_token_id u32              = 151643
llama_model_loader: - kv  29:                tokenizer.ggml.bos_token_id u32              = 151643
llama_model_loader: - kv  30:               tokenizer.ggml.add_bos_token bool             = false
llama_model_loader: - kv  31:                    tokenizer.chat_template str              = {%- if tools %}\n    {{- '<|im_start|>...
llama_model_loader: - kv  32:               general.quantization_version u32              = 2
llama_model_loader: - type  f32:  241 tensors
llama_model_loader: - type q2_K:  193 tensors
llama_model_loader: - type q3_K:   96 tensors
llama_model_loader: - type q4_K:   48 tensors
llama_model_loader: - type q6_K:    1 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type   = Q2_K - Medium
print_info: file size   = 5.37 GiB (3.12 BPW) 
load: special tokens cache size = 22
load: token to piece cache size = 0.9310 MB
print_info: arch             = qwen2
print_info: vocab_only       = 1
print_info: model type       = ?B
print_info: model params     = 14.77 B
print_info: general.name     = Qwen2.5 Coder 14B
print_info: vocab type       = BPE
print_info: n_vocab          = 152064
print_info: n_merges         = 151387
print_info: BOS token        = 151643 '<|endoftext|>'
print_info: EOS token        = 151645 '<|im_end|>'
print_info: EOT token        = 151645 '<|im_end|>'
print_info: PAD token        = 151643 '<|endoftext|>'
print_info: LF token         = 198 'Ċ'
print_info: FIM PRE token    = 151659 '<|fim_prefix|>'
print_info: FIM SUF token    = 151661 '<|fim_suffix|>'
print_info: FIM MID token    = 151660 '<|fim_middle|>'
print_info: FIM PAD token    = 151662 '<|fim_pad|>'
print_info: FIM REP token    = 151663 '<|repo_name|>'
print_info: FIM SEP token    = 151664 '<|file_sep|>'
print_info: EOG token        = 151643 '<|endoftext|>'
print_info: EOG token        = 151645 '<|im_end|>'
print_info: EOG token        = 151662 '<|fim_pad|>'
print_info: EOG token        = 151663 '<|repo_name|>'
print_info: EOG token        = 151664 '<|file_sep|>'
print_info: max token length = 256
llama_model_load: vocab only - skipping tensors
time=2025-08-05T00:08:42.350-03:00 level=INFO source=server.go:458 msg="starting llama server" cmd="/home/davi/AI/ollama-ipex/ollama-ipex-llm-2.3.0b20250725-ubuntu/ollama-bin runner --model /root/.ollama/models/blobs/sha256-3d56bdc5fb9286615ef0f4ab59e1471fbe47d43d67fa7be46efb725cc9650add --ctx-size 4096 --batch-size 512 --n-gpu-layers 999 --threads 8 --no-mmap --parallel 2 --port 38897"
time=2025-08-05T00:08:42.350-03:00 level=INFO source=sched.go:483 msg="loaded runners" count=1
time=2025-08-05T00:08:42.350-03:00 level=INFO source=server.go:618 msg="waiting for llama runner to start responding"
time=2025-08-05T00:08:42.351-03:00 level=INFO source=server.go:652 msg="waiting for server to become available" status="llm server not responding"
using override patterns: []
time=2025-08-05T00:08:42.397-03:00 level=INFO source=runner.go:851 msg="starting go runner"
load_backend: loaded SYCL backend from /home/davi/AI/ollama-ipex/ollama-ipex-llm-2.3.0b20250725-ubuntu/libggml-sycl.so
load_backend: loaded CPU backend from /home/davi/AI/ollama-ipex/ollama-ipex-llm-2.3.0b20250725-ubuntu/libggml-cpu-haswell.so
time=2025-08-05T00:08:42.446-03:00 level=INFO source=ggml.go:104 msg=system CPU.0.SSE3=1 CPU.0.SSSE3=1 CPU.0.AVX=1 CPU.0.AVX2=1 CPU.0.F16C=1 CPU.0.FMA=1 CPU.0.BMI2=1 CPU.0.LLAMAFILE=1 CPU.0.OPENMP=1 CPU.0.AARCH64_REPACK=1 CPU.1.LLAMAFILE=1 compiler=cgo(gcc)
time=2025-08-05T00:08:42.446-03:00 level=INFO source=runner.go:911 msg="Server listening on 127.0.0.1:38897"
llama_model_load_from_file_impl: using device SYCL0 (Intel(R) Arc(TM) B580 Graphics) - 11241 MiB free
llama_model_loader: loaded meta data with 33 key-value pairs and 579 tensors from /root/.ollama/models/blobs/sha256-3d56bdc5fb9286615ef0f4ab59e1471fbe47d43d67fa7be46efb725cc9650add (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = qwen2
llama_model_loader: - kv   1:                               general.type str              = model
llama_model_loader: - kv   2:                               general.name str              = Qwen2.5 Coder 14B
llama_model_loader: - kv   3:                           general.basename str              = Qwen2.5-Coder
llama_model_loader: - kv   4:                         general.size_label str              = 14B
llama_model_loader: - kv   5:                            general.license str              = apache-2.0
llama_model_loader: - kv   6:                       general.license.link str              = https://huggingface.co/Qwen/Qwen2.5-C...
llama_model_loader: - kv   7:                   general.base_model.count u32              = 1
llama_model_loader: - kv   8:                  general.base_model.0.name str              = Qwen2.5 14B
llama_model_loader: - kv   9:          general.base_model.0.organization str              = Qwen
llama_model_loader: - kv  10:              general.base_model.0.repo_url str              = https://huggingface.co/Qwen/Qwen2.5-14B
llama_model_loader: - kv  11:                               general.tags arr[str,5]       = ["code", "qwen", "qwen-coder", "codeq...
llama_model_loader: - kv  12:                          general.languages arr[str,1]       = ["en"]
llama_model_loader: - kv  13:                          qwen2.block_count u32              = 48
llama_model_loader: - kv  14:                       qwen2.context_length u32              = 32768
llama_model_loader: - kv  15:                     qwen2.embedding_length u32              = 5120
llama_model_loader: - kv  16:                  qwen2.feed_forward_length u32              = 13824
llama_model_loader: - kv  17:                 qwen2.attention.head_count u32              = 40
llama_model_loader: - kv  18:              qwen2.attention.head_count_kv u32              = 8
llama_model_loader: - kv  19:                       qwen2.rope.freq_base f32              = 1000000.000000
llama_model_loader: - kv  20:     qwen2.attention.layer_norm_rms_epsilon f32              = 0.000010
llama_model_loader: - kv  21:                          general.file_type u32              = 10
llama_model_loader: - kv  22:                       tokenizer.ggml.model str              = gpt2
llama_model_loader: - kv  23:                         tokenizer.ggml.pre str              = qwen2
llama_model_loader: - kv  24:                      tokenizer.ggml.tokens arr[str,152064]  = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv  25:                  tokenizer.ggml.token_type arr[i32,152064]  = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv  26:                      tokenizer.ggml.merges arr[str,151387]  = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
llama_model_loader: - kv  27:                tokenizer.ggml.eos_token_id u32              = 151645
llama_model_loader: - kv  28:            tokenizer.ggml.padding_token_id u32              = 151643
llama_model_loader: - kv  29:                tokenizer.ggml.bos_token_id u32              = 151643
llama_model_loader: - kv  30:               tokenizer.ggml.add_bos_token bool             = false
llama_model_loader: - kv  31:                    tokenizer.chat_template str              = {%- if tools %}\n    {{- '<|im_start|>...
llama_model_loader: - kv  32:               general.quantization_version u32              = 2
llama_model_loader: - type  f32:  241 tensors
llama_model_loader: - type q2_K:  193 tensors
llama_model_loader: - type q3_K:   96 tensors
llama_model_loader: - type q4_K:   48 tensors
llama_model_loader: - type q6_K:    1 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type   = Q2_K - Medium
print_info: file size   = 5.37 GiB (3.12 BPW) 
time=2025-08-05T00:08:42.602-03:00 level=INFO source=server.go:652 msg="waiting for server to become available" status="llm server loading model"
load: special tokens cache size = 22
load: token to piece cache size = 0.9310 MB
print_info: arch             = qwen2
print_info: vocab_only       = 0
print_info: n_ctx_train      = 32768
print_info: n_embd           = 5120
print_info: n_layer          = 48
print_info: n_head           = 40
print_info: n_head_kv        = 8
print_info: n_rot            = 128
print_info: n_swa            = 0
print_info: n_swa_pattern    = 1
print_info: n_embd_head_k    = 128
print_info: n_embd_head_v    = 128
print_info: n_gqa            = 5
print_info: n_embd_k_gqa     = 1024
print_info: n_embd_v_gqa     = 1024
print_info: f_norm_eps       = 0.0e+00
print_info: f_norm_rms_eps   = 1.0e-05
print_info: f_clamp_kqv      = 0.0e+00
print_info: f_max_alibi_bias = 0.0e+00
print_info: f_logit_scale    = 0.0e+00
print_info: f_attn_scale     = 0.0e+00
print_info: n_ff             = 13824
print_info: n_expert         = 0
print_info: n_expert_used    = 0
print_info: causal attn      = 1
print_info: pooling type     = -1
print_info: rope type        = 2
print_info: rope scaling     = linear
print_info: freq_base_train  = 1000000.0
print_info: freq_scale_train = 1
print_info: n_ctx_orig_yarn  = 32768
print_info: rope_finetuned   = unknown
print_info: ssm_d_conv       = 0
print_info: ssm_d_inner      = 0
print_info: ssm_d_state      = 0
print_info: ssm_dt_rank      = 0
print_info: ssm_dt_b_c_rms   = 0
print_info: model type       = 14B
print_info: model params     = 14.77 B
print_info: general.name     = Qwen2.5 Coder 14B
print_info: vocab type       = BPE
print_info: n_vocab          = 152064
print_info: n_merges         = 151387
print_info: BOS token        = 151643 '<|endoftext|>'
print_info: EOS token        = 151645 '<|im_end|>'
print_info: EOT token        = 151645 '<|im_end|>'
print_info: PAD token        = 151643 '<|endoftext|>'
print_info: LF token         = 198 'Ċ'
print_info: FIM PRE token    = 151659 '<|fim_prefix|>'
print_info: FIM SUF token    = 151661 '<|fim_suffix|>'
print_info: FIM MID token    = 151660 '<|fim_middle|>'
print_info: FIM PAD token    = 151662 '<|fim_pad|>'
print_info: FIM REP token    = 151663 '<|repo_name|>'
print_info: FIM SEP token    = 151664 '<|file_sep|>'
print_info: EOG token        = 151643 '<|endoftext|>'
print_info: EOG token        = 151645 '<|im_end|>'
print_info: EOG token        = 151662 '<|fim_pad|>'
print_info: EOG token        = 151663 '<|repo_name|>'
print_info: EOG token        = 151664 '<|file_sep|>'
print_info: max token length = 256
load_tensors: loading model tensors, this can take a while... (mmap = false)
load_tensors: offloading 48 repeating layers to GPU
load_tensors: offloading output layer to GPU
load_tensors: offloaded 49/49 layers to GPU
load_tensors:          CPU model buffer size =   243.63 MiB
load_tensors:        SYCL0 model buffer size =  5288.54 MiB
ggml-backend.cpp:265: GGML_ASSERT(offset + size <= ggml_nbytes(tensor) && "tensor write out of bounds") failed
[New LWP 10361]
[New LWP 10353]
[New LWP 10352]
[New LWP 10351]
[New LWP 10350]
[New LWP 10349]
[New LWP 10348]
[New LWP 10347]
[New LWP 10346]
[New LWP 10345]
warning: Missing auto-load script at offset 0 in section .debug_gdb_scripts
of file /home/davi/AI/ollama-ipex/ollama-ipex-llm-2.3.0b20250725-ubuntu/ollama-bin.
Use `info auto-load python-scripts [REGEXP]' to list them.
Function(s) ^std::(move|forward|as_const|(__)?addressof) will be skipped when stepping.
Function(s) ^std::(shared|unique)_ptr<.*>::(get|operator) will be skipped when stepping.
Function(s) ^std::(basic_string|vector|array|deque|(forward_)?list|(unordered_|flat_)?(multi)?(map|set)|span)<.*>::(c?r?(begin|end)|front|back|data|size|empty) will be skipped when stepping.
Function(s) ^std::(basic_string|vector|array|deque|span)<.*>::operator.] will be skipped when stepping.
time=2025-08-05T00:08:51.072-03:00 level=INFO source=server.go:652 msg="waiting for server to become available" status="llm server not responding"
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
runtime.futex () at /usr/local/go/src/runtime/sys_linux_amd64.s:558
warning: 558	/usr/local/go/src/runtime/sys_linux_amd64.s: No such file or directory
#0  runtime.futex () at /usr/local/go/src/runtime/sys_linux_amd64.s:558
558	in /usr/local/go/src/runtime/sys_linux_amd64.s
#1  0x000000000044cff0 in runtime.futexsleep (addr=0xfffffffffffffe00, val=0, ns=4786851) at /usr/local/go/src/runtime/os_linux.go:75
warning: 75	/usr/local/go/src/runtime/os_linux.go: No such file or directory
#2  0x0000000000429087 in runtime.notesleep (n=0x1ecf900 <runtime.m0+320>) at /usr/local/go/src/runtime/lock_futex.go:47
warning: 47	/usr/local/go/src/runtime/lock_futex.go: No such file or directory
#3  0x00000000004587ac in runtime.mPark () at /usr/local/go/src/runtime/proc.go:1887
warning: 1887	/usr/local/go/src/runtime/proc.go: No such file or directory
#4  runtime.stopm () at /usr/local/go/src/runtime/proc.go:2910
2910	in /usr/local/go/src/runtime/proc.go
#5  0x000000000045a27c in runtime.findRunnable (gp=<optimized out>, inheritTime=<optimized out>, tryWakeP=<optimized out>) at /usr/local/go/src/runtime/proc.go:3647
3647	in /usr/local/go/src/runtime/proc.go
#6  0x000000000045b371 in runtime.schedule () at /usr/local/go/src/runtime/proc.go:4020
4020	in /usr/local/go/src/runtime/proc.go
#7  0x000000000045b825 in runtime.park_m (gp=0xc0001028c0) at /usr/local/go/src/runtime/proc.go:4144
4144	in /usr/local/go/src/runtime/proc.go
#8  0x000000000048cc6e in runtime.mcall () at /usr/local/go/src/runtime/asm_amd64.s:459
warning: 459	/usr/local/go/src/runtime/asm_amd64.s: No such file or directory
#9  0x00007ffd3dfa8388 in ?? ()
#10 0x000000000049167f in runtime.newproc (fn=0x48cb6f <runtime.rt0_go+303>) at <autogenerated>:1
warning: 1	<autogenerated>: No such file or directory
#11 0x000000000048cbe5 in runtime.mstart () at /usr/local/go/src/runtime/asm_amd64.s:395
warning: 395	/usr/local/go/src/runtime/asm_amd64.s: No such file or directory
#12 0x000000000048cb6f in runtime.rt0_go () at /usr/local/go/src/runtime/asm_amd64.s:358
358	in /usr/local/go/src/runtime/asm_amd64.s
#13 0x0000000000000011 in ?? ()
#14 0x00007ffd3dfa84e8 in ?? ()
#15 0x0000000000000006 in ?? ()
#16 0x0000000000000011 in ?? ()
#17 0x00007ffd3dfa84e8 in ?? ()
#18 0x00007cf4cc62a578 in __libc_start_call_main (main=0x0, argc=0, argv=0x0) at ../sysdeps/nptl/libc_start_call_main.h:58
warning: 58	../sysdeps/nptl/libc_start_call_main.h: No such file or directory
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
[Inferior 1 (process 10344) detached]
time=2025-08-05T00:08:52.415-03:00 level=INFO source=server.go:652 msg="waiting for server to become available" status="llm server loading model"
SIGABRT: abort
PC=0x7cf4cc6a49bc m=4 sigcode=18446744073709551610
signal arrived during cgo execution

goroutine 13 gp=0xc000582a80 m=4 mp=0xc00008b808 [syscall]:
runtime.cgocall(0x1148360, 0xc00059b890)
	/usr/local/go/src/runtime/cgocall.go:167 +0x4b fp=0xc00059b868 sp=0xc00059b830 pc=0x48398b
github.com/ollama/ollama/llama._Cfunc_llama_model_load_from_file(0x7cf464000d50, {0x0, 0x0, 0x3e7, 0x1, 0x0, 0x0, 0x11479d0, 0xc000352040, 0x0, ...})
	_cgo_gotypes.go:876 +0x47 fp=0xc00059b890 sp=0xc00059b868 pc=0x833dc7
github.com/ollama/ollama/llama.LoadModelFromFile.func4(...)
	/home/arda/ruonan/ollama-internal/llama/llama.go:296
github.com/ollama/ollama/llama.LoadModelFromFile({0x7ffd3dfa934f, 0x62}, {0x3e7, 0x0, 0x0, {0x0, 0x0, 0x0}, 0xc000503860, 0x0, ...})
	/home/arda/ruonan/ollama-internal/llama/llama.go:296 +0x4d7 fp=0xc00059bd80 sp=0xc00059b890 pc=0x835ff7
github.com/ollama/ollama/runner/llamarunner.(*Server).loadModel(0xc000114360, {0x3e7, 0x0, 0x0, {0x0, 0x0, 0x0}, 0xc000503860, 0x0, {0x0, ...}}, ...)
	/home/arda/ruonan/ollama-internal/runner/llamarunner/runner.go:749 +0x9e fp=0xc00059bee8 sp=0xc00059bd80 pc=0x8f2f3e
github.com/ollama/ollama/runner/llamarunner.Execute.gowrap1()
	/home/arda/ruonan/ollama-internal/runner/llamarunner/runner.go:885 +0x115 fp=0xc00059bfe0 sp=0xc00059bee8 pc=0x8f4a15
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00059bfe8 sp=0xc00059bfe0 pc=0x48eca1
created by github.com/ollama/ollama/runner/llamarunner.Execute in goroutine 1
	/home/arda/ruonan/ollama-internal/runner/llamarunner/runner.go:885 +0xd2a

goroutine 1 gp=0xc000002380 m=nil [IO wait]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc0004b55b8 sp=0xc0004b5598 pc=0x486e0e
runtime.netpollblock(0xc0004b5608?, 0x420706?, 0x0?)
	/usr/local/go/src/runtime/netpoll.go:575 +0xf7 fp=0xc0004b55f0 sp=0xc0004b55b8 pc=0x44c2d7
internal/poll.runtime_pollWait(0x7cf4ccc93eb0, 0x72)
	/usr/local/go/src/runtime/netpoll.go:351 +0x85 fp=0xc0004b5610 sp=0xc0004b55f0 pc=0x486025
internal/poll.(*pollDesc).wait(0xc000055700?, 0x900000036?, 0x0)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc0004b5638 sp=0xc0004b5610 pc=0x50e347
internal/poll.(*pollDesc).waitRead(...)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Accept(0xc000055700)
	/usr/local/go/src/internal/poll/fd_unix.go:620 +0x295 fp=0xc0004b56e0 sp=0xc0004b5638 pc=0x513715
net.(*netFD).accept(0xc000055700)
	/usr/local/go/src/net/fd_unix.go:172 +0x29 fp=0xc0004b5798 sp=0xc0004b56e0 pc=0x585d89
net.(*TCPListener).accept(0xc00052ef40)
	/usr/local/go/src/net/tcpsock_posix.go:159 +0x1b fp=0xc0004b57e8 sp=0xc0004b5798 pc=0x59b6fb
net.(*TCPListener).Accept(0xc00052ef40)
	/usr/local/go/src/net/tcpsock.go:380 +0x30 fp=0xc0004b5818 sp=0xc0004b57e8 pc=0x59a5b0
net/http.(*onceCloseListener).Accept(0xc000114d80?)
	<autogenerated>:1 +0x24 fp=0xc0004b5830 sp=0xc0004b5818 pc=0x7b25a4
net/http.(*Server).Serve(0xc000207500, {0x15e56c8, 0xc00052ef40})
	/usr/local/go/src/net/http/server.go:3424 +0x30c fp=0xc0004b5960 sp=0xc0004b5830 pc=0x789dcc
github.com/ollama/ollama/runner/llamarunner.Execute({0xc000034140, 0xf, 0x10})
	/home/arda/ruonan/ollama-internal/runner/llamarunner/runner.go:912 +0x11e9 fp=0xc0004b5d08 sp=0xc0004b5960 pc=0x8f4609
github.com/ollama/ollama/runner.Execute({0xc000034130?, 0x0?, 0x0?})
	/home/arda/ruonan/ollama-internal/runner/runner.go:22 +0xd4 fp=0xc0004b5d30 sp=0xc0004b5d08 pc=0x979374
github.com/ollama/ollama/cmd.NewCLI.func2(0xc000207200?, {0x140da22?, 0x4?, 0x140da26?})
	/home/arda/ruonan/ollama-internal/cmd/cmd.go:1529 +0x45 fp=0xc0004b5d58 sp=0xc0004b5d30 pc=0x10d5b45
github.com/spf13/cobra.(*Command).execute(0xc00011af08, {0xc0004e0e10, 0xf, 0xf})
	/home/arda/go/pkg/mod/github.com/spf13/[email protected]/command.go:940 +0x894 fp=0xc0004b5e78 sp=0xc0004b5d58 pc=0x5ff694
github.com/spf13/cobra.(*Command).ExecuteC(0xc0004baf08)
	/home/arda/go/pkg/mod/github.com/spf13/[email protected]/command.go:1068 +0x3a5 fp=0xc0004b5f30 sp=0xc0004b5e78 pc=0x5ffee5
github.com/spf13/cobra.(*Command).Execute(...)
	/home/arda/go/pkg/mod/github.com/spf13/[email protected]/command.go:992
github.com/spf13/cobra.(*Command).ExecuteContext(...)
	/home/arda/go/pkg/mod/github.com/spf13/[email protected]/command.go:985
main.main()
	/home/arda/ruonan/ollama-internal/main.go:12 +0x4d fp=0xc0004b5f50 sp=0xc0004b5f30 pc=0x10d65cd
runtime.main()
	/usr/local/go/src/runtime/proc.go:283 +0x28b fp=0xc0004b5fe0 sp=0xc0004b5f50 pc=0x45390b
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0004b5fe8 sp=0xc0004b5fe0 pc=0x48eca1

goroutine 2 gp=0xc000002e00 m=nil [force gc (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000084fa8 sp=0xc000084f88 pc=0x486e0e
runtime.goparkunlock(...)
	/usr/local/go/src/runtime/proc.go:441
runtime.forcegchelper()
	/usr/local/go/src/runtime/proc.go:348 +0xb3 fp=0xc000084fe0 sp=0xc000084fa8 pc=0x453c53
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000084fe8 sp=0xc000084fe0 pc=0x48eca1
created by runtime.init.7 in goroutine 1
	/usr/local/go/src/runtime/proc.go:336 +0x1a

goroutine 3 gp=0xc000003340 m=nil [GC sweep wait]:
runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?)
	/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000085780 sp=0xc000085760 pc=0x486e0e
runtime.goparkunlock(...)
	/usr/local/go/src/runtime/proc.go:441
runtime.bgsweep(0xc0000b0000)
	/usr/local/go/src/runtime/mgcsweep.go:316 +0xdf fp=0xc0000857c8 sp=0xc000085780 pc=0x43e45f
runtime.gcenable.gowrap1()
	/usr/local/go/src/runtime/mgc.go:204 +0x25 fp=0xc0000857e0 sp=0xc0000857c8 pc=0x4328c5
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0000857e8 sp=0xc0000857e0 pc=0x48eca1
created by runtime.gcenable in goroutine 1
	/usr/local/go/src/runtime/mgc.go:204 +0x66

goroutine 4 gp=0xc000003500 m=nil [GC scavenge wait]:
runtime.gopark(0x10000?, 0x15d2cc8?, 0x0?, 0x0?, 0x0?)
	/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000085f78 sp=0xc000085f58 pc=0x486e0e
runtime.goparkunlock(...)
	/usr/local/go/src/runtime/proc.go:441
runtime.(*scavengerState).park(0x1ecc9a0)
	/usr/local/go/src/runtime/mgcscavenge.go:425 +0x49 fp=0xc000085fa8 sp=0xc000085f78 pc=0x43bea9
runtime.bgscavenge(0xc0000b0000)
	/usr/local/go/src/runtime/mgcscavenge.go:658 +0x59 fp=0xc000085fc8 sp=0xc000085fa8 pc=0x43c439
runtime.gcenable.gowrap2()
	/usr/local/go/src/runtime/mgc.go:205 +0x25 fp=0xc000085fe0 sp=0xc000085fc8 pc=0x432865
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000085fe8 sp=0xc000085fe0 pc=0x48eca1
created by runtime.gcenable in goroutine 1
	/usr/local/go/src/runtime/mgc.go:205 +0xa5

goroutine 5 gp=0xc000003dc0 m=nil [finalizer wait]:
runtime.gopark(0x1b8?, 0xc000002380?, 0x1?, 0x23?, 0xc000084688?)
	/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000084630 sp=0xc000084610 pc=0x486e0e
runtime.runfinq()
	/usr/local/go/src/runtime/mfinal.go:196 +0x107 fp=0xc0000847e0 sp=0xc000084630 pc=0x431887
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0000847e8 sp=0xc0000847e0 pc=0x48eca1
created by runtime.createfing in goroutine 1
	/usr/local/go/src/runtime/mfinal.go:166 +0x3d

goroutine 6 gp=0xc0001e48c0 m=nil [chan receive]:
runtime.gopark(0xc0001e1900?, 0xc000116018?, 0x60?, 0x67?, 0x56cac8?)
	/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000086718 sp=0xc0000866f8 pc=0x486e0e
runtime.chanrecv(0xc0000be310, 0x0, 0x1)
	/usr/local/go/src/runtime/chan.go:664 +0x445 fp=0xc000086790 sp=0xc000086718 pc=0x4232a5
runtime.chanrecv1(0x0?, 0x0?)
	/usr/local/go/src/runtime/chan.go:506 +0x12 fp=0xc0000867b8 sp=0xc000086790 pc=0x422e32
runtime.unique_runtime_registerUniqueMapCleanup.func2(...)
	/usr/local/go/src/runtime/mgc.go:1796
runtime.unique_runtime_registerUniqueMapCleanup.gowrap1()
	/usr/local/go/src/runtime/mgc.go:1799 +0x2f fp=0xc0000867e0 sp=0xc0000867b8 pc=0x435a0f
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0000867e8 sp=0xc0000867e0 pc=0x48eca1
created by unique.runtime_registerUniqueMapCleanup in goroutine 1
	/usr/local/go/src/runtime/mgc.go:1794 +0x79

goroutine 7 gp=0xc0001e4e00 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000086f38 sp=0xc000086f18 pc=0x486e0e
runtime.gcBgMarkWorker(0xc0000bf8f0)
	/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc000086fc8 sp=0xc000086f38 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
	/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc000086fe0 sp=0xc000086fc8 pc=0x434c05
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000086fe8 sp=0xc000086fe0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/usr/local/go/src/runtime/mgc.go:1339 +0x105

goroutine 8 gp=0xc0001e4fc0 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000087738 sp=0xc000087718 pc=0x486e0e
runtime.gcBgMarkWorker(0xc0000bf8f0)
	/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc0000877c8 sp=0xc000087738 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
	/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc0000877e0 sp=0xc0000877c8 pc=0x434c05
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0000877e8 sp=0xc0000877e0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/usr/local/go/src/runtime/mgc.go:1339 +0x105

goroutine 18 gp=0xc000102380 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000080738 sp=0xc000080718 pc=0x486e0e
runtime.gcBgMarkWorker(0xc0000bf8f0)
	/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc0000807c8 sp=0xc000080738 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
	/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc0000807e0 sp=0xc0000807c8 pc=0x434c05
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0000807e8 sp=0xc0000807e0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/usr/local/go/src/runtime/mgc.go:1339 +0x105

goroutine 34 gp=0xc000504000 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc00050a738 sp=0xc00050a718 pc=0x486e0e
runtime.gcBgMarkWorker(0xc0000bf8f0)
	/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc00050a7c8 sp=0xc00050a738 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
	/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc00050a7e0 sp=0xc00050a7c8 pc=0x434c05
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00050a7e8 sp=0xc00050a7e0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/usr/local/go/src/runtime/mgc.go:1339 +0x105

goroutine 35 gp=0xc0005041c0 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc00050af38 sp=0xc00050af18 pc=0x486e0e
runtime.gcBgMarkWorker(0xc0000bf8f0)
	/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc00050afc8 sp=0xc00050af38 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
	/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc00050afe0 sp=0xc00050afc8 pc=0x434c05
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00050afe8 sp=0xc00050afe0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/usr/local/go/src/runtime/mgc.go:1339 +0x105

goroutine 36 gp=0xc000504380 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc00050b738 sp=0xc00050b718 pc=0x486e0e
runtime.gcBgMarkWorker(0xc0000bf8f0)
	/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc00050b7c8 sp=0xc00050b738 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
	/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc00050b7e0 sp=0xc00050b7c8 pc=0x434c05
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00050b7e8 sp=0xc00050b7e0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/usr/local/go/src/runtime/mgc.go:1339 +0x105

goroutine 37 gp=0xc000504540 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc00050bf38 sp=0xc00050bf18 pc=0x486e0e
runtime.gcBgMarkWorker(0xc0000bf8f0)
	/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc00050bfc8 sp=0xc00050bf38 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
	/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc00050bfe0 sp=0xc00050bfc8 pc=0x434c05
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00050bfe8 sp=0xc00050bfe0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/usr/local/go/src/runtime/mgc.go:1339 +0x105

goroutine 9 gp=0xc0001e5180 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000087f38 sp=0xc000087f18 pc=0x486e0e
runtime.gcBgMarkWorker(0xc0000bf8f0)
	/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc000087fc8 sp=0xc000087f38 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
	/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc000087fe0 sp=0xc000087fc8 pc=0x434c05
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000087fe8 sp=0xc000087fe0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/usr/local/go/src/runtime/mgc.go:1339 +0x105

goroutine 19 gp=0xc000102540 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000080f38 sp=0xc000080f18 pc=0x486e0e
runtime.gcBgMarkWorker(0xc0000bf8f0)
	/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc000080fc8 sp=0xc000080f38 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
	/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc000080fe0 sp=0xc000080fc8 pc=0x434c05
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000080fe8 sp=0xc000080fe0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/usr/local/go/src/runtime/mgc.go:1339 +0x105

goroutine 10 gp=0xc0001e5340 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000506738 sp=0xc000506718 pc=0x486e0e
runtime.gcBgMarkWorker(0xc0000bf8f0)
	/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc0005067c8 sp=0xc000506738 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
	/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc0005067e0 sp=0xc0005067c8 pc=0x434c05
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0005067e8 sp=0xc0005067e0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/usr/local/go/src/runtime/mgc.go:1339 +0x105

goroutine 38 gp=0xc000504700 m=nil [GC worker (idle)]:
runtime.gopark(0x9307403b5e?, 0x0?, 0x0?, 0x0?, 0x0?)
	/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc00050c738 sp=0xc00050c718 pc=0x486e0e
runtime.gcBgMarkWorker(0xc0000bf8f0)
	/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc00050c7c8 sp=0xc00050c738 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
	/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc00050c7e0 sp=0xc00050c7c8 pc=0x434c05
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00050c7e8 sp=0xc00050c7e0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/usr/local/go/src/runtime/mgc.go:1339 +0x105

goroutine 11 gp=0xc0001e5500 m=nil [GC worker (idle)]:
runtime.gopark(0x93073f2870?, 0x0?, 0x0?, 0x0?, 0x0?)
	/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000506f38 sp=0xc000506f18 pc=0x486e0e
runtime.gcBgMarkWorker(0xc0000bf8f0)
	/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc000506fc8 sp=0xc000506f38 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
	/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc000506fe0 sp=0xc000506fc8 pc=0x434c05
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000506fe8 sp=0xc000506fe0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/usr/local/go/src/runtime/mgc.go:1339 +0x105

goroutine 20 gp=0xc000102700 m=nil [GC worker (idle)]:
runtime.gopark(0x93073f2667?, 0x0?, 0x0?, 0x0?, 0x0?)
	/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000081738 sp=0xc000081718 pc=0x486e0e
runtime.gcBgMarkWorker(0xc0000bf8f0)
	/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc0000817c8 sp=0xc000081738 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
	/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc0000817e0 sp=0xc0000817c8 pc=0x434c05
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0000817e8 sp=0xc0000817e0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/usr/local/go/src/runtime/mgc.go:1339 +0x105

goroutine 39 gp=0xc0005048c0 m=nil [GC worker (idle)]:
runtime.gopark(0x1f7a980?, 0x1?, 0x4f?, 0x4f?, 0x0?)
	/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc00050cf38 sp=0xc00050cf18 pc=0x486e0e
runtime.gcBgMarkWorker(0xc0000bf8f0)
	/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc00050cfc8 sp=0xc00050cf38 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
	/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc00050cfe0 sp=0xc00050cfc8 pc=0x434c05
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00050cfe8 sp=0xc00050cfe0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/usr/local/go/src/runtime/mgc.go:1339 +0x105

goroutine 40 gp=0xc000504a80 m=nil [GC worker (idle)]:
runtime.gopark(0x93073f4927?, 0x1?, 0xf3?, 0x41?, 0x0?)
	/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc00050d738 sp=0xc00050d718 pc=0x486e0e
runtime.gcBgMarkWorker(0xc0000bf8f0)
	/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc00050d7c8 sp=0xc00050d738 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
	/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc00050d7e0 sp=0xc00050d7c8 pc=0x434c05
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00050d7e8 sp=0xc00050d7e0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/usr/local/go/src/runtime/mgc.go:1339 +0x105

goroutine 12 gp=0xc0001e56c0 m=nil [GC worker (idle)]:
runtime.gopark(0x93073f619d?, 0x0?, 0x0?, 0x0?, 0x0?)
	/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000507738 sp=0xc000507718 pc=0x486e0e
runtime.gcBgMarkWorker(0xc0000bf8f0)
	/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc0005077c8 sp=0xc000507738 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
	/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc0005077e0 sp=0xc0005077c8 pc=0x434c05
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0005077e8 sp=0xc0005077e0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/usr/local/go/src/runtime/mgc.go:1339 +0x105

goroutine 14 gp=0xc000582c40 m=nil [sync.WaitGroup.Wait]:
runtime.gopark(0x0?, 0x0?, 0x60?, 0xc0?, 0x0?)
	/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000083e20 sp=0xc000083e00 pc=0x486e0e
runtime.goparkunlock(...)
	/usr/local/go/src/runtime/proc.go:441
runtime.semacquire1(0xc000114368, 0x0, 0x1, 0x0, 0x18)
	/usr/local/go/src/runtime/sema.go:188 +0x21d fp=0xc000083e88 sp=0xc000083e20 pc=0x466dfd
sync.runtime_SemacquireWaitGroup(0x0?)
	/usr/local/go/src/runtime/sema.go:110 +0x25 fp=0xc000083ec0 sp=0xc000083e88 pc=0x488805
sync.(*WaitGroup).Wait(0x0?)
	/usr/local/go/src/sync/waitgroup.go:118 +0x48 fp=0xc000083ee8 sp=0xc000083ec0 pc=0x49a5e8
github.com/ollama/ollama/runner/llamarunner.(*Server).run(0xc000114360, {0x15e7cf0, 0xc000514eb0})
	/home/arda/ruonan/ollama-internal/runner/llamarunner/runner.go:314 +0x47 fp=0xc000083fb8 sp=0xc000083ee8 pc=0x8efc67
github.com/ollama/ollama/runner/llamarunner.Execute.gowrap2()
	/home/arda/ruonan/ollama-internal/runner/llamarunner/runner.go:892 +0x28 fp=0xc000083fe0 sp=0xc000083fb8 pc=0x8f48c8
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000083fe8 sp=0xc000083fe0 pc=0x48eca1
created by github.com/ollama/ollama/runner/llamarunner.Execute in goroutine 1
	/home/arda/ruonan/ollama-internal/runner/llamarunner/runner.go:892 +0xe05

goroutine 64 gp=0xc000582e00 m=nil [IO wait]:
runtime.gopark(0x511945?, 0xc000055900?, 0x40?, 0xfa?, 0xb?)
	/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc00014f948 sp=0xc00014f928 pc=0x486e0e
runtime.netpollblock(0x4aa8b8?, 0x420706?, 0x0?)
	/usr/local/go/src/runtime/netpoll.go:575 +0xf7 fp=0xc00014f980 sp=0xc00014f948 pc=0x44c2d7
internal/poll.runtime_pollWait(0x7cf4ccc93a50, 0x72)
	/usr/local/go/src/runtime/netpoll.go:351 +0x85 fp=0xc00014f9a0 sp=0xc00014f980 pc=0x486025
internal/poll.(*pollDesc).wait(0xc000055900?, 0xc000148000?, 0x0)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc00014f9c8 sp=0xc00014f9a0 pc=0x50e347
internal/poll.(*pollDesc).waitRead(...)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0xc000055900, {0xc000148000, 0x1000, 0x1000})
	/usr/local/go/src/internal/poll/fd_unix.go:165 +0x27a fp=0xc00014fa60 sp=0xc00014f9c8 pc=0x50f63a
net.(*netFD).Read(0xc000055900, {0xc000148000?, 0xc00014fad0?, 0x50e805?})
	/usr/local/go/src/net/fd_posix.go:55 +0x25 fp=0xc00014faa8 sp=0xc00014fa60 pc=0x583de5
net.(*conn).Read(0xc000088928, {0xc000148000?, 0x0?, 0x0?})
	/usr/local/go/src/net/net.go:194 +0x45 fp=0xc00014faf0 sp=0xc00014faa8 pc=0x592185
net/http.(*connReader).Read(0xc000119140, {0xc000148000, 0x1000, 0x1000})
	/usr/local/go/src/net/http/server.go:798 +0x159 fp=0xc00014fb40 sp=0xc00014faf0 pc=0x77ec79
bufio.(*Reader).fill(0xc000110660)
	/usr/local/go/src/bufio/bufio.go:113 +0x103 fp=0xc00014fb78 sp=0xc00014fb40 pc=0x5a9903
bufio.(*Reader).Peek(0xc000110660, 0x4)
	/usr/local/go/src/bufio/bufio.go:152 +0x53 fp=0xc00014fb98 sp=0xc00014fb78 pc=0x5a9a33
net/http.(*conn).serve(0xc000114d80, {0x15e7cb8, 0xc000118720})
	/usr/local/go/src/net/http/server.go:2137 +0x785 fp=0xc00014ffb8 sp=0xc00014fb98 pc=0x784a65
net/http.(*Server).Serve.gowrap3()
	/usr/local/go/src/net/http/server.go:3454 +0x28 fp=0xc00014ffe0 sp=0xc00014ffb8 pc=0x78a1c8
runtime.goexit({})
	/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00014ffe8 sp=0xc00014ffe0 pc=0x48eca1
created by net/http.(*Server).Serve in goroutine 1
	/usr/local/go/src/net/http/server.go:3454 +0x485

rax    0x0
rbx    0x286b
rcx    0x7cf4cc6a49bc
rdx    0x6
rdi    0x2868
rsi    0x286b
rbp    0x7cf46e9fb6e0
rsp    0x7cf46e9fb6a0
r8     0x0
r9     0x0
r10    0xf11ed7d
r11    0x246
r12    0x6
r13    0x1652e66
r14    0x16
r15    0xaf0000
rip    0x7cf4cc6a49bc
rflags 0x246
cs     0x33
fs     0x0
gs     0x0
time=2025-08-05T00:08:52.591-03:00 level=ERROR source=server.go:484 msg="llama runner terminated" error="exit status 2"
time=2025-08-05T00:08:52.666-03:00 level=ERROR source=sched.go:489 msg="error loading llama server" error="llama runner process has terminated: GGML_ASSERT(offset + size <= ggml_nbytes(tensor) && \"tensor write out of bounds\") failed"
[GIN] 2025/08/05 - 00:08:52 | 500 | 10.562076295s |       127.0.0.1 | POST     "/api/generate"

WizardlyBump17, Aug 05 '25 03:08
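
For readers unfamiliar with the assertion in the log, `offset + size <= ggml_nbytes(tensor)` requires that a write of `size` bytes starting at `offset` fits inside the tensor's byte size. The snippet below is only a rough illustration of that condition, not ggml code: `write_fits` is a hypothetical helper and the numbers are made up.

```python
def write_fits(offset: int, size: int, tensor_nbytes: int) -> bool:
    """Mirror the asserted condition: the byte range [offset, offset + size)
    must lie within a buffer of tensor_nbytes bytes."""
    return offset + size <= tensor_nbytes


# A write that ends exactly at the end of the buffer is allowed.
print(write_fits(offset=0, size=1024, tensor_nbytes=1024))
# A write that runs one byte past the end is the case the assert rejects.
print(write_fits(offset=1, size=1024, tensor_nbytes=1024))
```

Since the abort happens inside `llama_model_load_from_file`, this suggests that during model load some tensor's upload size disagrees with the size computed for that tensor. Which tensor is involved is not visible in this log.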

It works on 2.2.0, so a change made between 2.2.0 and 2.3.0b20250802 must have introduced this issue.
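
One way to narrow this down, sketched under assumptions: if the intermediate builds between 2.2.0 and 2.3.0b20250802 can be installed one at a time, a binary search over them finds the first failing build in a logarithmic number of tries. The `first_bad` helper and the `is_good` callback are hypothetical. `is_good` stands in for installing a build and checking whether `./ollama run qwen2.5-coder:14b-base-q2_K` loads the model.

```python
from typing import Callable, Sequence


def first_bad(builds: Sequence[str], is_good: Callable[[str], bool]) -> str:
    """Return the first build for which is_good() is False.

    Assumes builds are ordered oldest to newest, builds[0] is known good,
    builds[-1] is known bad, and the failure persists once introduced.
    """
    if len(builds) < 2:
        raise ValueError("need at least one known-good and one known-bad build")
    lo, hi = 0, len(builds) - 1  # invariant: builds[lo] good, builds[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(builds[mid]):
            lo = mid  # regression was introduced after this build
        else:
            hi = mid  # regression is at or before this build
    return builds[hi]
```

The list passed as `builds` would have to come from the actual release history. No intermediate version names are assumed here.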

WizardlyBump17, Aug 08 '25 01:08