Can't run qwen2.5-coder:14b-base-q2_K with Ollama
Describe the bug
When I run ./ollama run qwen2.5-coder:14b-base-q2_K with Ollama from ipex_llm[cpp]==2.3.0b20250802, I get the following error:
Error: llama runner process has terminated: GGML_ASSERT(offset + size <= ggml_nbytes(tensor) && "tensor write out of bounds") failed
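For anyone reading this without the ggml source at hand: the condition in the message reads as a bounds check on a write into a tensor's buffer. Below is a minimal Python sketch of that same condition, only to make the failure mode concrete. The names `check_tensor_write` and `tensor_nbytes` are hypothetical and are not part of ggml or Ollama, and the byte counts in the usage lines are made up for illustration.

```python
def check_tensor_write(offset: int, size: int, tensor_nbytes: int) -> bool:
    """Mirror the asserted condition: offset + size <= ggml_nbytes(tensor).

    `tensor_nbytes` stands in for the value ggml_nbytes(tensor) would return;
    it is a plain integer here, not a real ggml call.
    """
    return offset + size <= tensor_nbytes


# A write that ends exactly at the end of the tensor passes the check.
in_bounds = check_tensor_write(offset=0, size=16, tensor_nbytes=16)

# A write that runs past the end is the case the assertion aborts on.
out_of_bounds = check_tensor_write(offset=8, size=16, tensor_nbytes=16)
```

If that reading is right, the loader tried to copy more bytes into a tensor than the backend had allocated for it, which may point to a size mismatch for one of the quantized tensor types rather than to running out of memory. That is a guess from the message alone, not something confirmed.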
How to reproduce
Steps to reproduce the error:
- pip install ipex_llm[cpp]==2.3.0b20250802
- ./ollama run qwen2.5-coder:14b-base-q2_K
Environment information
(ipex-llm) davi@davi:~/AI/ollama-ipex/latest$ bash env-check.sh
-----------------------------------------------------------------
PYTHON_VERSION=3.11.13
-----------------------------------------------------------------
transformers=4.44.2
-----------------------------------------------------------------
torch=2.2.0+cu121
-----------------------------------------------------------------
ipex-llm Version: 2.3.0b20250802
-----------------------------------------------------------------
IPEX is not installed.
-----------------------------------------------------------------
CPU Information:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 48 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 16
On-line CPU(s) list: 0-15
Vendor ID: AuthenticAMD
Model name: AMD Ryzen 7 5700X3D 8-Core Processor
CPU family: 25
Model: 33
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 1
Stepping: 2
Frequency boost: enabled
CPU(s) scaling MHz: 61%
CPU max MHz: 4151.0000
-----------------------------------------------------------------
Total CPU Memory: 22.8791 GB
-----------------------------------------------------------------
Operating System:
Ubuntu 25.04 \n \l
-----------------------------------------------------------------
Linux davi 6.14.0-27-generic #27-Ubuntu SMP PREEMPT_DYNAMIC Tue Jul 22 17:01:58 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
-----------------------------------------------------------------
CLI:
Version: 1.2.41.20250513
Build ID: 00000000
Service:
Version: 1.2.41.20250513
Build ID: 00000000
Level Zero Version: 1.21.9
-----------------------------------------------------------------
Driver UUID 32303235-2e32-302e-362e-302e30345f32
Driver Version 2025.20.6.0.04_224945
Driver UUID 32352e32-322e-3333-3934-340000000000
Driver Version 25.22.33944
-----------------------------------------------------------------
Driver related package version:
-----------------------------------------------------------------
igpu not detected
-----------------------------------------------------------------
xpu-smi is properly installed.
-----------------------------------------------------------------
No device discovered
GPU0 Memory size=16G
-----------------------------------------------------------------
0b:00.0 VGA compatible controller: Intel Corporation Battlemage G21 [Arc B580] (prog-if 00 [VGA controller])
Subsystem: Intel Corporation Device 1100
Flags: bus master, fast devsel, latency 0, IRQ 74, IOMMU group 20
Memory at f5000000 (64-bit, non-prefetchable) [size=16M]
Memory at 7800000000 (64-bit, prefetchable) [size=16G]
Expansion ROM at f6000000 [disabled] [size=2M]
Capabilities: <access denied>
Kernel driver in use: xe
Kernel modules: xe
-----------------------------------------------------------------
Additional context
Ubuntu 25.04, kernel 6.14.0-27-generic, Arc B580, 24 GB RAM, intel-oneapi-base-toolkit-2025.0
Full log:
time=2025-08-02T17:02:53.138-03:00 level=INFO source=server.go:135 msg="system memory" total="22.9 GiB" free="15.6 GiB" free_swap="15.2 GiB"
time=2025-08-02T17:02:53.138-03:00 level=INFO source=server.go:187 msg=offload library=cpu layers.requested=-1 layers.model=49 layers.offload=0 layers.split="" memory.available="[15.6 GiB]" memory.gpu_overhead="0 B" memory.required.full="6.3 GiB" memory.required.partial="0 B" memory.required.kv="768.0 MiB" memory.required.allocations="[6.3 GiB]" memory.weights.total="5.1 GiB" memory.weights.repeating="4.5 GiB" memory.weights.nonrepeating="609.1 MiB" memory.graph.full="348.0 MiB" memory.graph.partial="916.1 MiB"
llama_model_loader: loaded meta data with 33 key-value pairs and 579 tensors from /root/.ollama/models/blobs/sha256-3d56bdc5fb9286615ef0f4ab59e1471fbe47d43d67fa7be46efb725cc9650add (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = qwen2
llama_model_loader: - kv 1: general.type str = model
llama_model_loader: - kv 2: general.name str = Qwen2.5 Coder 14B
llama_model_loader: - kv 3: general.basename str = Qwen2.5-Coder
llama_model_loader: - kv 4: general.size_label str = 14B
llama_model_loader: - kv 5: general.license str = apache-2.0
llama_model_loader: - kv 6: general.license.link str = https://huggingface.co/Qwen/Qwen2.5-C...
llama_model_loader: - kv 7: general.base_model.count u32 = 1
llama_model_loader: - kv 8: general.base_model.0.name str = Qwen2.5 14B
llama_model_loader: - kv 9: general.base_model.0.organization str = Qwen
llama_model_loader: - kv 10: general.base_model.0.repo_url str = https://huggingface.co/Qwen/Qwen2.5-14B
llama_model_loader: - kv 11: general.tags arr[str,5] = ["code", "qwen", "qwen-coder", "codeq...
llama_model_loader: - kv 12: general.languages arr[str,1] = ["en"]
llama_model_loader: - kv 13: qwen2.block_count u32 = 48
llama_model_loader: - kv 14: qwen2.context_length u32 = 32768
llama_model_loader: - kv 15: qwen2.embedding_length u32 = 5120
llama_model_loader: - kv 16: qwen2.feed_forward_length u32 = 13824
llama_model_loader: - kv 17: qwen2.attention.head_count u32 = 40
llama_model_loader: - kv 18: qwen2.attention.head_count_kv u32 = 8
llama_model_loader: - kv 19: qwen2.rope.freq_base f32 = 1000000.000000
llama_model_loader: - kv 20: qwen2.attention.layer_norm_rms_epsilon f32 = 0.000010
llama_model_loader: - kv 21: general.file_type u32 = 10
llama_model_loader: - kv 22: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 23: tokenizer.ggml.pre str = qwen2
llama_model_loader: - kv 24: tokenizer.ggml.tokens arr[str,152064] = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv 25: tokenizer.ggml.token_type arr[i32,152064] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 26: tokenizer.ggml.merges arr[str,151387] = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
llama_model_loader: - kv 27: tokenizer.ggml.eos_token_id u32 = 151645
llama_model_loader: - kv 28: tokenizer.ggml.padding_token_id u32 = 151643
llama_model_loader: - kv 29: tokenizer.ggml.bos_token_id u32 = 151643
llama_model_loader: - kv 30: tokenizer.ggml.add_bos_token bool = false
llama_model_loader: - kv 31: tokenizer.chat_template str = {%- if tools %}\n {{- '<|im_start|>...
llama_model_loader: - kv 32: general.quantization_version u32 = 2
llama_model_loader: - type f32: 241 tensors
llama_model_loader: - type q2_K: 193 tensors
llama_model_loader: - type q3_K: 96 tensors
llama_model_loader: - type q4_K: 48 tensors
llama_model_loader: - type q6_K: 1 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type = Q2_K - Medium
print_info: file size = 5.37 GiB (3.12 BPW)
load: special tokens cache size = 22
load: token to piece cache size = 0.9310 MB
print_info: arch = qwen2
print_info: vocab_only = 1
print_info: model type = ?B
print_info: model params = 14.77 B
print_info: general.name = Qwen2.5 Coder 14B
print_info: vocab type = BPE
print_info: n_vocab = 152064
print_info: n_merges = 151387
print_info: BOS token = 151643 '<|endoftext|>'
print_info: EOS token = 151645 '<|im_end|>'
print_info: EOT token = 151645 '<|im_end|>'
print_info: PAD token = 151643 '<|endoftext|>'
print_info: LF token = 198 'Ċ'
print_info: FIM PRE token = 151659 '<|fim_prefix|>'
print_info: FIM SUF token = 151661 '<|fim_suffix|>'
print_info: FIM MID token = 151660 '<|fim_middle|>'
print_info: FIM PAD token = 151662 '<|fim_pad|>'
print_info: FIM REP token = 151663 '<|repo_name|>'
print_info: FIM SEP token = 151664 '<|file_sep|>'
print_info: EOG token = 151643 '<|endoftext|>'
print_info: EOG token = 151645 '<|im_end|>'
print_info: EOG token = 151662 '<|fim_pad|>'
print_info: EOG token = 151663 '<|repo_name|>'
print_info: EOG token = 151664 '<|file_sep|>'
print_info: max token length = 256
llama_model_load: vocab only - skipping tensors
time=2025-08-02T17:02:53.318-03:00 level=INFO source=server.go:458 msg="starting llama server" cmd="/home/davi/anaconda3/envs/ipex-llm/lib/python3.11/site-packages/bigdl/cpp/libs/ollama/ollama-lib runner --model /root/.ollama/models/blobs/sha256-3d56bdc5fb9286615ef0f4ab59e1471fbe47d43d67fa7be46efb725cc9650add --ctx-size 4096 --batch-size 512 --n-gpu-layers 999 --threads 8 --no-mmap --parallel 2 --port 42995"
time=2025-08-02T17:02:53.318-03:00 level=INFO source=sched.go:483 msg="loaded runners" count=1
time=2025-08-02T17:02:53.318-03:00 level=INFO source=server.go:618 msg="waiting for llama runner to start responding"
time=2025-08-02T17:02:53.319-03:00 level=INFO source=server.go:652 msg="waiting for server to become available" status="llm server not responding"
using override patterns: []
time=2025-08-02T17:02:53.364-03:00 level=INFO source=runner.go:851 msg="starting go runner"
load_backend: loaded SYCL backend from /home/davi/anaconda3/envs/ipex-llm/lib/python3.11/site-packages/bigdl/cpp/libs/ollama/libggml-sycl.so
load_backend: loaded CPU backend from /home/davi/anaconda3/envs/ipex-llm/lib/python3.11/site-packages/bigdl/cpp/libs/ollama/libggml-cpu-haswell.so
time=2025-08-02T17:02:53.425-03:00 level=INFO source=ggml.go:104 msg=system CPU.0.LLAMAFILE=1 CPU.0.OPENMP=1 CPU.0.AARCH64_REPACK=1 CPU.1.LLAMAFILE=1 compiler=cgo(gcc)
time=2025-08-02T17:02:53.425-03:00 level=INFO source=runner.go:911 msg="Server listening on 127.0.0.1:42995"
llama_model_load_from_file_impl: using device SYCL0 (Intel(R) Arc(TM) B580 Graphics) - 11345 MiB free
llama_model_loader: loaded meta data with 33 key-value pairs and 579 tensors from /root/.ollama/models/blobs/sha256-3d56bdc5fb9286615ef0f4ab59e1471fbe47d43d67fa7be46efb725cc9650add (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = qwen2
llama_model_loader: - kv 1: general.type str = model
llama_model_loader: - kv 2: general.name str = Qwen2.5 Coder 14B
llama_model_loader: - kv 3: general.basename str = Qwen2.5-Coder
llama_model_loader: - kv 4: general.size_label str = 14B
llama_model_loader: - kv 5: general.license str = apache-2.0
llama_model_loader: - kv 6: general.license.link str = https://huggingface.co/Qwen/Qwen2.5-C...
llama_model_loader: - kv 7: general.base_model.count u32 = 1
llama_model_loader: - kv 8: general.base_model.0.name str = Qwen2.5 14B
llama_model_loader: - kv 9: general.base_model.0.organization str = Qwen
llama_model_loader: - kv 10: general.base_model.0.repo_url str = https://huggingface.co/Qwen/Qwen2.5-14B
llama_model_loader: - kv 11: general.tags arr[str,5] = ["code", "qwen", "qwen-coder", "codeq...
llama_model_loader: - kv 12: general.languages arr[str,1] = ["en"]
llama_model_loader: - kv 13: qwen2.block_count u32 = 48
llama_model_loader: - kv 14: qwen2.context_length u32 = 32768
llama_model_loader: - kv 15: qwen2.embedding_length u32 = 5120
llama_model_loader: - kv 16: qwen2.feed_forward_length u32 = 13824
llama_model_loader: - kv 17: qwen2.attention.head_count u32 = 40
llama_model_loader: - kv 18: qwen2.attention.head_count_kv u32 = 8
llama_model_loader: - kv 19: qwen2.rope.freq_base f32 = 1000000.000000
llama_model_loader: - kv 20: qwen2.attention.layer_norm_rms_epsilon f32 = 0.000010
llama_model_loader: - kv 21: general.file_type u32 = 10
llama_model_loader: - kv 22: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 23: tokenizer.ggml.pre str = qwen2
llama_model_loader: - kv 24: tokenizer.ggml.tokens arr[str,152064] = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv 25: tokenizer.ggml.token_type arr[i32,152064] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 26: tokenizer.ggml.merges arr[str,151387] = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
llama_model_loader: - kv 27: tokenizer.ggml.eos_token_id u32 = 151645
llama_model_loader: - kv 28: tokenizer.ggml.padding_token_id u32 = 151643
llama_model_loader: - kv 29: tokenizer.ggml.bos_token_id u32 = 151643
llama_model_loader: - kv 30: tokenizer.ggml.add_bos_token bool = false
llama_model_loader: - kv 31: tokenizer.chat_template str = {%- if tools %}\n {{- '<|im_start|>...
llama_model_loader: - kv 32: general.quantization_version u32 = 2
llama_model_loader: - type f32: 241 tensors
llama_model_loader: - type q2_K: 193 tensors
llama_model_loader: - type q3_K: 96 tensors
llama_model_loader: - type q4_K: 48 tensors
llama_model_loader: - type q6_K: 1 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type = Q2_K - Medium
print_info: file size = 5.37 GiB (3.12 BPW)
time=2025-08-02T17:02:53.570-03:00 level=INFO source=server.go:652 msg="waiting for server to become available" status="llm server loading model"
load: special tokens cache size = 22
load: token to piece cache size = 0.9310 MB
print_info: arch = qwen2
print_info: vocab_only = 0
print_info: n_ctx_train = 32768
print_info: n_embd = 5120
print_info: n_layer = 48
print_info: n_head = 40
print_info: n_head_kv = 8
print_info: n_rot = 128
print_info: n_swa = 0
print_info: n_swa_pattern = 1
print_info: n_embd_head_k = 128
print_info: n_embd_head_v = 128
print_info: n_gqa = 5
print_info: n_embd_k_gqa = 1024
print_info: n_embd_v_gqa = 1024
print_info: f_norm_eps = 0.0e+00
print_info: f_norm_rms_eps = 1.0e-05
print_info: f_clamp_kqv = 0.0e+00
print_info: f_max_alibi_bias = 0.0e+00
print_info: f_logit_scale = 0.0e+00
print_info: f_attn_scale = 0.0e+00
print_info: n_ff = 13824
print_info: n_expert = 0
print_info: n_expert_used = 0
print_info: causal attn = 1
print_info: pooling type = -1
print_info: rope type = 2
print_info: rope scaling = linear
print_info: freq_base_train = 1000000.0
print_info: freq_scale_train = 1
print_info: n_ctx_orig_yarn = 32768
print_info: rope_finetuned = unknown
print_info: ssm_d_conv = 0
print_info: ssm_d_inner = 0
print_info: ssm_d_state = 0
print_info: ssm_dt_rank = 0
print_info: ssm_dt_b_c_rms = 0
print_info: model type = 14B
print_info: model params = 14.77 B
print_info: general.name = Qwen2.5 Coder 14B
print_info: vocab type = BPE
print_info: n_vocab = 152064
print_info: n_merges = 151387
print_info: BOS token = 151643 '<|endoftext|>'
print_info: EOS token = 151645 '<|im_end|>'
print_info: EOT token = 151645 '<|im_end|>'
print_info: PAD token = 151643 '<|endoftext|>'
print_info: LF token = 198 'Ċ'
print_info: FIM PRE token = 151659 '<|fim_prefix|>'
print_info: FIM SUF token = 151661 '<|fim_suffix|>'
print_info: FIM MID token = 151660 '<|fim_middle|>'
print_info: FIM PAD token = 151662 '<|fim_pad|>'
print_info: FIM REP token = 151663 '<|repo_name|>'
print_info: FIM SEP token = 151664 '<|file_sep|>'
print_info: EOG token = 151643 '<|endoftext|>'
print_info: EOG token = 151645 '<|im_end|>'
print_info: EOG token = 151662 '<|fim_pad|>'
print_info: EOG token = 151663 '<|repo_name|>'
print_info: EOG token = 151664 '<|file_sep|>'
print_info: max token length = 256
load_tensors: loading model tensors, this can take a while... (mmap = false)
load_tensors: offloading 48 repeating layers to GPU
load_tensors: offloading output layer to GPU
load_tensors: offloaded 49/49 layers to GPU
load_tensors: CPU model buffer size = 243.63 MiB
load_tensors: SYCL0 model buffer size = 5288.54 MiB
ggml-backend.cpp:265: GGML_ASSERT(offset + size <= ggml_nbytes(tensor) && "tensor write out of bounds") failed
[New LWP 22089]
[New LWP 22088]
[New LWP 22080]
[New LWP 22079]
[New LWP 22078]
[New LWP 22077]
[New LWP 22076]
[New LWP 22075]
[New LWP 22074]
[New LWP 22073]
[New LWP 22072]
[New LWP 22071]
warning: Missing auto-load script at offset 0 in section .debug_gdb_scripts
of file /home/davi/anaconda3/envs/ipex-llm/lib/python3.11/site-packages/bigdl/cpp/libs/ollama/ollama-lib.
Use `info auto-load python-scripts [REGEXP]' to list them.
Function(s) ^std::(move|forward|as_const|(__)?addressof) will be skipped when stepping.
Function(s) ^std::(shared|unique)_ptr<.*>::(get|operator) will be skipped when stepping.
Function(s) ^std::(basic_string|vector|array|deque|(forward_)?list|(unordered_|flat_)?(multi)?(map|set)|span)<.*>::(c?r?(begin|end)|front|back|data|size|empty) will be skipped when stepping.
Function(s) ^std::(basic_string|vector|array|deque|span)<.*>::operator.] will be skipped when stepping.
time=2025-08-02T17:03:02.039-03:00 level=INFO source=server.go:652 msg="waiting for server to become available" status="llm server not responding"
warning: File "/opt/intel/oneapi/compiler/2025.2/lib/libsycl.so.8.0.0-gdb.py" auto-loading has been declined by your `auto-load safe-path' set to "$debugdir:$datadir/auto-load".
To enable execution of this file add
add-auto-load-safe-path /opt/intel/oneapi/compiler/2025.2/lib/libsycl.so.8.0.0-gdb.py
line to your configuration file "/root/.config/gdb/gdbinit".
To completely disable this security protection add
set auto-load safe-path /
line to your configuration file "/root/.config/gdb/gdbinit".
For more information about this security protection see the
"Auto-loading safe path" section in the GDB manual. E.g., run from the shell:
info "(gdb)Auto-loading safe path"
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
runtime.futex () at /root/go/pkg/mod/golang.org/[email protected]/src/runtime/sys_linux_amd64.s:558
warning: 558 /root/go/pkg/mod/golang.org/[email protected]/src/runtime/sys_linux_amd64.s: No such file or directory
#0 runtime.futex () at /root/go/pkg/mod/golang.org/[email protected]/src/runtime/sys_linux_amd64.s:558
558 in /root/go/pkg/mod/golang.org/[email protected]/src/runtime/sys_linux_amd64.s
#1 0x0000000000460670 in runtime.futexsleep (addr=0xfffffffffffffe00, val=0, ns=4866275) at /root/go/pkg/mod/golang.org/[email protected]/src/runtime/os_linux.go:75
warning: 75 /root/go/pkg/mod/golang.org/[email protected]/src/runtime/os_linux.go: No such file or directory
#2 0x000000000043c707 in runtime.notesleep (n=0x20e58a0 <runtime.m0+320>) at /root/go/pkg/mod/golang.org/[email protected]/src/runtime/lock_futex.go:47
warning: 47 /root/go/pkg/mod/golang.org/[email protected]/src/runtime/lock_futex.go: No such file or directory
#3 0x000000000046be2c in runtime.mPark () at /root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:1887
warning: 1887 /root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go: No such file or directory
#4 runtime.stopm () at /root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:2907
2907 in /root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go
#5 0x000000000046d8fc in runtime.findRunnable (gp=<optimized out>, inheritTime=<optimized out>, tryWakeP=<optimized out>) at /root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:3644
3644 in /root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go
#6 0x000000000046e9f1 in runtime.schedule () at /root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:4017
4017 in /root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go
#7 0x000000000046eea5 in runtime.park_m (gp=0xc000103dc0) at /root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:4141
4141 in /root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go
#8 0x00000000004a02ae in runtime.mcall () at /root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:459
warning: 459 /root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s: No such file or directory
#9 0x00007ffe75607f68 in ?? ()
#10 0x00000000004a4cbf in runtime.newproc (fn=0x4a01af <runtime.rt0_go+303>) at <autogenerated>:1
warning: 1 <autogenerated>: No such file or directory
#11 0x00000000004a0225 in runtime.mstart () at /root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:395
warning: 395 /root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s: No such file or directory
#12 0x00000000004a01af in runtime.rt0_go () at /root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:358
358 in /root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s
#13 0x0000000000000011 in ?? ()
#14 0x00007ffe756080c8 in ?? ()
#15 0x0000000000000009 in ?? ()
#16 0x0000000000000011 in ?? ()
#17 0x00007ffe756080c8 in ?? ()
#18 0x00007804e6e2a578 in __libc_start_call_main (main=0x0, argc=0, argv=0x0) at ../sysdeps/nptl/libc_start_call_main.h:58
warning: 58 ../sysdeps/nptl/libc_start_call_main.h: No such file or directory
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
[Inferior 1 (process 22070) detached]
SIGABRT: abort
PC=0x7804e6ea49bc m=9 sigcode=18446744073709551610
signal arrived during cgo execution
goroutine 50 gp=0xc000102fc0 m=9 mp=0xc000580008 [syscall]:
runtime.cgocall(0x1159a80, 0xc0003b7890)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/cgocall.go:167 +0x4b fp=0xc0003b7868 sp=0xc0003b7830 pc=0x49780b
github.com/ollama/ollama/llama._Cfunc_llama_model_load_from_file(0x780460000b70, {0x0, 0x0, 0x3e7, 0x1, 0x0, 0x0, 0x1159230, 0xc000591028, 0x0, ...})
_cgo_gotypes.go:876 +0x47 fp=0xc0003b7890 sp=0xc0003b7868 pc=0x846047
github.com/ollama/ollama/llama.LoadModelFromFile.func4(...)
/home/runner/_work/llm.cpp/llm.cpp/ollama-internal/llama/llama.go:296
github.com/ollama/ollama/llama.LoadModelFromFile({0x7ffe7560909c, 0x62}, {0x3e7, 0x0, 0x0, {0x0, 0x0, 0x0}, 0xc0005961d0, 0x0, ...})
/home/runner/_work/llm.cpp/llm.cpp/ollama-internal/llama/llama.go:296 +0x4d7 fp=0xc0003b7d80 sp=0xc0003b7890 pc=0x848277
github.com/ollama/ollama/runner/llamarunner.(*Server).loadModel(0xc000326000, {0x3e7, 0x0, 0x0, {0x0, 0x0, 0x0}, 0xc0005961d0, 0x0, {0x0, ...}}, ...)
/home/runner/_work/llm.cpp/llm.cpp/ollama-internal/runner/llamarunner/runner.go:749 +0x9e fp=0xc0003b7ee8 sp=0xc0003b7d80 pc=0x90503e
github.com/ollama/ollama/runner/llamarunner.Execute.gowrap1()
/home/runner/_work/llm.cpp/llm.cpp/ollama-internal/runner/llamarunner/runner.go:885 +0x115 fp=0xc0003b7fe0 sp=0xc0003b7ee8 pc=0x906b15
runtime.goexit({})
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0003b7fe8 sp=0xc0003b7fe0 pc=0x4a22e1
created by github.com/ollama/ollama/runner/llamarunner.Execute in goroutine 1
/home/runner/_work/llm.cpp/llm.cpp/ollama-internal/runner/llamarunner/runner.go:885 +0xd2a
goroutine 1 gp=0xc000002380 m=nil [IO wait]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc00005d5b8 sp=0xc00005d598 pc=0x49ac8e
runtime.netpollblock(0xc00005d608?, 0x433da6?, 0x0?)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/netpoll.go:575 +0xf7 fp=0xc00005d5f0 sp=0xc00005d5b8 pc=0x45f957
internal/poll.runtime_pollWait(0x7804e7cc8de0, 0x72)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/netpoll.go:351 +0x85 fp=0xc00005d610 sp=0xc00005d5f0 pc=0x499ea5
internal/poll.(*pollDesc).wait(0xc0005be180?, 0x900000036?, 0x0)
/root/go/pkg/mod/golang.org/[email protected]/src/internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc00005d638 sp=0xc00005d610 pc=0x5211c7
internal/poll.(*pollDesc).waitRead(...)
/root/go/pkg/mod/golang.org/[email protected]/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Accept(0xc0005be180)
/root/go/pkg/mod/golang.org/[email protected]/src/internal/poll/fd_unix.go:620 +0x295 fp=0xc00005d6e0 sp=0xc00005d638 pc=0x526595
net.(*netFD).accept(0xc0005be180)
/root/go/pkg/mod/golang.org/[email protected]/src/net/fd_unix.go:172 +0x29 fp=0xc00005d798 sp=0xc00005d6e0 pc=0x598aa9
net.(*TCPListener).accept(0xc00059a180)
/root/go/pkg/mod/golang.org/[email protected]/src/net/tcpsock_posix.go:159 +0x1b fp=0xc00005d7e8 sp=0xc00005d798 pc=0x5ae41b
net.(*TCPListener).Accept(0xc00059a180)
/root/go/pkg/mod/golang.org/[email protected]/src/net/tcpsock.go:380 +0x30 fp=0xc00005d818 sp=0xc00005d7e8 pc=0x5ad2d0
net/http.(*onceCloseListener).Accept(0xc00017acf0?)
<autogenerated>:1 +0x24 fp=0xc00005d830 sp=0xc00005d818 pc=0x7c4964
net/http.(*Server).Serve(0xc0001f6600, {0x15f1be8, 0xc00059a180})
/root/go/pkg/mod/golang.org/[email protected]/src/net/http/server.go:3424 +0x30c fp=0xc00005d960 sp=0xc00005d830 pc=0x79c22c
github.com/ollama/ollama/runner/llamarunner.Execute({0xc000134020, 0xf, 0x10})
/home/runner/_work/llm.cpp/llm.cpp/ollama-internal/runner/llamarunner/runner.go:912 +0x11e9 fp=0xc00005dd08 sp=0xc00005d960 pc=0x906709
github.com/ollama/ollama/runner.Execute({0xc000134010?, 0x0?, 0x0?})
/home/runner/_work/llm.cpp/llm.cpp/ollama-internal/runner/runner.go:22 +0xd4 fp=0xc00005dd30 sp=0xc00005dd08 pc=0x98b474
github.com/ollama/ollama/cmd.NewCLI.func2(0xc0000d0f00?, {0x141a6a2?, 0x4?, 0x141a6a6?})
/home/runner/_work/llm.cpp/llm.cpp/ollama-internal/cmd/cmd.go:1529 +0x45 fp=0xc00005dd58 sp=0xc00005dd30 pc=0x10e7c05
github.com/spf13/cobra.(*Command).execute(0xc000230f08, {0xc0000d4ff0, 0xf, 0xf})
/root/go/pkg/mod/github.com/spf13/[email protected]/command.go:940 +0x85c fp=0xc00005de78 sp=0xc00005dd58 pc=0x6120bc
github.com/spf13/cobra.(*Command).ExecuteC(0xc0000fe908)
/root/go/pkg/mod/github.com/spf13/[email protected]/command.go:1068 +0x3a5 fp=0xc00005df30 sp=0xc00005de78 pc=0x612905
github.com/spf13/cobra.(*Command).Execute(...)
/root/go/pkg/mod/github.com/spf13/[email protected]/command.go:992
github.com/spf13/cobra.(*Command).ExecuteContext(...)
/root/go/pkg/mod/github.com/spf13/[email protected]/command.go:985
main.main()
/home/runner/_work/llm.cpp/llm.cpp/ollama-internal/main.go:12 +0x4d fp=0xc00005df50 sp=0xc00005df30 pc=0x10e868d
runtime.main()
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:283 +0x28b fp=0xc00005dfe0 sp=0xc00005df50 pc=0x466f6b
runtime.goexit({})
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00005dfe8 sp=0xc00005dfe0 pc=0x4a22e1
goroutine 2 gp=0xc000002e00 m=nil [force gc (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc000094fa8 sp=0xc000094f88 pc=0x49ac8e
runtime.goparkunlock(...)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:441
runtime.forcegchelper()
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:348 +0xb3 fp=0xc000094fe0 sp=0xc000094fa8 pc=0x4672b3
runtime.goexit({})
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000094fe8 sp=0xc000094fe0 pc=0x4a22e1
created by runtime.init.7 in goroutine 1
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:336 +0x1a
goroutine 3 gp=0xc000003340 m=nil [GC sweep wait]:
runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc000095780 sp=0xc000095760 pc=0x49ac8e
runtime.goparkunlock(...)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:441
runtime.bgsweep(0xc0000c0000)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgcsweep.go:316 +0xdf fp=0xc0000957c8 sp=0xc000095780 pc=0x451adf
runtime.gcenable.gowrap1()
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:204 +0x25 fp=0xc0000957e0 sp=0xc0000957c8 pc=0x445f45
runtime.goexit({})
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0000957e8 sp=0xc0000957e0 pc=0x4a22e1
created by runtime.gcenable in goroutine 1
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:204 +0x66
goroutine 4 gp=0xc000003500 m=nil [GC scavenge wait]:
runtime.gopark(0x10000?, 0x15df218?, 0x0?, 0x0?, 0x0?)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc000095f78 sp=0xc000095f58 pc=0x49ac8e
runtime.goparkunlock(...)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:441
runtime.(*scavengerState).park(0x20e2940)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgcscavenge.go:425 +0x49 fp=0xc000095fa8 sp=0xc000095f78 pc=0x44f529
runtime.bgscavenge(0xc0000c0000)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgcscavenge.go:658 +0x59 fp=0xc000095fc8 sp=0xc000095fa8 pc=0x44fab9
runtime.gcenable.gowrap2()
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:205 +0x25 fp=0xc000095fe0 sp=0xc000095fc8 pc=0x445ee5
runtime.goexit({})
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000095fe8 sp=0xc000095fe0 pc=0x4a22e1
created by runtime.gcenable in goroutine 1
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:205 +0xa5
goroutine 18 gp=0xc000102700 m=nil [finalizer wait]:
runtime.gopark(0x1b8?, 0xc000002380?, 0x1?, 0x23?, 0xc000094688?)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc000094630 sp=0xc000094610 pc=0x49ac8e
runtime.runfinq()
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mfinal.go:196 +0x107 fp=0xc0000947e0 sp=0xc000094630 pc=0x444f07
runtime.goexit({})
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0000947e8 sp=0xc0000947e0 pc=0x4a22e1
created by runtime.createfing in goroutine 1
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mfinal.go:166 +0x3d
goroutine 19 gp=0xc000103180 m=nil [chan receive]:
runtime.gopark(0xc00022d4a0?, 0xc000490048?, 0x60?, 0x7?, 0x57f7e8?)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc000090718 sp=0xc0000906f8 pc=0x49ac8e
runtime.chanrecv(0xc000110380, 0x0, 0x1)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/chan.go:664 +0x445 fp=0xc000090790 sp=0xc000090718 pc=0x436925
runtime.chanrecv1(0x0?, 0x0?)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/chan.go:506 +0x12 fp=0xc0000907b8 sp=0xc000090790 pc=0x4364b2
runtime.unique_runtime_registerUniqueMapCleanup.func2(...)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1796
runtime.unique_runtime_registerUniqueMapCleanup.gowrap1()
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1799 +0x2f fp=0xc0000907e0 sp=0xc0000907b8 pc=0x44908f
runtime.goexit({})
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0000907e8 sp=0xc0000907e0 pc=0x4a22e1
created by unique.runtime_registerUniqueMapCleanup in goroutine 1
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1794 +0x79
goroutine 20 gp=0xc000103500 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc000090f38 sp=0xc000090f18 pc=0x49ac8e
runtime.gcBgMarkWorker(0xc0001115e0)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1423 +0xe9 fp=0xc000090fc8 sp=0xc000090f38 pc=0x4483a9
runtime.gcBgMarkStartWorkers.gowrap1()
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x25 fp=0xc000090fe0 sp=0xc000090fc8 pc=0x448285
runtime.goexit({})
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000090fe8 sp=0xc000090fe0 pc=0x4a22e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x105
goroutine 34 gp=0xc000484000 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc00048a738 sp=0xc00048a718 pc=0x49ac8e
runtime.gcBgMarkWorker(0xc0001115e0)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1423 +0xe9 fp=0xc00048a7c8 sp=0xc00048a738 pc=0x4483a9
runtime.gcBgMarkStartWorkers.gowrap1()
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x25 fp=0xc00048a7e0 sp=0xc00048a7c8 pc=0x448285
runtime.goexit({})
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00048a7e8 sp=0xc00048a7e0 pc=0x4a22e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x105
goroutine 5 gp=0xc000003a40 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc000096738 sp=0xc000096718 pc=0x49ac8e
runtime.gcBgMarkWorker(0xc0001115e0)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1423 +0xe9 fp=0xc0000967c8 sp=0xc000096738 pc=0x4483a9
runtime.gcBgMarkStartWorkers.gowrap1()
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x25 fp=0xc0000967e0 sp=0xc0000967c8 pc=0x448285
runtime.goexit({})
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0000967e8 sp=0xc0000967e0 pc=0x4a22e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x105
goroutine 6 gp=0xc000003c00 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc000096f38 sp=0xc000096f18 pc=0x49ac8e
runtime.gcBgMarkWorker(0xc0001115e0)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1423 +0xe9 fp=0xc000096fc8 sp=0xc000096f38 pc=0x4483a9
runtime.gcBgMarkStartWorkers.gowrap1()
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x25 fp=0xc000096fe0 sp=0xc000096fc8 pc=0x448285
runtime.goexit({})
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000096fe8 sp=0xc000096fe0 pc=0x4a22e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x105
goroutine 21 gp=0xc0001036c0 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc000091738 sp=0xc000091718 pc=0x49ac8e
runtime.gcBgMarkWorker(0xc0001115e0)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1423 +0xe9 fp=0xc0000917c8 sp=0xc000091738 pc=0x4483a9
runtime.gcBgMarkStartWorkers.gowrap1()
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x25 fp=0xc0000917e0 sp=0xc0000917c8 pc=0x448285
runtime.goexit({})
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0000917e8 sp=0xc0000917e0 pc=0x4a22e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x105
goroutine 35 gp=0xc0004841c0 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc00048af38 sp=0xc00048af18 pc=0x49ac8e
runtime.gcBgMarkWorker(0xc0001115e0)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1423 +0xe9 fp=0xc00048afc8 sp=0xc00048af38 pc=0x4483a9
runtime.gcBgMarkStartWorkers.gowrap1()
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x25 fp=0xc00048afe0 sp=0xc00048afc8 pc=0x448285
runtime.goexit({})
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00048afe8 sp=0xc00048afe0 pc=0x4a22e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x105
goroutine 7 gp=0xc000003dc0 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc000097738 sp=0xc000097718 pc=0x49ac8e
runtime.gcBgMarkWorker(0xc0001115e0)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1423 +0xe9 fp=0xc0000977c8 sp=0xc000097738 pc=0x4483a9
runtime.gcBgMarkStartWorkers.gowrap1()
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x25 fp=0xc0000977e0 sp=0xc0000977c8 pc=0x448285
runtime.goexit({})
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0000977e8 sp=0xc0000977e0 pc=0x4a22e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x105
goroutine 22 gp=0xc000103880 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc000091f38 sp=0xc000091f18 pc=0x49ac8e
runtime.gcBgMarkWorker(0xc0001115e0)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1423 +0xe9 fp=0xc000091fc8 sp=0xc000091f38 pc=0x4483a9
runtime.gcBgMarkStartWorkers.gowrap1()
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x25 fp=0xc000091fe0 sp=0xc000091fc8 pc=0x448285
runtime.goexit({})
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000091fe8 sp=0xc000091fe0 pc=0x4a22e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x105
goroutine 36 gp=0xc000484380 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc00048b738 sp=0xc00048b718 pc=0x49ac8e
runtime.gcBgMarkWorker(0xc0001115e0)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1423 +0xe9 fp=0xc00048b7c8 sp=0xc00048b738 pc=0x4483a9
runtime.gcBgMarkStartWorkers.gowrap1()
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x25 fp=0xc00048b7e0 sp=0xc00048b7c8 pc=0x448285
runtime.goexit({})
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00048b7e8 sp=0xc00048b7e0 pc=0x4a22e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x105
goroutine 8 gp=0xc0000ce000 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc000097f38 sp=0xc000097f18 pc=0x49ac8e
runtime.gcBgMarkWorker(0xc0001115e0)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1423 +0xe9 fp=0xc000097fc8 sp=0xc000097f38 pc=0x4483a9
runtime.gcBgMarkStartWorkers.gowrap1()
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x25 fp=0xc000097fe0 sp=0xc000097fc8 pc=0x448285
runtime.goexit({})
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000097fe8 sp=0xc000097fe0 pc=0x4a22e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x105
goroutine 9 gp=0xc0000ce1c0 m=nil [GC worker (idle)]:
runtime.gopark(0x2190920?, 0x1?, 0xc9?, 0xa8?, 0x0?)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc000486738 sp=0xc000486718 pc=0x49ac8e
runtime.gcBgMarkWorker(0xc0001115e0)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1423 +0xe9 fp=0xc0004867c8 sp=0xc000486738 pc=0x4483a9
runtime.gcBgMarkStartWorkers.gowrap1()
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x25 fp=0xc0004867e0 sp=0xc0004867c8 pc=0x448285
runtime.goexit({})
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0004867e8 sp=0xc0004867e0 pc=0x4a22e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x105
goroutine 10 gp=0xc0000ce380 m=nil [GC worker (idle)]:
runtime.gopark(0xb1b4666682e?, 0x1?, 0xf?, 0x26?, 0x0?)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc000486f38 sp=0xc000486f18 pc=0x49ac8e
runtime.gcBgMarkWorker(0xc0001115e0)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1423 +0xe9 fp=0xc000486fc8 sp=0xc000486f38 pc=0x4483a9
runtime.gcBgMarkStartWorkers.gowrap1()
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x25 fp=0xc000486fe0 sp=0xc000486fc8 pc=0x448285
runtime.goexit({})
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000486fe8 sp=0xc000486fe0 pc=0x4a22e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x105
goroutine 11 gp=0xc0000ce540 m=nil [GC worker (idle)]:
runtime.gopark(0xb1b46671783?, 0x3?, 0xcb?, 0x6e?, 0x0?)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc000487738 sp=0xc000487718 pc=0x49ac8e
runtime.gcBgMarkWorker(0xc0001115e0)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1423 +0xe9 fp=0xc0004877c8 sp=0xc000487738 pc=0x4483a9
runtime.gcBgMarkStartWorkers.gowrap1()
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x25 fp=0xc0004877e0 sp=0xc0004877c8 pc=0x448285
runtime.goexit({})
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0004877e8 sp=0xc0004877e0 pc=0x4a22e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x105
goroutine 12 gp=0xc0000ce700 m=nil [GC worker (idle)]:
runtime.gopark(0x2190920?, 0x1?, 0x19?, 0x8c?, 0x0?)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc000487f38 sp=0xc000487f18 pc=0x49ac8e
runtime.gcBgMarkWorker(0xc0001115e0)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1423 +0xe9 fp=0xc000487fc8 sp=0xc000487f38 pc=0x4483a9
runtime.gcBgMarkStartWorkers.gowrap1()
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x25 fp=0xc000487fe0 sp=0xc000487fc8 pc=0x448285
runtime.goexit({})
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000487fe8 sp=0xc000487fe0 pc=0x4a22e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x105
goroutine 13 gp=0xc0000ce8c0 m=nil [GC worker (idle)]:
runtime.gopark(0x2190920?, 0x1?, 0x52?, 0xfd?, 0x0?)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc000488738 sp=0xc000488718 pc=0x49ac8e
runtime.gcBgMarkWorker(0xc0001115e0)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1423 +0xe9 fp=0xc0004887c8 sp=0xc000488738 pc=0x4483a9
runtime.gcBgMarkStartWorkers.gowrap1()
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x25 fp=0xc0004887e0 sp=0xc0004887c8 pc=0x448285
runtime.goexit({})
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0004887e8 sp=0xc0004887e0 pc=0x4a22e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x105
goroutine 37 gp=0xc000484540 m=nil [GC worker (idle)]:
runtime.gopark(0xb1b46672124?, 0x1?, 0x57?, 0xfc?, 0x0?)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc00048bf38 sp=0xc00048bf18 pc=0x49ac8e
runtime.gcBgMarkWorker(0xc0001115e0)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1423 +0xe9 fp=0xc00048bfc8 sp=0xc00048bf38 pc=0x4483a9
runtime.gcBgMarkStartWorkers.gowrap1()
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x25 fp=0xc00048bfe0 sp=0xc00048bfc8 pc=0x448285
runtime.goexit({})
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00048bfe8 sp=0xc00048bfe0 pc=0x4a22e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:1339 +0x105
goroutine 51 gp=0xc000103340 m=nil [sync.WaitGroup.Wait]:
runtime.gopark(0x0?, 0x0?, 0x20?, 0x3f?, 0x0?)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:435 +0xce fp=0xc00048ce20 sp=0xc00048ce00 pc=0x49ac8e
runtime.goparkunlock(...)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:441
runtime.semacquire1(0xc000326008, 0x0, 0x1, 0x0, 0x18)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/sema.go:188 +0x21d fp=0xc00048ce88 sp=0xc00048ce20 pc=0x47a47d
sync.runtime_SemacquireWaitGroup(0x0?)
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/sema.go:110 +0x25 fp=0xc00048cec0 sp=0xc00048ce88 pc=0x49c685
sync.(*WaitGroup).Wait(0x0?)
/root/go/pkg/mod/golang.org/[email protected]/src/sync/waitgroup.go:118 +0x48 fp=0xc00048cee8 sp=0xc00048cec0 pc=0x4adc28
github.com/ollama/ollama/runner/llamarunner.(*Server).run(0xc000326000, {0x15f4210, 0xc00017ceb0})
/home/runner/_work/llm.cpp/llm.cpp/ollama-internal/runner/llamarunner/runner.go:314 +0x47 fp=0xc00048cfb8 sp=0xc00048cee8 pc=0x901d67
github.com/ollama/ollama/runner/llamarunner.Execute.gowrap2()
/home/runner/_work/llm.cpp/llm.cpp/ollama-internal/runner/llamarunner/runner.go:892 +0x28 fp=0xc00048cfe0 sp=0xc00048cfb8 pc=0x9069c8
runtime.goexit({})
/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00048cfe8 sp=0xc00048cfe0 pc=0x4a22e1
created by github.com/ollama/ollama/runner/llamarunner.Execute in goroutine 1
/home/runner/_work/llm.cpp/llm.cpp/ollama-internal/runner/llamarunner/runner.go:892 +0xe05
rax 0x0
rbx 0x563e
rcx 0x7804e6ea49bc
rdx 0x6
rdi 0x5636
rsi 0x563e
rbp 0x78047a7fa6e0
rsp 0x78047a7fa6a0
r8 0x0
r9 0x0
r10 0xf11ed7d
r11 0x246
r12 0x6
r13 0x109
r14 0x16
r15 0xaf0000
rip 0x7804e6ea49bc
rflags 0x246
cs 0x33
fs 0x0
gs 0x0
time=2025-08-02T17:03:03.596-03:00 level=ERROR source=server.go:484 msg="llama runner terminated" error="exit status 2"
time=2025-08-02T17:03:03.643-03:00 level=ERROR source=sched.go:489 msg="error loading llama server" error="llama runner process has terminated: GGML_ASSERT(offset + size <= ggml_nbytes(tensor) && \"tensor write out of bounds\") failed"
[GIN] 2025/08/02 - 17:03:03 | 500 | 10.568882015s | 127.0.0.1 | POST "/api/generate"
I talked to the development team; they just released https://github.com/ipex-llm/ipex-llm/releases/download/v2.3.0-nightly/ollama-ipex-llm-2.3.0b20250725-win.zip. Could you please install it and check whether it works?
Looks like it is the same error.
time=2025-08-05T00:08:42.168-03:00 level=INFO source=server.go:135 msg="system memory" total="22.9 GiB" free="16.9 GiB" free_swap="16.0 GiB"
time=2025-08-05T00:08:42.169-03:00 level=INFO source=server.go:187 msg=offload library=cpu layers.requested=-1 layers.model=49 layers.offload=0 layers.split="" memory.available="[16.9 GiB]" memory.gpu_overhead="0 B" memory.required.full="6.3 GiB" memory.required.partial="0 B" memory.required.kv="768.0 MiB" memory.required.allocations="[6.3 GiB]" memory.weights.total="5.1 GiB" memory.weights.repeating="4.5 GiB" memory.weights.nonrepeating="609.1 MiB" memory.graph.full="348.0 MiB" memory.graph.partial="916.1 MiB"
llama_model_loader: loaded meta data with 33 key-value pairs and 579 tensors from /root/.ollama/models/blobs/sha256-3d56bdc5fb9286615ef0f4ab59e1471fbe47d43d67fa7be46efb725cc9650add (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = qwen2
llama_model_loader: - kv 1: general.type str = model
llama_model_loader: - kv 2: general.name str = Qwen2.5 Coder 14B
llama_model_loader: - kv 3: general.basename str = Qwen2.5-Coder
llama_model_loader: - kv 4: general.size_label str = 14B
llama_model_loader: - kv 5: general.license str = apache-2.0
llama_model_loader: - kv 6: general.license.link str = https://huggingface.co/Qwen/Qwen2.5-C...
llama_model_loader: - kv 7: general.base_model.count u32 = 1
llama_model_loader: - kv 8: general.base_model.0.name str = Qwen2.5 14B
llama_model_loader: - kv 9: general.base_model.0.organization str = Qwen
llama_model_loader: - kv 10: general.base_model.0.repo_url str = https://huggingface.co/Qwen/Qwen2.5-14B
llama_model_loader: - kv 11: general.tags arr[str,5] = ["code", "qwen", "qwen-coder", "codeq...
llama_model_loader: - kv 12: general.languages arr[str,1] = ["en"]
llama_model_loader: - kv 13: qwen2.block_count u32 = 48
llama_model_loader: - kv 14: qwen2.context_length u32 = 32768
llama_model_loader: - kv 15: qwen2.embedding_length u32 = 5120
llama_model_loader: - kv 16: qwen2.feed_forward_length u32 = 13824
llama_model_loader: - kv 17: qwen2.attention.head_count u32 = 40
llama_model_loader: - kv 18: qwen2.attention.head_count_kv u32 = 8
llama_model_loader: - kv 19: qwen2.rope.freq_base f32 = 1000000.000000
llama_model_loader: - kv 20: qwen2.attention.layer_norm_rms_epsilon f32 = 0.000010
llama_model_loader: - kv 21: general.file_type u32 = 10
llama_model_loader: - kv 22: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 23: tokenizer.ggml.pre str = qwen2
llama_model_loader: - kv 24: tokenizer.ggml.tokens arr[str,152064] = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv 25: tokenizer.ggml.token_type arr[i32,152064] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 26: tokenizer.ggml.merges arr[str,151387] = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
llama_model_loader: - kv 27: tokenizer.ggml.eos_token_id u32 = 151645
llama_model_loader: - kv 28: tokenizer.ggml.padding_token_id u32 = 151643
llama_model_loader: - kv 29: tokenizer.ggml.bos_token_id u32 = 151643
llama_model_loader: - kv 30: tokenizer.ggml.add_bos_token bool = false
llama_model_loader: - kv 31: tokenizer.chat_template str = {%- if tools %}\n {{- '<|im_start|>...
llama_model_loader: - kv 32: general.quantization_version u32 = 2
llama_model_loader: - type f32: 241 tensors
llama_model_loader: - type q2_K: 193 tensors
llama_model_loader: - type q3_K: 96 tensors
llama_model_loader: - type q4_K: 48 tensors
llama_model_loader: - type q6_K: 1 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type = Q2_K - Medium
print_info: file size = 5.37 GiB (3.12 BPW)
load: special tokens cache size = 22
load: token to piece cache size = 0.9310 MB
print_info: arch = qwen2
print_info: vocab_only = 1
print_info: model type = ?B
print_info: model params = 14.77 B
print_info: general.name = Qwen2.5 Coder 14B
print_info: vocab type = BPE
print_info: n_vocab = 152064
print_info: n_merges = 151387
print_info: BOS token = 151643 '<|endoftext|>'
print_info: EOS token = 151645 '<|im_end|>'
print_info: EOT token = 151645 '<|im_end|>'
print_info: PAD token = 151643 '<|endoftext|>'
print_info: LF token = 198 'Ċ'
print_info: FIM PRE token = 151659 '<|fim_prefix|>'
print_info: FIM SUF token = 151661 '<|fim_suffix|>'
print_info: FIM MID token = 151660 '<|fim_middle|>'
print_info: FIM PAD token = 151662 '<|fim_pad|>'
print_info: FIM REP token = 151663 '<|repo_name|>'
print_info: FIM SEP token = 151664 '<|file_sep|>'
print_info: EOG token = 151643 '<|endoftext|>'
print_info: EOG token = 151645 '<|im_end|>'
print_info: EOG token = 151662 '<|fim_pad|>'
print_info: EOG token = 151663 '<|repo_name|>'
print_info: EOG token = 151664 '<|file_sep|>'
print_info: max token length = 256
llama_model_load: vocab only - skipping tensors
time=2025-08-05T00:08:42.350-03:00 level=INFO source=server.go:458 msg="starting llama server" cmd="/home/davi/AI/ollama-ipex/ollama-ipex-llm-2.3.0b20250725-ubuntu/ollama-bin runner --model /root/.ollama/models/blobs/sha256-3d56bdc5fb9286615ef0f4ab59e1471fbe47d43d67fa7be46efb725cc9650add --ctx-size 4096 --batch-size 512 --n-gpu-layers 999 --threads 8 --no-mmap --parallel 2 --port 38897"
time=2025-08-05T00:08:42.350-03:00 level=INFO source=sched.go:483 msg="loaded runners" count=1
time=2025-08-05T00:08:42.350-03:00 level=INFO source=server.go:618 msg="waiting for llama runner to start responding"
time=2025-08-05T00:08:42.351-03:00 level=INFO source=server.go:652 msg="waiting for server to become available" status="llm server not responding"
using override patterns: []
time=2025-08-05T00:08:42.397-03:00 level=INFO source=runner.go:851 msg="starting go runner"
load_backend: loaded SYCL backend from /home/davi/AI/ollama-ipex/ollama-ipex-llm-2.3.0b20250725-ubuntu/libggml-sycl.so
load_backend: loaded CPU backend from /home/davi/AI/ollama-ipex/ollama-ipex-llm-2.3.0b20250725-ubuntu/libggml-cpu-haswell.so
time=2025-08-05T00:08:42.446-03:00 level=INFO source=ggml.go:104 msg=system CPU.0.SSE3=1 CPU.0.SSSE3=1 CPU.0.AVX=1 CPU.0.AVX2=1 CPU.0.F16C=1 CPU.0.FMA=1 CPU.0.BMI2=1 CPU.0.LLAMAFILE=1 CPU.0.OPENMP=1 CPU.0.AARCH64_REPACK=1 CPU.1.LLAMAFILE=1 compiler=cgo(gcc)
time=2025-08-05T00:08:42.446-03:00 level=INFO source=runner.go:911 msg="Server listening on 127.0.0.1:38897"
llama_model_load_from_file_impl: using device SYCL0 (Intel(R) Arc(TM) B580 Graphics) - 11241 MiB free
llama_model_loader: loaded meta data with 33 key-value pairs and 579 tensors from /root/.ollama/models/blobs/sha256-3d56bdc5fb9286615ef0f4ab59e1471fbe47d43d67fa7be46efb725cc9650add (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = qwen2
llama_model_loader: - kv 1: general.type str = model
llama_model_loader: - kv 2: general.name str = Qwen2.5 Coder 14B
llama_model_loader: - kv 3: general.basename str = Qwen2.5-Coder
llama_model_loader: - kv 4: general.size_label str = 14B
llama_model_loader: - kv 5: general.license str = apache-2.0
llama_model_loader: - kv 6: general.license.link str = https://huggingface.co/Qwen/Qwen2.5-C...
llama_model_loader: - kv 7: general.base_model.count u32 = 1
llama_model_loader: - kv 8: general.base_model.0.name str = Qwen2.5 14B
llama_model_loader: - kv 9: general.base_model.0.organization str = Qwen
llama_model_loader: - kv 10: general.base_model.0.repo_url str = https://huggingface.co/Qwen/Qwen2.5-14B
llama_model_loader: - kv 11: general.tags arr[str,5] = ["code", "qwen", "qwen-coder", "codeq...
llama_model_loader: - kv 12: general.languages arr[str,1] = ["en"]
llama_model_loader: - kv 13: qwen2.block_count u32 = 48
llama_model_loader: - kv 14: qwen2.context_length u32 = 32768
llama_model_loader: - kv 15: qwen2.embedding_length u32 = 5120
llama_model_loader: - kv 16: qwen2.feed_forward_length u32 = 13824
llama_model_loader: - kv 17: qwen2.attention.head_count u32 = 40
llama_model_loader: - kv 18: qwen2.attention.head_count_kv u32 = 8
llama_model_loader: - kv 19: qwen2.rope.freq_base f32 = 1000000.000000
llama_model_loader: - kv 20: qwen2.attention.layer_norm_rms_epsilon f32 = 0.000010
llama_model_loader: - kv 21: general.file_type u32 = 10
llama_model_loader: - kv 22: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 23: tokenizer.ggml.pre str = qwen2
llama_model_loader: - kv 24: tokenizer.ggml.tokens arr[str,152064] = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv 25: tokenizer.ggml.token_type arr[i32,152064] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 26: tokenizer.ggml.merges arr[str,151387] = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
llama_model_loader: - kv 27: tokenizer.ggml.eos_token_id u32 = 151645
llama_model_loader: - kv 28: tokenizer.ggml.padding_token_id u32 = 151643
llama_model_loader: - kv 29: tokenizer.ggml.bos_token_id u32 = 151643
llama_model_loader: - kv 30: tokenizer.ggml.add_bos_token bool = false
llama_model_loader: - kv 31: tokenizer.chat_template str = {%- if tools %}\n {{- '<|im_start|>...
llama_model_loader: - kv 32: general.quantization_version u32 = 2
llama_model_loader: - type f32: 241 tensors
llama_model_loader: - type q2_K: 193 tensors
llama_model_loader: - type q3_K: 96 tensors
llama_model_loader: - type q4_K: 48 tensors
llama_model_loader: - type q6_K: 1 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type = Q2_K - Medium
print_info: file size = 5.37 GiB (3.12 BPW)
time=2025-08-05T00:08:42.602-03:00 level=INFO source=server.go:652 msg="waiting for server to become available" status="llm server loading model"
load: special tokens cache size = 22
load: token to piece cache size = 0.9310 MB
print_info: arch = qwen2
print_info: vocab_only = 0
print_info: n_ctx_train = 32768
print_info: n_embd = 5120
print_info: n_layer = 48
print_info: n_head = 40
print_info: n_head_kv = 8
print_info: n_rot = 128
print_info: n_swa = 0
print_info: n_swa_pattern = 1
print_info: n_embd_head_k = 128
print_info: n_embd_head_v = 128
print_info: n_gqa = 5
print_info: n_embd_k_gqa = 1024
print_info: n_embd_v_gqa = 1024
print_info: f_norm_eps = 0.0e+00
print_info: f_norm_rms_eps = 1.0e-05
print_info: f_clamp_kqv = 0.0e+00
print_info: f_max_alibi_bias = 0.0e+00
print_info: f_logit_scale = 0.0e+00
print_info: f_attn_scale = 0.0e+00
print_info: n_ff = 13824
print_info: n_expert = 0
print_info: n_expert_used = 0
print_info: causal attn = 1
print_info: pooling type = -1
print_info: rope type = 2
print_info: rope scaling = linear
print_info: freq_base_train = 1000000.0
print_info: freq_scale_train = 1
print_info: n_ctx_orig_yarn = 32768
print_info: rope_finetuned = unknown
print_info: ssm_d_conv = 0
print_info: ssm_d_inner = 0
print_info: ssm_d_state = 0
print_info: ssm_dt_rank = 0
print_info: ssm_dt_b_c_rms = 0
print_info: model type = 14B
print_info: model params = 14.77 B
print_info: general.name = Qwen2.5 Coder 14B
print_info: vocab type = BPE
print_info: n_vocab = 152064
print_info: n_merges = 151387
print_info: BOS token = 151643 '<|endoftext|>'
print_info: EOS token = 151645 '<|im_end|>'
print_info: EOT token = 151645 '<|im_end|>'
print_info: PAD token = 151643 '<|endoftext|>'
print_info: LF token = 198 'Ċ'
print_info: FIM PRE token = 151659 '<|fim_prefix|>'
print_info: FIM SUF token = 151661 '<|fim_suffix|>'
print_info: FIM MID token = 151660 '<|fim_middle|>'
print_info: FIM PAD token = 151662 '<|fim_pad|>'
print_info: FIM REP token = 151663 '<|repo_name|>'
print_info: FIM SEP token = 151664 '<|file_sep|>'
print_info: EOG token = 151643 '<|endoftext|>'
print_info: EOG token = 151645 '<|im_end|>'
print_info: EOG token = 151662 '<|fim_pad|>'
print_info: EOG token = 151663 '<|repo_name|>'
print_info: EOG token = 151664 '<|file_sep|>'
print_info: max token length = 256
load_tensors: loading model tensors, this can take a while... (mmap = false)
load_tensors: offloading 48 repeating layers to GPU
load_tensors: offloading output layer to GPU
load_tensors: offloaded 49/49 layers to GPU
load_tensors: CPU model buffer size = 243.63 MiB
load_tensors: SYCL0 model buffer size = 5288.54 MiB
ggml-backend.cpp:265: GGML_ASSERT(offset + size <= ggml_nbytes(tensor) && "tensor write out of bounds") failed
[New LWP 10361]
[New LWP 10353]
[New LWP 10352]
[New LWP 10351]
[New LWP 10350]
[New LWP 10349]
[New LWP 10348]
[New LWP 10347]
[New LWP 10346]
[New LWP 10345]
warning: Missing auto-load script at offset 0 in section .debug_gdb_scripts
of file /home/davi/AI/ollama-ipex/ollama-ipex-llm-2.3.0b20250725-ubuntu/ollama-bin.
Use `info auto-load python-scripts [REGEXP]' to list them.
Function(s) ^std::(move|forward|as_const|(__)?addressof) will be skipped when stepping.
Function(s) ^std::(shared|unique)_ptr<.*>::(get|operator) will be skipped when stepping.
Function(s) ^std::(basic_string|vector|array|deque|(forward_)?list|(unordered_|flat_)?(multi)?(map|set)|span)<.*>::(c?r?(begin|end)|front|back|data|size|empty) will be skipped when stepping.
Function(s) ^std::(basic_string|vector|array|deque|span)<.*>::operator.] will be skipped when stepping.
time=2025-08-05T00:08:51.072-03:00 level=INFO source=server.go:652 msg="waiting for server to become available" status="llm server not responding"
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
runtime.futex () at /usr/local/go/src/runtime/sys_linux_amd64.s:558
warning: 558 /usr/local/go/src/runtime/sys_linux_amd64.s: No such file or directory
#0 runtime.futex () at /usr/local/go/src/runtime/sys_linux_amd64.s:558
558 in /usr/local/go/src/runtime/sys_linux_amd64.s
#1 0x000000000044cff0 in runtime.futexsleep (addr=0xfffffffffffffe00, val=0, ns=4786851) at /usr/local/go/src/runtime/os_linux.go:75
warning: 75 /usr/local/go/src/runtime/os_linux.go: No such file or directory
#2 0x0000000000429087 in runtime.notesleep (n=0x1ecf900 <runtime.m0+320>) at /usr/local/go/src/runtime/lock_futex.go:47
warning: 47 /usr/local/go/src/runtime/lock_futex.go: No such file or directory
#3 0x00000000004587ac in runtime.mPark () at /usr/local/go/src/runtime/proc.go:1887
warning: 1887 /usr/local/go/src/runtime/proc.go: No such file or directory
#4 runtime.stopm () at /usr/local/go/src/runtime/proc.go:2910
2910 in /usr/local/go/src/runtime/proc.go
#5 0x000000000045a27c in runtime.findRunnable (gp=<optimized out>, inheritTime=<optimized out>, tryWakeP=<optimized out>) at /usr/local/go/src/runtime/proc.go:3647
3647 in /usr/local/go/src/runtime/proc.go
#6 0x000000000045b371 in runtime.schedule () at /usr/local/go/src/runtime/proc.go:4020
4020 in /usr/local/go/src/runtime/proc.go
#7 0x000000000045b825 in runtime.park_m (gp=0xc0001028c0) at /usr/local/go/src/runtime/proc.go:4144
4144 in /usr/local/go/src/runtime/proc.go
#8 0x000000000048cc6e in runtime.mcall () at /usr/local/go/src/runtime/asm_amd64.s:459
warning: 459 /usr/local/go/src/runtime/asm_amd64.s: No such file or directory
#9 0x00007ffd3dfa8388 in ?? ()
#10 0x000000000049167f in runtime.newproc (fn=0x48cb6f <runtime.rt0_go+303>) at <autogenerated>:1
warning: 1 <autogenerated>: No such file or directory
#11 0x000000000048cbe5 in runtime.mstart () at /usr/local/go/src/runtime/asm_amd64.s:395
warning: 395 /usr/local/go/src/runtime/asm_amd64.s: No such file or directory
#12 0x000000000048cb6f in runtime.rt0_go () at /usr/local/go/src/runtime/asm_amd64.s:358
358 in /usr/local/go/src/runtime/asm_amd64.s
#13 0x0000000000000011 in ?? ()
#14 0x00007ffd3dfa84e8 in ?? ()
#15 0x0000000000000006 in ?? ()
#16 0x0000000000000011 in ?? ()
#17 0x00007ffd3dfa84e8 in ?? ()
#18 0x00007cf4cc62a578 in __libc_start_call_main (main=0x0, argc=0, argv=0x0) at ../sysdeps/nptl/libc_start_call_main.h:58
warning: 58 ../sysdeps/nptl/libc_start_call_main.h: No such file or directory
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
[Inferior 1 (process 10344) detached]
time=2025-08-05T00:08:52.415-03:00 level=INFO source=server.go:652 msg="waiting for server to become available" status="llm server loading model"
SIGABRT: abort
PC=0x7cf4cc6a49bc m=4 sigcode=18446744073709551610
signal arrived during cgo execution
goroutine 13 gp=0xc000582a80 m=4 mp=0xc00008b808 [syscall]:
runtime.cgocall(0x1148360, 0xc00059b890)
/usr/local/go/src/runtime/cgocall.go:167 +0x4b fp=0xc00059b868 sp=0xc00059b830 pc=0x48398b
github.com/ollama/ollama/llama._Cfunc_llama_model_load_from_file(0x7cf464000d50, {0x0, 0x0, 0x3e7, 0x1, 0x0, 0x0, 0x11479d0, 0xc000352040, 0x0, ...})
_cgo_gotypes.go:876 +0x47 fp=0xc00059b890 sp=0xc00059b868 pc=0x833dc7
github.com/ollama/ollama/llama.LoadModelFromFile.func4(...)
/home/arda/ruonan/ollama-internal/llama/llama.go:296
github.com/ollama/ollama/llama.LoadModelFromFile({0x7ffd3dfa934f, 0x62}, {0x3e7, 0x0, 0x0, {0x0, 0x0, 0x0}, 0xc000503860, 0x0, ...})
/home/arda/ruonan/ollama-internal/llama/llama.go:296 +0x4d7 fp=0xc00059bd80 sp=0xc00059b890 pc=0x835ff7
github.com/ollama/ollama/runner/llamarunner.(*Server).loadModel(0xc000114360, {0x3e7, 0x0, 0x0, {0x0, 0x0, 0x0}, 0xc000503860, 0x0, {0x0, ...}}, ...)
/home/arda/ruonan/ollama-internal/runner/llamarunner/runner.go:749 +0x9e fp=0xc00059bee8 sp=0xc00059bd80 pc=0x8f2f3e
github.com/ollama/ollama/runner/llamarunner.Execute.gowrap1()
/home/arda/ruonan/ollama-internal/runner/llamarunner/runner.go:885 +0x115 fp=0xc00059bfe0 sp=0xc00059bee8 pc=0x8f4a15
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00059bfe8 sp=0xc00059bfe0 pc=0x48eca1
created by github.com/ollama/ollama/runner/llamarunner.Execute in goroutine 1
/home/arda/ruonan/ollama-internal/runner/llamarunner/runner.go:885 +0xd2a
goroutine 1 gp=0xc000002380 m=nil [IO wait]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc0004b55b8 sp=0xc0004b5598 pc=0x486e0e
runtime.netpollblock(0xc0004b5608?, 0x420706?, 0x0?)
/usr/local/go/src/runtime/netpoll.go:575 +0xf7 fp=0xc0004b55f0 sp=0xc0004b55b8 pc=0x44c2d7
internal/poll.runtime_pollWait(0x7cf4ccc93eb0, 0x72)
/usr/local/go/src/runtime/netpoll.go:351 +0x85 fp=0xc0004b5610 sp=0xc0004b55f0 pc=0x486025
internal/poll.(*pollDesc).wait(0xc000055700?, 0x900000036?, 0x0)
/usr/local/go/src/internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc0004b5638 sp=0xc0004b5610 pc=0x50e347
internal/poll.(*pollDesc).waitRead(...)
/usr/local/go/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Accept(0xc000055700)
/usr/local/go/src/internal/poll/fd_unix.go:620 +0x295 fp=0xc0004b56e0 sp=0xc0004b5638 pc=0x513715
net.(*netFD).accept(0xc000055700)
/usr/local/go/src/net/fd_unix.go:172 +0x29 fp=0xc0004b5798 sp=0xc0004b56e0 pc=0x585d89
net.(*TCPListener).accept(0xc00052ef40)
/usr/local/go/src/net/tcpsock_posix.go:159 +0x1b fp=0xc0004b57e8 sp=0xc0004b5798 pc=0x59b6fb
net.(*TCPListener).Accept(0xc00052ef40)
/usr/local/go/src/net/tcpsock.go:380 +0x30 fp=0xc0004b5818 sp=0xc0004b57e8 pc=0x59a5b0
net/http.(*onceCloseListener).Accept(0xc000114d80?)
<autogenerated>:1 +0x24 fp=0xc0004b5830 sp=0xc0004b5818 pc=0x7b25a4
net/http.(*Server).Serve(0xc000207500, {0x15e56c8, 0xc00052ef40})
/usr/local/go/src/net/http/server.go:3424 +0x30c fp=0xc0004b5960 sp=0xc0004b5830 pc=0x789dcc
github.com/ollama/ollama/runner/llamarunner.Execute({0xc000034140, 0xf, 0x10})
/home/arda/ruonan/ollama-internal/runner/llamarunner/runner.go:912 +0x11e9 fp=0xc0004b5d08 sp=0xc0004b5960 pc=0x8f4609
github.com/ollama/ollama/runner.Execute({0xc000034130?, 0x0?, 0x0?})
/home/arda/ruonan/ollama-internal/runner/runner.go:22 +0xd4 fp=0xc0004b5d30 sp=0xc0004b5d08 pc=0x979374
github.com/ollama/ollama/cmd.NewCLI.func2(0xc000207200?, {0x140da22?, 0x4?, 0x140da26?})
/home/arda/ruonan/ollama-internal/cmd/cmd.go:1529 +0x45 fp=0xc0004b5d58 sp=0xc0004b5d30 pc=0x10d5b45
github.com/spf13/cobra.(*Command).execute(0xc00011af08, {0xc0004e0e10, 0xf, 0xf})
/home/arda/go/pkg/mod/github.com/spf13/[email protected]/command.go:940 +0x894 fp=0xc0004b5e78 sp=0xc0004b5d58 pc=0x5ff694
github.com/spf13/cobra.(*Command).ExecuteC(0xc0004baf08)
/home/arda/go/pkg/mod/github.com/spf13/[email protected]/command.go:1068 +0x3a5 fp=0xc0004b5f30 sp=0xc0004b5e78 pc=0x5ffee5
github.com/spf13/cobra.(*Command).Execute(...)
/home/arda/go/pkg/mod/github.com/spf13/[email protected]/command.go:992
github.com/spf13/cobra.(*Command).ExecuteContext(...)
/home/arda/go/pkg/mod/github.com/spf13/[email protected]/command.go:985
main.main()
/home/arda/ruonan/ollama-internal/main.go:12 +0x4d fp=0xc0004b5f50 sp=0xc0004b5f30 pc=0x10d65cd
runtime.main()
/usr/local/go/src/runtime/proc.go:283 +0x28b fp=0xc0004b5fe0 sp=0xc0004b5f50 pc=0x45390b
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0004b5fe8 sp=0xc0004b5fe0 pc=0x48eca1
goroutine 2 gp=0xc000002e00 m=nil [force gc (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000084fa8 sp=0xc000084f88 pc=0x486e0e
runtime.goparkunlock(...)
/usr/local/go/src/runtime/proc.go:441
runtime.forcegchelper()
/usr/local/go/src/runtime/proc.go:348 +0xb3 fp=0xc000084fe0 sp=0xc000084fa8 pc=0x453c53
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000084fe8 sp=0xc000084fe0 pc=0x48eca1
created by runtime.init.7 in goroutine 1
/usr/local/go/src/runtime/proc.go:336 +0x1a
goroutine 3 gp=0xc000003340 m=nil [GC sweep wait]:
runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000085780 sp=0xc000085760 pc=0x486e0e
runtime.goparkunlock(...)
/usr/local/go/src/runtime/proc.go:441
runtime.bgsweep(0xc0000b0000)
/usr/local/go/src/runtime/mgcsweep.go:316 +0xdf fp=0xc0000857c8 sp=0xc000085780 pc=0x43e45f
runtime.gcenable.gowrap1()
/usr/local/go/src/runtime/mgc.go:204 +0x25 fp=0xc0000857e0 sp=0xc0000857c8 pc=0x4328c5
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0000857e8 sp=0xc0000857e0 pc=0x48eca1
created by runtime.gcenable in goroutine 1
/usr/local/go/src/runtime/mgc.go:204 +0x66
goroutine 4 gp=0xc000003500 m=nil [GC scavenge wait]:
runtime.gopark(0x10000?, 0x15d2cc8?, 0x0?, 0x0?, 0x0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000085f78 sp=0xc000085f58 pc=0x486e0e
runtime.goparkunlock(...)
/usr/local/go/src/runtime/proc.go:441
runtime.(*scavengerState).park(0x1ecc9a0)
/usr/local/go/src/runtime/mgcscavenge.go:425 +0x49 fp=0xc000085fa8 sp=0xc000085f78 pc=0x43bea9
runtime.bgscavenge(0xc0000b0000)
/usr/local/go/src/runtime/mgcscavenge.go:658 +0x59 fp=0xc000085fc8 sp=0xc000085fa8 pc=0x43c439
runtime.gcenable.gowrap2()
/usr/local/go/src/runtime/mgc.go:205 +0x25 fp=0xc000085fe0 sp=0xc000085fc8 pc=0x432865
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000085fe8 sp=0xc000085fe0 pc=0x48eca1
created by runtime.gcenable in goroutine 1
/usr/local/go/src/runtime/mgc.go:205 +0xa5
goroutine 5 gp=0xc000003dc0 m=nil [finalizer wait]:
runtime.gopark(0x1b8?, 0xc000002380?, 0x1?, 0x23?, 0xc000084688?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000084630 sp=0xc000084610 pc=0x486e0e
runtime.runfinq()
/usr/local/go/src/runtime/mfinal.go:196 +0x107 fp=0xc0000847e0 sp=0xc000084630 pc=0x431887
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0000847e8 sp=0xc0000847e0 pc=0x48eca1
created by runtime.createfing in goroutine 1
/usr/local/go/src/runtime/mfinal.go:166 +0x3d
goroutine 6 gp=0xc0001e48c0 m=nil [chan receive]:
runtime.gopark(0xc0001e1900?, 0xc000116018?, 0x60?, 0x67?, 0x56cac8?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000086718 sp=0xc0000866f8 pc=0x486e0e
runtime.chanrecv(0xc0000be310, 0x0, 0x1)
/usr/local/go/src/runtime/chan.go:664 +0x445 fp=0xc000086790 sp=0xc000086718 pc=0x4232a5
runtime.chanrecv1(0x0?, 0x0?)
/usr/local/go/src/runtime/chan.go:506 +0x12 fp=0xc0000867b8 sp=0xc000086790 pc=0x422e32
runtime.unique_runtime_registerUniqueMapCleanup.func2(...)
/usr/local/go/src/runtime/mgc.go:1796
runtime.unique_runtime_registerUniqueMapCleanup.gowrap1()
/usr/local/go/src/runtime/mgc.go:1799 +0x2f fp=0xc0000867e0 sp=0xc0000867b8 pc=0x435a0f
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0000867e8 sp=0xc0000867e0 pc=0x48eca1
created by unique.runtime_registerUniqueMapCleanup in goroutine 1
/usr/local/go/src/runtime/mgc.go:1794 +0x79
goroutine 7 gp=0xc0001e4e00 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000086f38 sp=0xc000086f18 pc=0x486e0e
runtime.gcBgMarkWorker(0xc0000bf8f0)
/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc000086fc8 sp=0xc000086f38 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc000086fe0 sp=0xc000086fc8 pc=0x434c05
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000086fe8 sp=0xc000086fe0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/usr/local/go/src/runtime/mgc.go:1339 +0x105
goroutine 8 gp=0xc0001e4fc0 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000087738 sp=0xc000087718 pc=0x486e0e
runtime.gcBgMarkWorker(0xc0000bf8f0)
/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc0000877c8 sp=0xc000087738 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc0000877e0 sp=0xc0000877c8 pc=0x434c05
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0000877e8 sp=0xc0000877e0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/usr/local/go/src/runtime/mgc.go:1339 +0x105
goroutine 18 gp=0xc000102380 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000080738 sp=0xc000080718 pc=0x486e0e
runtime.gcBgMarkWorker(0xc0000bf8f0)
/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc0000807c8 sp=0xc000080738 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc0000807e0 sp=0xc0000807c8 pc=0x434c05
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0000807e8 sp=0xc0000807e0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/usr/local/go/src/runtime/mgc.go:1339 +0x105
goroutine 34 gp=0xc000504000 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc00050a738 sp=0xc00050a718 pc=0x486e0e
runtime.gcBgMarkWorker(0xc0000bf8f0)
/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc00050a7c8 sp=0xc00050a738 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc00050a7e0 sp=0xc00050a7c8 pc=0x434c05
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00050a7e8 sp=0xc00050a7e0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/usr/local/go/src/runtime/mgc.go:1339 +0x105
goroutine 35 gp=0xc0005041c0 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc00050af38 sp=0xc00050af18 pc=0x486e0e
runtime.gcBgMarkWorker(0xc0000bf8f0)
/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc00050afc8 sp=0xc00050af38 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc00050afe0 sp=0xc00050afc8 pc=0x434c05
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00050afe8 sp=0xc00050afe0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/usr/local/go/src/runtime/mgc.go:1339 +0x105
goroutine 36 gp=0xc000504380 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc00050b738 sp=0xc00050b718 pc=0x486e0e
runtime.gcBgMarkWorker(0xc0000bf8f0)
/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc00050b7c8 sp=0xc00050b738 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc00050b7e0 sp=0xc00050b7c8 pc=0x434c05
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00050b7e8 sp=0xc00050b7e0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/usr/local/go/src/runtime/mgc.go:1339 +0x105
goroutine 37 gp=0xc000504540 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc00050bf38 sp=0xc00050bf18 pc=0x486e0e
runtime.gcBgMarkWorker(0xc0000bf8f0)
/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc00050bfc8 sp=0xc00050bf38 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc00050bfe0 sp=0xc00050bfc8 pc=0x434c05
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00050bfe8 sp=0xc00050bfe0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/usr/local/go/src/runtime/mgc.go:1339 +0x105
goroutine 9 gp=0xc0001e5180 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000087f38 sp=0xc000087f18 pc=0x486e0e
runtime.gcBgMarkWorker(0xc0000bf8f0)
/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc000087fc8 sp=0xc000087f38 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc000087fe0 sp=0xc000087fc8 pc=0x434c05
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000087fe8 sp=0xc000087fe0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/usr/local/go/src/runtime/mgc.go:1339 +0x105
goroutine 19 gp=0xc000102540 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000080f38 sp=0xc000080f18 pc=0x486e0e
runtime.gcBgMarkWorker(0xc0000bf8f0)
/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc000080fc8 sp=0xc000080f38 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc000080fe0 sp=0xc000080fc8 pc=0x434c05
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000080fe8 sp=0xc000080fe0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/usr/local/go/src/runtime/mgc.go:1339 +0x105
goroutine 10 gp=0xc0001e5340 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000506738 sp=0xc000506718 pc=0x486e0e
runtime.gcBgMarkWorker(0xc0000bf8f0)
/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc0005067c8 sp=0xc000506738 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc0005067e0 sp=0xc0005067c8 pc=0x434c05
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0005067e8 sp=0xc0005067e0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/usr/local/go/src/runtime/mgc.go:1339 +0x105
goroutine 38 gp=0xc000504700 m=nil [GC worker (idle)]:
runtime.gopark(0x9307403b5e?, 0x0?, 0x0?, 0x0?, 0x0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc00050c738 sp=0xc00050c718 pc=0x486e0e
runtime.gcBgMarkWorker(0xc0000bf8f0)
/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc00050c7c8 sp=0xc00050c738 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc00050c7e0 sp=0xc00050c7c8 pc=0x434c05
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00050c7e8 sp=0xc00050c7e0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/usr/local/go/src/runtime/mgc.go:1339 +0x105
goroutine 11 gp=0xc0001e5500 m=nil [GC worker (idle)]:
runtime.gopark(0x93073f2870?, 0x0?, 0x0?, 0x0?, 0x0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000506f38 sp=0xc000506f18 pc=0x486e0e
runtime.gcBgMarkWorker(0xc0000bf8f0)
/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc000506fc8 sp=0xc000506f38 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc000506fe0 sp=0xc000506fc8 pc=0x434c05
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000506fe8 sp=0xc000506fe0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/usr/local/go/src/runtime/mgc.go:1339 +0x105
goroutine 20 gp=0xc000102700 m=nil [GC worker (idle)]:
runtime.gopark(0x93073f2667?, 0x0?, 0x0?, 0x0?, 0x0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000081738 sp=0xc000081718 pc=0x486e0e
runtime.gcBgMarkWorker(0xc0000bf8f0)
/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc0000817c8 sp=0xc000081738 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc0000817e0 sp=0xc0000817c8 pc=0x434c05
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0000817e8 sp=0xc0000817e0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/usr/local/go/src/runtime/mgc.go:1339 +0x105
goroutine 39 gp=0xc0005048c0 m=nil [GC worker (idle)]:
runtime.gopark(0x1f7a980?, 0x1?, 0x4f?, 0x4f?, 0x0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc00050cf38 sp=0xc00050cf18 pc=0x486e0e
runtime.gcBgMarkWorker(0xc0000bf8f0)
/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc00050cfc8 sp=0xc00050cf38 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc00050cfe0 sp=0xc00050cfc8 pc=0x434c05
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00050cfe8 sp=0xc00050cfe0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/usr/local/go/src/runtime/mgc.go:1339 +0x105
goroutine 40 gp=0xc000504a80 m=nil [GC worker (idle)]:
runtime.gopark(0x93073f4927?, 0x1?, 0xf3?, 0x41?, 0x0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc00050d738 sp=0xc00050d718 pc=0x486e0e
runtime.gcBgMarkWorker(0xc0000bf8f0)
/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc00050d7c8 sp=0xc00050d738 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc00050d7e0 sp=0xc00050d7c8 pc=0x434c05
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00050d7e8 sp=0xc00050d7e0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/usr/local/go/src/runtime/mgc.go:1339 +0x105
goroutine 12 gp=0xc0001e56c0 m=nil [GC worker (idle)]:
runtime.gopark(0x93073f619d?, 0x0?, 0x0?, 0x0?, 0x0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000507738 sp=0xc000507718 pc=0x486e0e
runtime.gcBgMarkWorker(0xc0000bf8f0)
/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc0005077c8 sp=0xc000507738 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc0005077e0 sp=0xc0005077c8 pc=0x434c05
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0005077e8 sp=0xc0005077e0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/usr/local/go/src/runtime/mgc.go:1339 +0x105
goroutine 14 gp=0xc000582c40 m=nil [sync.WaitGroup.Wait]:
runtime.gopark(0x0?, 0x0?, 0x60?, 0xc0?, 0x0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000083e20 sp=0xc000083e00 pc=0x486e0e
runtime.goparkunlock(...)
/usr/local/go/src/runtime/proc.go:441
runtime.semacquire1(0xc000114368, 0x0, 0x1, 0x0, 0x18)
/usr/local/go/src/runtime/sema.go:188 +0x21d fp=0xc000083e88 sp=0xc000083e20 pc=0x466dfd
sync.runtime_SemacquireWaitGroup(0x0?)
/usr/local/go/src/runtime/sema.go:110 +0x25 fp=0xc000083ec0 sp=0xc000083e88 pc=0x488805
sync.(*WaitGroup).Wait(0x0?)
/usr/local/go/src/sync/waitgroup.go:118 +0x48 fp=0xc000083ee8 sp=0xc000083ec0 pc=0x49a5e8
github.com/ollama/ollama/runner/llamarunner.(*Server).run(0xc000114360, {0x15e7cf0, 0xc000514eb0})
/home/arda/ruonan/ollama-internal/runner/llamarunner/runner.go:314 +0x47 fp=0xc000083fb8 sp=0xc000083ee8 pc=0x8efc67
github.com/ollama/ollama/runner/llamarunner.Execute.gowrap2()
/home/arda/ruonan/ollama-internal/runner/llamarunner/runner.go:892 +0x28 fp=0xc000083fe0 sp=0xc000083fb8 pc=0x8f48c8
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000083fe8 sp=0xc000083fe0 pc=0x48eca1
created by github.com/ollama/ollama/runner/llamarunner.Execute in goroutine 1
/home/arda/ruonan/ollama-internal/runner/llamarunner/runner.go:892 +0xe05
goroutine 64 gp=0xc000582e00 m=nil [IO wait]:
runtime.gopark(0x511945?, 0xc000055900?, 0x40?, 0xfa?, 0xb?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc00014f948 sp=0xc00014f928 pc=0x486e0e
runtime.netpollblock(0x4aa8b8?, 0x420706?, 0x0?)
/usr/local/go/src/runtime/netpoll.go:575 +0xf7 fp=0xc00014f980 sp=0xc00014f948 pc=0x44c2d7
internal/poll.runtime_pollWait(0x7cf4ccc93a50, 0x72)
/usr/local/go/src/runtime/netpoll.go:351 +0x85 fp=0xc00014f9a0 sp=0xc00014f980 pc=0x486025
internal/poll.(*pollDesc).wait(0xc000055900?, 0xc000148000?, 0x0)
/usr/local/go/src/internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc00014f9c8 sp=0xc00014f9a0 pc=0x50e347
internal/poll.(*pollDesc).waitRead(...)
/usr/local/go/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0xc000055900, {0xc000148000, 0x1000, 0x1000})
/usr/local/go/src/internal/poll/fd_unix.go:165 +0x27a fp=0xc00014fa60 sp=0xc00014f9c8 pc=0x50f63a
net.(*netFD).Read(0xc000055900, {0xc000148000?, 0xc00014fad0?, 0x50e805?})
/usr/local/go/src/net/fd_posix.go:55 +0x25 fp=0xc00014faa8 sp=0xc00014fa60 pc=0x583de5
net.(*conn).Read(0xc000088928, {0xc000148000?, 0x0?, 0x0?})
/usr/local/go/src/net/net.go:194 +0x45 fp=0xc00014faf0 sp=0xc00014faa8 pc=0x592185
net/http.(*connReader).Read(0xc000119140, {0xc000148000, 0x1000, 0x1000})
/usr/local/go/src/net/http/server.go:798 +0x159 fp=0xc00014fb40 sp=0xc00014faf0 pc=0x77ec79
bufio.(*Reader).fill(0xc000110660)
/usr/local/go/src/bufio/bufio.go:113 +0x103 fp=0xc00014fb78 sp=0xc00014fb40 pc=0x5a9903
bufio.(*Reader).Peek(0xc000110660, 0x4)
/usr/local/go/src/bufio/bufio.go:152 +0x53 fp=0xc00014fb98 sp=0xc00014fb78 pc=0x5a9a33
net/http.(*conn).serve(0xc000114d80, {0x15e7cb8, 0xc000118720})
/usr/local/go/src/net/http/server.go:2137 +0x785 fp=0xc00014ffb8 sp=0xc00014fb98 pc=0x784a65
net/http.(*Server).Serve.gowrap3()
/usr/local/go/src/net/http/server.go:3454 +0x28 fp=0xc00014ffe0 sp=0xc00014ffb8 pc=0x78a1c8
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00014ffe8 sp=0xc00014ffe0 pc=0x48eca1
created by net/http.(*Server).Serve in goroutine 1
/usr/local/go/src/net/http/server.go:3454 +0x485
rax 0x0
rbx 0x286b
rcx 0x7cf4cc6a49bc
rdx 0x6
rdi 0x2868
rsi 0x286b
rbp 0x7cf46e9fb6e0
rsp 0x7cf46e9fb6a0
r8 0x0
r9 0x0
r10 0xf11ed7d
r11 0x246
r12 0x6
r13 0x1652e66
r14 0x16
r15 0xaf0000
rip 0x7cf4cc6a49bc
rflags 0x246
cs 0x33
fs 0x0
gs 0x0
time=2025-08-05T00:08:52.591-03:00 level=ERROR source=server.go:484 msg="llama runner terminated" error="exit status 2"
time=2025-08-05T00:08:52.666-03:00 level=ERROR source=sched.go:489 msg="error loading llama server" error="llama runner process has terminated: GGML_ASSERT(offset + size <= ggml_nbytes(tensor) && \"tensor write out of bounds\") failed"
[GIN] 2025/08/05 - 00:08:52 | 500 | 10.562076295s | 127.0.0.1 | POST "/api/generate"
The same model runs correctly on 2.2.0, so a change made between 2.2.0 and 2.3.0b20250802 introduced this regression.
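For anyone triaging a dump like the one above, here is a minimal sketch of how the relevant parts could be pulled out of the log. It is not part of Ollama or ipex-llm; `summarize_crash` is a hypothetical helper name, and it assumes goroutine headers have the `goroutine <id> ... [<state>]:` shape and the assertion has the `GGML_ASSERT(...) failed` shape seen in this log.

```python
import re


def summarize_crash(log_text):
    """Return (assertion text or None, {goroutine state: [goroutine ids]})."""
    # Goroutine headers in this log look like:
    #   goroutine 13 gp=0x... m=4 mp=0x... [syscall]:
    header = re.compile(r"^goroutine (\d+) .*\[([^\]]+)\]:\s*$", re.MULTILINE)
    states = {}
    for gid, state in header.findall(log_text):
        states.setdefault(state, []).append(int(gid))

    # The failed assertion is reported as: GGML_ASSERT(<condition>) failed
    match = re.search(r"GGML_ASSERT\((.*?)\) failed", log_text)
    assertion = match.group(1) if match else None
    return assertion, states


if __name__ == "__main__":
    # Shortened, hand-written sample in the same shape as the log above.
    sample = (
        "goroutine 13 gp=0xc000582a80 m=4 mp=0xc00008b808 [syscall]:\n"
        "goroutine 1 gp=0xc000002380 m=nil [IO wait]:\n"
        "goroutine 7 gp=0xc0001e4e00 m=nil [GC worker (idle)]:\n"
        'error="GGML_ASSERT(offset + size <= ggml_nbytes(tensor)) failed"\n'
    )
    assertion, states = summarize_crash(sample)
    print("assertion:", assertion)
    for state, ids in states.items():
        print(state, ids)
```

In the log above, the only goroutine in the `[syscall]` state is goroutine 13, which is inside the cgo call `llama_model_load_from_file`, so the abort happens during model loading rather than during generation. The remaining goroutines are idle GC workers or are waiting on I/O.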