llama.cpp
Eval bug: context shift is disabled
Name and Version
./llama.cpp/build/bin/llama-server \
-m /models/DeepSeek-R1-UD-IQ1_S-00001-of-00003.gguf \
--cache-type-k q4_0 \
--threads 64 \
--temp 0.6 \
--ctx-size 12288 \
--parallel 3 \
--n-gpu-layers 62
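
For context on the numbers that appear in the log further down: assuming llama-server splits --ctx-size evenly across the --parallel slots (the log's n_ctx_slot = 4096 matches this), each slot gets a 4096-token context. A back-of-envelope check:

# Assumption: per-slot context is --ctx-size divided by --parallel
# (consistent with n_ctx_slot = 4096 in the log below).
ctx_size=12288
parallel=3
echo "n_ctx_slot = $((ctx_size / parallel))"   # prints: n_ctx_slot = 4096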
Operating systems
Linux
GGML backends
BLAS
Hardware
AMD EPYC 9754 128-Core Processor, 8 × RTX 4090D 24 GB
Models
DeepSeek-R1-UD-IQ1_S
Problem description & steps to reproduce
After several rounds of conversation, the server returns the error: context shift is disabled.
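
The failing request in the log below has a 3844-token prompt in a 4096-token slot, which leaves almost no room for generation. A rough check, using only values taken from the log:

# Rough headroom check using values from the log below.
n_ctx_slot=4096   # per-slot context (from the log)
n_prompt=3844     # n_prompt_tokens (from the log)
echo "headroom = $((n_ctx_slot - n_prompt)) tokens"   # headroom = 252

Once generation consumes that headroom, n_past reaches 4095 = n_ctx_slot - 1; with context shift disabled, the slot cannot discard old tokens to make room, so the server aborts the request with this error.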
First Bad Commit
No response
Relevant log output
slot launch_slot_: id 1 | task 4640 | processing task
slot update_slots: id 1 | task 4640 | new prompt, n_ctx_slot = 4096, n_keep = 0, n_prompt_tokens = 3844
slot update_slots: id 1 | task 4640 | kv cache rm [3798, end)
slot update_slots: id 1 | task 4640 | prompt processing progress, n_past = 3844, n_tokens = 46, progress = 0.011967
slot update_slots: id 1 | task 4640 | prompt done, n_past = 3844, n_tokens = 46
slot release: id 1 | task 4640 | stop processing: n_past = 4095, truncated = 0
srv send_error: task id = 4640, error: context shift is disabled
srv update_slots: no tokens to decode
srv update_slots: all slots are idle
srv cancel_tasks: cancel task, id_task = 4640
srv update_slots: all slots are idle
srv log_server_r: request: POST /v1/chat/completions 127.0.0.1 200
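
One workaround sketch, assuming the root cause is simply that each slot's context is too small for multi-turn conversations: relaunch with a larger --ctx-size so every slot gets more room (24576 / 3 = 8192 tokens per slot). Only --ctx-size differs from the original command; this avoids hitting the per-slot limit rather than re-enabling context shift.

# Workaround sketch: raise the per-slot context; all other flags unchanged.
./llama.cpp/build/bin/llama-server \
-m /models/DeepSeek-R1-UD-IQ1_S-00001-of-00003.gguf \
--cache-type-k q4_0 \
--threads 64 \
--temp 0.6 \
--ctx-size 24576 \
--parallel 3 \
--n-gpu-layers 62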