
Prompt token count (3784) exceeds batch capacity (2048)

Open · hg0428 opened this issue 1 month ago · 1 comment

[ERROR:flutter/runtime/dart_vm_initializer.cc(40)] Unhandled Exception: Generation error: LlamaException: Prompt token count (3784) exceeds batch capacity (2048)

Why is it doing this? The context size is sufficiently high. The batch size should just determine how many tokens go into each batch; if the prompt has more tokens, it should be processed across multiple batches. The llama.cpp CLI usually handles this fine, right? Does this mean we need to set the batch size to the maximum context length just to get it to work? What's going on?
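The behavior the question describes (feeding a long prompt through decoding in batch-sized chunks) can be sketched as follows. This is a minimal, illustrative Python sketch of the chunking arithmetic only; it is not the llama_cpp_dart or llama.cpp API, and the function name is hypothetical.

```python
def split_prompt_into_batches(n_tokens: int, n_batch: int) -> list[int]:
    """Split a prompt of n_tokens into chunks of at most n_batch tokens.

    Illustrative only: this mirrors the idea that a prompt longer than the
    batch capacity could be decoded in several passes instead of failing.
    """
    if n_batch <= 0:
        raise ValueError("n_batch must be positive")
    return [min(n_batch, n_tokens - start)
            for start in range(0, n_tokens, n_batch)]

# The numbers from the error above: a 3784-token prompt, 2048-token batches.
print(split_prompt_into_batches(3784, 2048))  # [2048, 1736]
```

With these numbers, two decode passes (2048 + 1736 tokens) would cover the whole prompt, which is why the reporter expects a batch size smaller than the prompt to still work.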

hg0428 avatar Nov 30 '25 18:11 hg0428

Yes, that's correct. Please look at the examples folder.

netdur avatar Nov 30 '25 18:11 netdur

> yes that correct, please look at examples folder

I believe you should be able to set the context size higher than the batch capacity and still utilize the full context, right? If not, please explain.

hg0428 avatar Dec 04 '25 16:12 hg0428