
Feature Request: Webui: Add Continue Generation

Open wateryuen opened this issue 1 month ago • 0 comments

Prerequisites

  • [x] I am running the latest code. Mention the version if possible as well.
  • [x] I carefully followed the README.md.
  • [x] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • [x] I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

The ability to continue generation after a stop. This would allow recovering from accidental interruptions, as well as editing or rewriting hallucinated output and then continuing the reasoning. Please consider adding a "Continue" button in the WebUI that allows users to resume generation from the last stopped point.

Motivation

Editing or rewriting hallucinated output, then continuing the reasoning seamlessly, and recovering from accidental interruptions without restarting the entire prompt.

Currently, I must either manually reconstruct the prompt or re-inject the previous output to resume generation. For models that rely on multi-step chain-of-thought (CoT) or extended thinking, this approach consumes a significant number of tokens and may degrade performance (and it is very slow: my PC only generates about 3 tokens/s). A native "Continue Generation" feature would preserve context more efficiently.

Possible Implementation

No response, but I have seen that LM Studio can do this.
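As a rough illustration of how a client could resume generation today, here is a minimal sketch that appends the previously generated (possibly user-edited) partial output to the prompt and sends it to llama-server's `/completion` endpoint. The helper name, the example strings, and the exact prompt layout are hypothetical; `cache_prompt` is the server option that lets the shared prefix reuse the KV cache so only the newly appended tokens are re-evaluated:

```python
import json
import urllib.request


def build_resume_payload(conversation_text: str, partial_output: str,
                         n_predict: int = 256) -> dict:
    """Build a llama-server /completion request that resumes generation.

    The partial assistant output is appended to the prompt so the model
    continues from where it stopped, instead of regenerating everything.
    """
    return {
        "prompt": conversation_text + partial_output,
        "n_predict": n_predict,
        # Reuse the KV cache for the common prefix across requests.
        "cache_prompt": True,
    }


# Hypothetical usage against a locally running llama-server:
# payload = build_resume_payload("User: Explain KV caching.\nAssistant: ",
#                                "KV caching stores the attention")
# req = urllib.request.Request("http://localhost:8080/completion",
#                              data=json.dumps(payload).encode(),
#                              headers={"Content-Type": "application/json"})
# continuation = json.loads(urllib.request.urlopen(req).read())["content"]
```

A native WebUI button could do the same internally, which would avoid the manual copy-and-paste reconstruction described above.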

wateryuen avatar Nov 11 '25 10:11 wateryuen