LocalAI
feat: respect context and add request cancellation
Description
This PR binds token generation to the request context and, for llama.cpp, implements job cancellation.
It also replaces the loading icon with a stop icon that aborts the in-flight request.
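As a rough illustration of the pattern (not the PR's actual code; `nextToken` and `emit` are hypothetical stand-ins for the backend's prediction call and the output sink), the token loop checks the request context on every iteration:

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// generate binds the token loop to the request context so an aborted
// request stops generation promptly. nextToken and emit are hypothetical
// stand-ins for the backend's prediction call and the output sink.
func generate(ctx context.Context, nextToken func() (string, bool), emit func(string)) error {
	for {
		select {
		case <-ctx.Done():
			// Request aborted or client disconnected: stop generating.
			// For llama.cpp, this is where the running job would be
			// told to cancel.
			return ctx.Err()
		default:
		}
		tok, done := nextToken()
		if done {
			return nil
		}
		emit(tok)
	}
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	go func() { time.Sleep(10 * time.Millisecond); cancel() }() // simulate a client abort
	i := 0
	err := generate(ctx,
		func() (string, bool) { time.Sleep(time.Millisecond); i++; return fmt.Sprintf("t%d", i), false },
		func(tok string) { fmt.Println(tok) },
	)
	fmt.Println("stopped:", err)
}
```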
Notes for Reviewers
Signed commits
- [ ] Yes, I signed my commits.
Deploy Preview for localai ready!
| Name | Link |
|---|---|
| Latest commit | 4839d572931cb247fedeabcb41ccffe991bd48f3 |
| Latest deploy log | https://app.netlify.com/projects/localai/deploys/6910be954e2384000832362a |
| Deploy Preview | https://deploy-preview-7187--localai.netlify.app |
It seems we can't propagate client disconnection during non-SSE requests because of https://github.com/valyala/fasthttp/issues/468, which also affects go-fiber: https://github.com/gofiber/fiber/issues/1718
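For context, SSE responses don't hit this problem because a disconnect surfaces as a write error while streaming. A minimal sketch of that (using fasthttp's real `SetBodyStreamWriter` API; `tokens` and `cancel` are hypothetical stand-ins for the generation pipeline):

```go
import (
	"bufio"
	"context"
	"fmt"

	"github.com/gofiber/fiber/v2"
	"github.com/valyala/fasthttp"
)

// sseHandler streams tokens and cancels generation when a flush fails,
// which is how a client disconnect surfaces on an SSE response.
func sseHandler(c *fiber.Ctx, tokens <-chan string, cancel context.CancelFunc) error {
	c.Set("Content-Type", "text/event-stream")
	c.Context().SetBodyStreamWriter(fasthttp.StreamWriter(func(w *bufio.Writer) {
		for tok := range tokens {
			fmt.Fprintf(w, "data: %s\n\n", tok)
			if err := w.Flush(); err != nil {
				cancel() // write failed: client disconnected
				return
			}
		}
	}))
	return nil
}
```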
Just as a note, echo doesn't have this issue: https://github.com/labstack/echo/issues/1581
Found an ugly workaround, but it works for our case. It would be nice if fasthttp supported this natively, but for now I guess that's the only way we can tackle this.
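I can't say this is the exact workaround used here, but one common (and admittedly ugly) pattern is to poll the underlying `net.Conn` (exposed by fasthttp's `RequestCtx.Conn()`) for EOF and cancel a context when the read fails. A sketch under that assumption:

```go
import (
	"context"
	"net"
	"time"

	"github.com/gofiber/fiber/v2"
)

// watchClient returns a context that is cancelled when the client goes
// away, detected by polling the underlying net.Conn for EOF. This is a
// hypothetical sketch, not necessarily the PR's workaround. Reading the
// connection directly is what makes it ugly: any bytes the client sends
// (e.g. a pipelined request) would be swallowed, so it is only safe once
// the request body has been fully consumed.
func watchClient(c *fiber.Ctx) (context.Context, context.CancelFunc) {
	ctx, cancel := context.WithCancel(context.Background())
	conn := c.Context().Conn()
	go func() {
		defer cancel()
		buf := make([]byte, 1)
		for {
			// A short read deadline lets us re-check periodically
			// instead of blocking forever on a healthy connection.
			if err := conn.SetReadDeadline(time.Now().Add(time.Second)); err != nil {
				return
			}
			if _, err := conn.Read(buf); err != nil {
				if ne, ok := err.(net.Error); ok && ne.Timeout() {
					select {
					case <-ctx.Done():
						return // request finished normally
					default:
						continue // still connected, keep polling
					}
				}
				return // EOF or hard error: client disconnected
			}
		}
	}()
	return ctx, cancel
}
```

The handler would then pass the returned context into the generation loop and call the returned cancel func when the request completes.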