
docs: Offload/stop backend API

Open t3hk0d3 opened this issue 9 months ago • 8 comments

Is your feature request related to a problem? Please describe.
GPU resources are limited. Once a model is loaded into GPU memory, you can't use any other large model until the previous backend is stopped. Currently, if a backend is loaded and running, it is impossible to unload it through the API; I have to `docker exec` into the container and kill the backend process manually.

Describe the solution you'd like
Create endpoints to list running backends and to unload a specific backend or all backends:

GET /v1/backends - list running backends
DELETE /v1/backends/<backend_id> - unload a specific backend
DELETE /v1/backends - unload ALL backends
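The semantics of the proposed endpoints could be sketched with an in-memory registry like the one below. This is purely illustrative: `BackendRegistry` and its methods are hypothetical names, not part of LocalAI, and a real implementation would terminate the backend process and free GPU memory rather than just drop a map entry.

```python
# Hypothetical sketch of the proposed endpoint semantics.
# Not LocalAI code; process handles are stubbed as plain values.
class BackendRegistry:
    def __init__(self):
        self._backends = {}  # backend_id -> process handle (stub)

    def load(self, backend_id, handle):
        """Register a running backend."""
        self._backends[backend_id] = handle

    def list(self):
        """GET /v1/backends - list running backend ids."""
        return sorted(self._backends)

    def unload(self, backend_id):
        """DELETE /v1/backends/<backend_id> - unload one backend.

        Returns False if the backend is unknown (would map to HTTP 404).
        """
        return self._backends.pop(backend_id, None) is not None

    def unload_all(self):
        """DELETE /v1/backends - unload every backend; returns the count."""
        count = len(self._backends)
        self._backends.clear()
        return count
```

The boolean return from `unload` distinguishes "unloaded" from "no such backend", which keeps the DELETE endpoints idempotent from the client's point of view.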

t3hk0d3 · May 22 '24 10:05