
Document CAP_KILL requirement for Kubernetes deployments with restrictive security contexts

Open · Copilot opened this pull request 1 month ago · 1 comment

LocalAI uses signals to terminate backend child processes (llama.cpp, diffusers, whisper). When running in Kubernetes with a security context that drops all capabilities, the container cannot send those signals, so stopping a model leaks VRAM and leaves orphaned processes, with "permission denied" errors in the logs.

Changes

Added documentation to docs/content/getting-started/kubernetes.md and docs/content/installation/kubernetes.md:

  • Security Context Requirements: Explains CAP_KILL necessity and provides example Pod configuration with restrictive security settings
  • Troubleshooting: Documents symptoms, root causes, and verification steps for the permission denied error

Example Configuration

securityContext:
  allowPrivilegeEscalation: false
  capabilities:
    drop:
      - ALL
    add:
      - KILL  # Required for LocalAI to stop backend processes
  seccompProfile:
    type: RuntimeDefault

Without CAP_KILL, the process.Stop() call in LocalAI's go-processmanager library cannot deliver SIGTERM/SIGKILL to its child processes, leaving them orphaned and holding GPU memory.

Original prompt

Problem

When running LocalAI in Kubernetes with restrictive security contexts (such as dropping all capabilities or using seccomp profiles), the backend child processes cannot be properly terminated when models are stopped. This results in:

  1. VRAM not being freed when stopping models
  2. Error messages like (deleteProcess) error while deleting process error=permission denied
  3. Child processes remaining alive and holding GPU memory

This issue was identified in https://github.com/mudler/LocalAI/issues/7958

Root Cause

LocalAI uses syscall.SIGTERM and syscall.SIGKILL signals to terminate backend processes (via the go-processmanager library). When running in Kubernetes with restrictive security contexts that drop the CAP_KILL capability, the container cannot send signals to child processes.

Solution

Update the Kubernetes documentation (docs/content/getting-started/kubernetes.md and docs/content/installation/kubernetes.md) to include:

  1. A new section on Security Context Requirements explaining that LocalAI needs the CAP_KILL capability to properly manage backend processes
  2. Example deployment YAML showing the correct security context configuration
  3. Troubleshooting information for users who encounter the "permission denied" error when stopping models

Proposed Documentation Changes

Add a new section to the Kubernetes docs that includes:

Security Context Requirements

LocalAI spawns child processes to run model backends (e.g., llama.cpp, diffusers). To properly stop these processes and free resources like VRAM, LocalAI needs permission to send signals to its child processes.

If you're using restrictive security contexts, ensure the CAP_KILL capability is available:

securityContext:
  allowPrivilegeEscalation: false
  capabilities:
    drop:
      - ALL
    add:
      - KILL  # Required for LocalAI to stop backend processes
  seccompProfile:
    type: RuntimeDefault

Troubleshooting

Issue: VRAM is not freed when stopping models, and logs show error while deleting process error=permission denied

Cause: The container lacks permission to send signals to child processes. This commonly happens when:

  • All capabilities are dropped without adding back CAP_KILL
  • Using user namespacing (hostUsers: false) with certain configurations
  • Overly restrictive seccomp profiles

Solution: Add the KILL capability to your container's security context as shown above. If running in privileged mode works but the above doesn't, check your cluster's Pod Security Policies or Pod Security Standards for additional restrictions.

Files to Modify

  • docs/content/getting-started/kubernetes.md
  • docs/content/installation/kubernetes.md (if it exists separately, keep them in sync)

This pull request was created from Copilot chat.



Copilot · Jan 10 '26 18:01

Deploy Preview for localai ready!

Latest commit: a1480707df692cb17552a9ae28fd16997ebff14d
Latest deploy log: https://app.netlify.com/projects/localai/deploys/69629ef86fc4c70008df987d
Deploy Preview: https://deploy-preview-7961--localai.netlify.app

netlify[bot] · Jan 10 '26 18:01