
Document CAP_KILL requirement for Kubernetes deployments with restrictive security contexts

Open · Copilot opened this pull request 1 month ago · 1 comment

LocalAI uses signals to terminate backend child processes (llama.cpp, diffusers, whisper). When running in Kubernetes with a security context that drops all capabilities, the container cannot send those signals, so stopping a model leaks VRAM and leaves orphaned processes, with "permission denied" errors in the logs.

Changes

Added documentation to docs/content/getting-started/kubernetes.md and docs/content/installation/kubernetes.md:

  • Security Context Requirements: Explains CAP_KILL necessity and provides example Pod configuration with restrictive security settings
  • Troubleshooting: Documents symptoms, root causes, and verification steps for the permission denied error

Example Configuration

securityContext:
  allowPrivilegeEscalation: false
  capabilities:
    drop:
      - ALL
    add:
      - KILL  # Required for LocalAI to stop backend processes
  seccompProfile:
    type: RuntimeDefault

Without CAP_KILL, the process.Stop() call in LocalAI's go-processmanager library cannot deliver SIGTERM/SIGKILL to its child processes, leaving them orphaned and holding GPU memory.

Original prompt

Problem

When running LocalAI in Kubernetes with restrictive security contexts (such as dropping all capabilities or using seccomp profiles), the backend child processes cannot be properly terminated when models are stopped. This results in:

  1. VRAM not being freed when stopping models
  2. Error messages like (deleteProcess) error while deleting process error=permission denied
  3. Child processes remaining alive and holding GPU memory

This issue was identified in https://github.com/mudler/LocalAI/issues/7958

Root Cause

LocalAI uses syscall.SIGTERM and syscall.SIGKILL signals to terminate backend processes (via the go-processmanager library). When running in Kubernetes with restrictive security contexts that drop the CAP_KILL capability, the container cannot send signals to child processes.

Solution

Update the Kubernetes documentation (docs/content/getting-started/kubernetes.md and docs/content/installation/kubernetes.md) to include:

  1. A new section on Security Context Requirements explaining that LocalAI needs the CAP_KILL capability to properly manage backend processes
  2. Example deployment YAML showing the correct security context configuration
  3. Troubleshooting information for users who encounter the "permission denied" error when stopping models

Proposed Documentation Changes

Add a new section to the Kubernetes docs that includes:

Security Context Requirements

LocalAI spawns child processes to run model backends (e.g., llama.cpp, diffusers). To properly stop these processes and free resources like VRAM, LocalAI needs permission to send signals to its child processes.

If you're using restrictive security contexts, ensure the CAP_KILL capability is available:

securityContext:
  allowPrivilegeEscalation: false
  capabilities:
    drop:
      - ALL
    add:
      - KILL  # Required for LocalAI to stop backend processes
  seccompProfile:
    type: RuntimeDefault

Troubleshooting

Issue: VRAM is not freed when stopping models, and logs show error while deleting process error=permission denied

Cause: The container lacks permission to send signals to child processes. This commonly happens when:

  • All capabilities are dropped without adding back CAP_KILL
  • Using user namespacing (hostUsers: false) with certain configurations
  • Overly restrictive seccomp profiles

Solution: Add the KILL capability to your container's security context as shown above. If running in privileged mode works but the above doesn't, check your cluster's Pod Security Policies or Pod Security Standards for additional restrictions.

Files to Modify

  • docs/content/getting-started/kubernetes.md
  • docs/content/installation/kubernetes.md (if it exists separately, keep them in sync)

This pull request was created from Copilot chat.



Copilot · Jan 10 '26 18:01

Deploy Preview for localai ready!

Latest commit: a1480707df692cb17552a9ae28fd16997ebff14d
Latest deploy log: https://app.netlify.com/projects/localai/deploys/69629ef86fc4c70008df987d
Deploy Preview: https://deploy-preview-7961--localai.netlify.app

netlify[bot] · Jan 10 '26 18:01