aikit icon indicating copy to clipboard operation
aikit copied to clipboard

Add preload_models configuration option to load models into memory on startup

Open Copilot opened this issue 6 months ago • 0 comments

This PR implements a new preload_models configuration option that enables loading models into memory when the LocalAI server starts, improving response times by eliminating the initial model loading delay on first request.

Changes Made

Configuration Enhancement:

  • Added preload_models boolean field to the InferenceConfig struct
  • Updated YAML parsing to support the new configuration option

LocalAI Integration:

  • Implemented generateLocalAIConfig() function to create appropriate preload configuration for LocalAI
  • Modified container image configuration to automatically include --config-file=/config.yaml when preload is enabled
  • Generates proper preload configuration format with model IDs and filenames

Usage Example:

apiVersion: v1alpha1
debug: true
preload_models: true
models:
  - name: llama-3.2-1b-instruct
    source: https://huggingface.co/MaziyarPanahi/Llama-3.2-1B-Instruct-GGUF/resolve/main/Llama-3.2-1B-Instruct.Q4_K_M.gguf
    sha256: "e4650dd6b45ef456066b11e4927f775eef4dd1e0e8473c3c0f27dd19ee13cc4e"

Generated LocalAI Configuration: When preload_models: true is set, AIKit automatically generates:

preload_models:
  - id: llama-3.2-1b-instruct
    name: Llama-3.2-1B-Instruct.Q4_K_M.gguf
    preload: true

Benefits

  • Faster Response Times: Models are loaded into memory on startup rather than on first request
  • Improved User Experience: Eliminates the "cold start" delay when making the first inference request
  • Production Ready: Works with all supported model sources (HTTP, OCI, local files)
  • Backward Compatible: Existing configurations continue to work unchanged

Testing

  • Added comprehensive unit tests for configuration parsing and LocalAI config generation
  • Updated existing tests to cover the new functionality
  • Added documentation with examples and usage guidance
  • All tests pass and linting is clean

Fixes #613.

[!WARNING]

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • https://api.github.com/repos/mudler/LocalAI/releases/tags/v3.4.0
    • Triggering command: curl -s REDACTED (http block)
  • localai.io
    • Triggering command: curl -s REDACTED (dns block)

If you need me to access, download, or install something from one of these locations, you can either:


✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

Copilot avatar Aug 22 '25 05:08 Copilot