aikit
Add preload_models configuration option to load models into memory on startup
This PR implements a new preload_models configuration option that enables loading models into memory when the LocalAI server starts, improving response times by eliminating the initial model loading delay on first request.
Changes Made
Configuration Enhancement:
- Added preload_models boolean field to the InferenceConfig struct
- Updated YAML parsing to support the new configuration option
LocalAI Integration:
- Implemented generateLocalAIConfig() function to create appropriate preload configuration for LocalAI
- Modified container image configuration to automatically include --config-file=/config.yaml when preload is enabled
- Generates proper preload configuration format with model IDs and filenames
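To illustrate the container-argument change described above, here is a minimal sketch of how the flag could be added conditionally. The Go field name PreloadModels, the yaml tag, and the serverArgs helper are assumptions made for illustration; they are not taken from the PR's actual code.

```go
package main

import "fmt"

// InferenceConfig is a trimmed-down, hypothetical stand-in for the struct
// the PR extends; only the new option is modeled here.
type InferenceConfig struct {
	PreloadModels bool `yaml:"preload_models"`
}

// serverArgs (hypothetical helper) returns the extra LocalAI arguments
// implied by the config: the config-file flag only when preload is enabled.
func serverArgs(cfg InferenceConfig) []string {
	if !cfg.PreloadModels {
		return nil
	}
	return []string{"--config-file=/config.yaml"}
}

func main() {
	fmt.Println(serverArgs(InferenceConfig{PreloadModels: true}))
	fmt.Println(len(serverArgs(InferenceConfig{})))
}
```

Returning nil when the option is unset keeps existing images unchanged, which matches the backward-compatibility goal stated in this PR.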
Usage Example:
apiVersion: v1alpha1
debug: true
preload_models: true
models:
  - name: llama-3.2-1b-instruct
    source: https://huggingface.co/MaziyarPanahi/Llama-3.2-1B-Instruct-GGUF/resolve/main/Llama-3.2-1B-Instruct.Q4_K_M.gguf
    sha256: "e4650dd6b45ef456066b11e4927f775eef4dd1e0e8473c3c0f27dd19ee13cc4e"
Generated LocalAI Configuration:
When preload_models: true is set, AIKit automatically generates:
preload_models:
  - id: llama-3.2-1b-instruct
    name: Llama-3.2-1B-Instruct.Q4_K_M.gguf
    preload: true
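The mapping from the usage example to the generated configuration can be sketched roughly as follows. This is not the PR's generateLocalAIConfig() implementation: the Model type and preloadYAML function are hypothetical, the filename is assumed to be the last path segment of the source URL, and the nesting of the preload field under each list entry is also an assumption.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// Model mirrors the name/source fields from the usage example (hypothetical type).
type Model struct {
	Name   string
	Source string
}

// preloadYAML (hypothetical) renders one preload_models entry per model.
func preloadYAML(models []Model) string {
	var b strings.Builder
	b.WriteString("preload_models:\n")
	for _, m := range models {
		// Assumption: the model filename is the last path segment of the source.
		// A source URL with a query string would need real URL parsing instead.
		file := path.Base(m.Source)
		fmt.Fprintf(&b, "  - id: %s\n    name: %s\n    preload: true\n", m.Name, file)
	}
	return b.String()
}

func main() {
	fmt.Print(preloadYAML([]Model{{
		Name:   "llama-3.2-1b-instruct",
		Source: "https://huggingface.co/MaziyarPanahi/Llama-3.2-1B-Instruct-GGUF/resolve/main/Llama-3.2-1B-Instruct.Q4_K_M.gguf",
	}}))
}
```

A production version would more likely marshal typed structs with a YAML library than format strings by hand; the string builder is used here only to keep the sketch dependency-free.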
Benefits
- Faster Response Times: Models are loaded into memory on startup rather than on first request
- Improved User Experience: Eliminates the "cold start" delay when making the first inference request
- Production Ready: Works with all supported model sources (HTTP, OCI, local files)
- Backward Compatible: Existing configurations continue to work unchanged
Testing
- Added comprehensive unit tests for configuration parsing and LocalAI config generation
- Updated existing tests to cover the new functionality
- Added documentation with examples and usage guidance
- All tests pass and linting is clean
Fixes #613.