mistral.rs
mistral.rs copied to clipboard
Implement dynamic LoRA swapping
Dynamic LoRA swapping, first raised in #259, enables the user to dynamically set active LoRA adapters. This can be configured per-request to enable users to add their own routing functionality.
Usage
Pre-loading
Adapters may be pre-loaded (but not activated) to remove runtime cost for loading adapters. The adapters to be pre-loaded must all share the same ordering. Therefore, a logical place to specify them is the LoRA ordering file, and should be done as such in the preload_adapters
field:
{
"order": ["..."],
"layers": {"...": "123"},
"base_model_id": "...",
"preload_adapters": [{"name": "...", "adapter_model_id": "..."}] # New field here
}
Runtime APIs
APIs to dynamically activate LoRA adapters by name are exposed in the HTTP server, Rust, and Python APIs.
Code Metrics Report
─────────────────────────────────────────────────────────────────────────────── Language Files Lines Blanks Comments Code Complexity ─────────────────────────────────────────────────────────────────────────────── Rust 72 23863 1572 530 21761 1325 ─────────────────────────────────────────────────────────────────────────────── Total 72 23863 1572 530 21761 1325 ─────────────────────────────────────────────────────────────────────────────── Estimated Cost to Develop 85,737 Estimated Schedule Effort 11.916649 months Estimated People Required 5.112342 ─────────────────────────────────────────────────────────────────────────────── Processed 793364 bytes, 0.793 megabytes (SI) ───────────────────────────────────────────────────────────────────────────────