mistral.rs icon indicating copy to clipboard operation
mistral.rs copied to clipboard

Implement dynamic LoRA swapping

Open EricLBuehler opened this issue 9 months ago • 1 comments

Dynamic LoRA swapping, first raised in #259, enables the user to dynamically set active LoRA adapters. This can be configured per-request to enable users to add their own routing functionality.

Usage

Pre-loading

Adapters may be pre-loaded (but not activated) to remove runtime cost for loading adapters. The adapters to be pre-loaded must all share the same ordering. Therefore, a logical place to specify them is the LoRA ordering file, and should be done as such in the preload_adapters field:

{
    "order": ["..."],
    "layers": {"...": "123"},
    "base_model_id": "...",
    "preload_adapters": [{"name": "...", "adapter_model_id": "..."}] # New field here
}

Runtime APIs

APIs to dynamically activate LoRA adapters by name are exposed in the HTTP server, Rust, and Python APIs.

EricLBuehler avatar May 03 '24 11:05 EricLBuehler

Code Metrics Report
  ───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines   Blanks  Comments     Code Complexity
───────────────────────────────────────────────────────────────────────────────
Rust                        72     23863     1572       530    21761       1325
───────────────────────────────────────────────────────────────────────────────
Total                       72     23863     1572       530    21761       1325
───────────────────────────────────────────────────────────────────────────────
Estimated Cost to Develop 85,737
Estimated Schedule Effort 11.916649 months
Estimated People Required 5.112342
───────────────────────────────────────────────────────────────────────────────
Processed 793364 bytes, 0.793 megabytes (SI)
───────────────────────────────────────────────────────────────────────────────
  

github-actions[bot] avatar May 03 '24 11:05 github-actions[bot]