lorax

Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs

Results 185 lorax issues

### Feature request Recent Mistral models, including Mistral 7B v0.3 Instruct, have a consolidated.safetensors file whose weight key names differ from what LoRAX expects. Also there are keys like lm_head,...

### Feature request If LoRAX is based on Punica kernels, will it be able to support LoRA adapters for Mistral NeMo 12B, which has a vocab size > 130k? Currently...

### System Info ghcr.io/predibase/lorax:24cb494 ### Information - [X] Docker - [ ] The CLI directly ### Tasks - [X] An officially supported command - [ ] My own modifications ###...

### System Info When using Predibase serverless, I see stop words included in the stream. I assumed generation is supposed to stop and not include them ### Information - [...

### System Info We are using the streaming v1 chat completions API. After some number of requests, or a request with a large enough context, the LoRAX server fails to respond. And all...

This should prevent some nasty illegal memory access errors 1. Consolidate individual list comprehensions into a single for loop 2. Distinct code to create the lora weight pointers tensor 3....

### System Info I am trying to run qwen2-7b-instruct with AWQ quantization in a Kubernetes environment. The GPU is a single T4 (16 GB VRAM). I see that it is unable...

Previously, when loading a base model from s3: `--source s3 --model-id s3://bucket/model` The model would be downloaded to the cache path `/data/models--model`. However, when the base model is first loaded,...

# What does this PR do? 1. Re-organize the code in BatchLoraWeights.load. This function was a bit hard to understand as there were multiple list comprehensions with almost the same looping...

### Model description OpenAI Whisper model, e.g. medium.en ### Open source status - [x] The model implementation is available - [X] The model weights are available ### Provide useful links...