lorax

Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs

Results 178 lorax issues

https://platform.openai.com/docs/api-reference/batch

enhancement

### System Info `lorax-client==0.5.0` ### Information - [ ] Docker - [ ] The CLI directly ### Tasks - [ ] An officially supported command - [ ] My own...

bug

Can I deploy the service with LoRAX without using lorax-launcher to start it, and instead load the model in code? Similar to HF and vLLM, I can use the following...

documentation
enhancement

### Model description Have you considered supporting multimodal models in the near future? Thanks ### Open source status - [X] The model implementation is available - [X] The...

enhancement

Somehow PyTorch is using a 12.1 binary, overriding the 11.8 we install explicitly. Likely a dep somewhere is causing the override. It doesn't seem to be leading to instability, but...

bug

### System Info lorax 0.9.0, running with docker. ### Information - [X] Docker - [ ] The CLI directly ### Tasks - [X] An officially supported command - [ ]...

### Feature request Currently, every LoRA layer is moved from the CPU to the base model's target device, which adds about 20 ms per layer and ultimately 500 ms to 1+ s of latency...

enhancement

### System Info Mixtral model 4 A100 GPUs, each 80G ### Information - [X] Docker - [ ] The CLI directly ### Tasks - [X] An officially supported command -...

bug

### Feature request Is it possible to combine multiple LoRA adapters like you might do to combine multiple styles with Stable Diffusion? ### Motivation I think we could get higher...

question

### System Info ``` +-----------------------------------------------------------------------------------------+ | NVIDIA-SMI 550.54.15 Driver Version: 550.54.15 CUDA Version: 12.4 | |-----------------------------------------+------------------------+----------------------+ | GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC | | Fan...

enhancement