lorax
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
https://platform.openai.com/docs/api-reference/batch
### System Info `lorax-client==0.5.0` ### Information - [ ] Docker - [ ] The CLI directly ### Tasks - [ ] An officially supported command - [ ] My own...
Can I deploy the service with LoRAX without starting it through lorax-launcher, and instead load the model directly in code? Similar to HF and vLLM, I can use the following...
### Model description Have you considered supporting multimodal models in the near future? Thanks ### Open source status - [X] The model implementation is available - [X] The...
Somehow PyTorch is using a 12.1 binary, overriding the 11.8 we install explicitly. Likely a dep somewhere is causing the override. It doesn't seem to be leading to instability, but...
### System Info lorax 0.9.0, running with docker. ### Information - [X] Docker - [ ] The CLI directly ### Tasks - [X] An officially supported command - [ ]...
### Feature request Currently, every LoRA layer is moved from the CPU to the target device of the base model, which adds an extra 20ms per layer and, in total, 500ms ~ 1+s latency...
### System Info Mixtral model, 4 A100 GPUs, each 80G ### Information - [X] Docker - [ ] The CLI directly ### Tasks - [X] An officially supported command -...
### Feature request Is it possible to combine multiple LoRA adapters like you might do to combine multiple styles with Stable Diffusion? ### Motivation I think we could get higher...
### System Info ``` +-----------------------------------------------------------------------------------------+ | NVIDIA-SMI 550.54.15 Driver Version: 550.54.15 CUDA Version: 12.4 | |-----------------------------------------+------------------------+----------------------+ | GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC | | Fan...