lorax

Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs

Results 178 lorax issues

### Feature request Use something like streamlit to run a UI that can be used to query the deployments ### Motivation Would be a fun addition and allows people to...

enhancement

### System Info I have used the following [guide](https://medium.com/@joaopcmoura/lora-serving-on-amazon-sagemaker-serve-100s-of-fine-tuned-llms-for-the-price-of-1-85034ef889c5) to deploy lorax to sagemaker. I am able to do so successfully using the unquantized models. Have deployed OpenHermes 2.5 successfully....

Using `--gpus all` for docker run also requires `--sharded` or `--gpus N` to be set for LoRAX, but this isn't made clear. We should add something in the docs about...

documentation

### System Info predibase ### Information - [ ] Docker - [ ] The CLI directly ### Tasks - [ ] An officially supported command - [ ] My own...

### System Info I run your docker image in 2 cases: - single gpu (`--sharded false`) - multi-gpu (`--sharded false --num_shard 4`) => When I run single-gpu, the total time...

question

See https://flashinfer.ai/2024/01/08/cascade-inference.html

enhancement

### System Info I've run into 2 unexpected issues/inconsistencies when downloading adapters from S3. Issue 1: With `PREDIBASE_ADAPTERS_BUCKET=sagemaker-us-east-1-000000000000` Several prefixes with naming `lorax/mistral-adapters/{id}`, with id being an integer from 1...

bug

### System Info Latest Lorax version ### Information - [X] Docker - [ ] The CLI directly ### Tasks - [X] An officially supported command - [ ] My own...

question

### Model description Hi, my company has trained a 7B model, and we want to deploy LoRAX with it. Can you outline the key steps needed to support a model in LoRAX?...

question

Added a list of the exported metrics to the readme. Further info would be nice to add to the table - such as the metric type and the description.