
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs

Results: 185 lorax issues

### System Info
- ghcr.io/predibase/lorax:a8ca5cb
- Ubuntu 20.04
- GPU A10G

### Information
- [X] Docker
- [ ] The CLI directly

### Tasks
- [X] An officially supported command...

### System Info
I am using the Docker image ghcr.io/predibase/lorax:main, which is 2 days old. This is my host NVIDIA info:

```
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.14    Driver Version: 550.54.14    CUDA...
```

# What does this PR do?

Fixes # (issue)

## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks...

### System Info
We are using lorax 0.12.1 and have noticed that requests for different adapters can sometimes affect each other: one request's input could...

Bumps [torch](https://github.com/pytorch/pytorch) from 2.4.0 to 2.6.0.

Release notes sourced from torch's releases (PyTorch 2.6.0): Release Highlights, Tracked Regressions, Backwards Incompatible Changes, Deprecations, New Features, Improvements, Bug fixes, Performance, Documentation, Developers...

Labels: dependencies, python

### System Info
Spec: AWS g6e.12xlarge

Hi, I'm trying out lorax. I ran a Docker container with the image tag main (ghcr.io/predibase/lorax:main) and hit some kernel issues. Attaching logs....

### System Info
When I use `return_k_alternatives` with this generation config:

```
generation_config = {
    "max_new_tokens": 64,
    "temperature": 1,
    "do_sample": True,
    "top_p": 0.8,
    "top_k": 10,
    "return_k_alternatives": 5
}
```

it causes...
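For context on the config above, a minimal sketch of how such generation parameters are commonly wrapped into a request body for a LoRAX-style REST endpoint. The `build_generate_payload` helper and the exact body shape are assumptions for illustration, not the server's confirmed schema:

```python
# Sketch: wrap a prompt and generation parameters into one request body,
# following the common {"inputs": ..., "parameters": {...}} convention.
# The helper name and body shape are hypothetical, for illustration only.
def build_generate_payload(prompt, generation_config):
    """Return a request body carrying the prompt and a copy of the config."""
    return {"inputs": prompt, "parameters": dict(generation_config)}

generation_config = {
    "max_new_tokens": 64,
    "temperature": 1,
    "do_sample": True,
    "top_p": 0.8,
    "top_k": 10,
    "return_k_alternatives": 5,
}

payload = build_generate_payload("Hello", generation_config)
```

The dict is copied so later mutations of `generation_config` don't leak into an already-built payload.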

### System Info
**Lorax version:**

```
Name: lorax-client
Version: 0.6.3
Summary: LoRAX Python Client
Home-page: https://github.com/predibase/lorax
Author: Travis Addair
Author-email: [email protected]
License: Apache-2.0
Location: /mnt/share/ai_studio/.venv/lib/python3.11/site-packages
Requires: aiohttp, certifi, huggingface-hub, pydantic
Required-by:...
```

OpenAI has already deprecated `max_tokens` in favor of [`max_completion_tokens`](https://platform.openai.com/docs/api-reference/chat/create#chat-create-max_completion_tokens).
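The field names `max_tokens` and `max_completion_tokens` come from OpenAI's Chat Completions API; the migration helper below is a hypothetical sketch of how a compatibility layer might rename the deprecated field in an incoming request body:

```python
# Sketch: rewrite a Chat Completions request body so the deprecated
# max_tokens field becomes max_completion_tokens. The helper is
# hypothetical; only the two field names come from OpenAI's API.
def migrate_max_tokens(body):
    """Return a copy of the body with max_tokens renamed, if present."""
    new_body = dict(body)
    if "max_tokens" in new_body and "max_completion_tokens" not in new_body:
        new_body["max_completion_tokens"] = new_body.pop("max_tokens")
    return new_body

request = {"model": "my-adapter", "max_tokens": 64}
migrated = migrate_max_tokens(request)
# migrated carries max_completion_tokens=64 and no max_tokens key
```

If a caller already sends `max_completion_tokens`, the helper leaves the body untouched rather than overwriting the newer field.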