lorax
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
### System Info
- ghcr.io/predibase/lorax:a8ca5cb
- Ubuntu 20.04
- GPU A10G

### Information
- [X] Docker
- [ ] The CLI directly

### Tasks
- [X] An officially supported command...
### System Info I am using a Docker image that is 2 days old: ghcr.io/predibase/lorax:main. Host NVIDIA info: NVIDIA-SMI 550.54.14, Driver Version 550.54.14, CUDA...
# What does this PR do?
Fixes # (issue)

## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks...
### System Info We are using LoRAX 0.12.1, and we have noticed that requests for different adapters can sometimes affect each other. In particular, sometimes one request's input could...
Bumps [torch](https://github.com/pytorch/pytorch) from 2.4.0 to 2.6.0. Release notes sourced from torch's releases: PyTorch 2.6.0 Release Highlights, Tracked Regressions, Backwards Incompatible Changes, Deprecations, New Features, Improvements, Bug fixes, Performance, Documentation, Developers...
### System Info Spec: AWS g6e.12xlarge. Hi, I'm trying out LoRAX. I ran a Docker container with the image tag main (ghcr.io/predibase/lorax:main) and ran into some kernel issues. Attaching logs...
### System Info
When I use return_k_alternatives with this generation config:
```
generation_config = {
    "max_new_tokens": 64,
    "temperature": 1,
    "do_sample": True,
    "top_p": 0.8,
    "top_k": 10,
    "return_k_alternatives": 5
}
```
it causes...
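For context, the config above is typically sent to the server inside the `parameters` object of a TGI-style `/generate` request. The sketch below only assembles that JSON payload; the endpoint shape and the `adapter_id` field are assumptions based on LoRAX's TGI-derived API, not verified against this specific build.

```python
import json

def build_generate_payload(prompt, adapter_id=None):
    """Assemble a /generate request body carrying the sampling settings
    from the report above, including return_k_alternatives."""
    parameters = {
        "max_new_tokens": 64,
        "temperature": 1,
        "do_sample": True,
        "top_p": 0.8,
        "top_k": 10,
        "return_k_alternatives": 5,
    }
    if adapter_id is not None:
        # LoRAX routes the request to a specific fine-tuned adapter.
        parameters["adapter_id"] = adapter_id

    return {"inputs": prompt, "parameters": parameters}

# Build (but do not send) a request for a hypothetical adapter name.
payload = build_generate_payload("Hello", adapter_id="my-org/my-adapter")
body = json.dumps(payload)
```

Posting `body` to the server's `/generate` endpoint would then exercise the same code path as the report.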
### System Info
**Lorax version:**
Name: lorax-client
Version: 0.6.3
Summary: LoRAX Python Client
Home-page: https://github.com/predibase/lorax
Author: Travis Addair
Author-email: [email protected]
License: Apache-2.0
Location: /mnt/share/ai_studio/.venv/lib/python3.11/site-packages
Requires: aiohttp, certifi, huggingface-hub, pydantic
Required-by:...
OpenAI has already deprecated `max_tokens` in favor of [max_completion_tokens](https://platform.openai.com/docs/api-reference/chat/create#chat-create-max_completion_tokens).
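One way an OpenAI-compatible endpoint could stay backward compatible is to accept both fields and normalize the deprecated one. This is a minimal sketch of that migration, assuming plain dict payloads; the function name is hypothetical, not part of any library.

```python
def migrate_max_tokens(payload):
    """Return a copy of an OpenAI-style chat request body in which the
    deprecated max_tokens field is renamed to max_completion_tokens.
    If the caller already set max_completion_tokens, it wins and the
    stale max_tokens value is dropped."""
    out = dict(payload)
    if "max_tokens" in out:
        out.setdefault("max_completion_tokens", out["max_tokens"])
        del out["max_tokens"]
    return out

legacy = {"model": "m", "messages": [], "max_tokens": 64}
migrated = migrate_max_tokens(legacy)
```

Here `migrated` carries `max_completion_tokens: 64` and no `max_tokens` key, so the same request works against servers that have dropped the old field.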