lorax issues

Important: In latest main, the server can not serve more than 1 user

3

### System Info This error occurs when more than one user sends requests to the server (the previous image version still works fine) ``` ID not found...

prd-tuong-nguyen

Adding longrope for serve Phi-3

# What does this PR do? Fixes #550 Reference: https://github.com/huggingface/text-generation-inference/pull/2179/files ## Before submitting - [ ] This PR fixes a typo or improves the docs (you can dismiss the other...

huytuong010101

Cannot start Phi-3-mini-128k-instruct from Docker

4

### System Info Using two official Docker images (latest and main). ### Information - [X] Docker - [ ] The CLI directly ### Tasks - [X] An officially supported command...

annadmitrieva

Updated the documentation about status code

1

# What does this PR do? Fixes # (issue) ## Before submitting - [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks...

Jatintalreja0510

otel fixups

Trying to figure out why we can't see the traces from requests being used internally

noah-yoshida

Add Support for AutoModelForSequenceClassification Models

1

### Feature request LoRAX currently only supports text generation models (e.g., causal language models). It lacks support for sequence classification models like `AutoModelForSequenceClassification`. This issue proposes adding support for `AutoModelForSequenceClassification`...

akkky02

Fix `--compile`

Using `--compile` option on the main branch is currently broken. I've fixed the first issue, but this just leads to the next issue, which we need to debug, and changes...

ajtejankar

feat: Function calling with output schema enforcement

6

Introduces function-calling capabilities with JSON schema enforcement on output. Example using `Mistral-7B-Instruct-v0.3`: ``` curl 127.0.0.1:8080/generate -X POST -d '{ "inputs": "WHat is the current temperature of New York, San Francisco...

jeffreyftang

Passing a `--revision` causes failure in loading tokenizer config

### System Info ```shell docker run ghcr.io/predibase/lorax:ea5d74b --gpus all -it --rm -e HF_TOKEN=... lorax-launcher --source hub --model-id meta-llama/Llama-2-7b-chat-hf --default-adapter-source local --revision f5db02db724555f92da89c216ac04704f23d4590 ``` We notice ``` 2024-08-01T13:25:45.718148Z INFO hf_hub: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/hf-hub-0.3.2/src/lib.rs:55:...

chiragjn

LORAX_USE_GLOBAL_HF_TOKEN to be applied correctly even though request doesn't have api_token

# What does this PR do? Fixes #541 (issue) ## Before submitting - [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks...

monologg
