worker-vllm

The RunPod worker template for serving our large language model endpoints. Powered by vLLM.

16 worker-vllm issues, sorted by recently updated

It seems there was a typo in the Dockerfile, preventing the model revision from ever being set, since the build-arg was called `$MODEL_REVISON` but the ENV instruction tried to read a non-existing...

Hello everyone, I would like to update the vLLM version to v0.4.1 in order to get access to LLAMA3, but I don't know how to modify the fork runpod/vllm-fork-for-sls-worker. Could you...

Any errors caused by the payload cause the instance to hang in an error state indefinitely. You have to manually terminate the instance or you'll rack up a hefty bill...
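
This kind of hang can usually be avoided at the handler level. Below is a minimal sketch, not the repo's actual handler: it assumes the standard RunPod serverless handler shape, uses a hypothetical `generate_with_vllm` stand-in for the real generation call, and assumes that returning a dict with an `error` field ends the job in a failed state instead of leaving the instance stuck.

```python
import runpod


def generate_with_vllm(job_input):
    # Hypothetical stand-in for the worker's real vLLM call.
    if "prompt" not in job_input:
        raise ValueError("payload is missing 'prompt'")
    return {"text": f"echo: {job_input['prompt']}"}


def handler(job):
    try:
        return generate_with_vllm(job["input"])
    except Exception as exc:
        # Report the payload error instead of letting the job hang;
        # an "error" field is assumed to mark the job as failed.
        return {"error": f"{type(exc).__name__}: {exc}"}


runpod.serverless.start({"handler": handler})
```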

I'm getting a `BadRequestError` when I try to test the vllm worker locally. I'm running my handler locally for testing, using `MODEL_NAME=/models/stablelm-3b-4e1t python3 -u /src/handler.py --rp_serve_api --rp_api_port 8000 --rp_api_host 0.0.0.0`,...
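
For reference, one way to exercise that local test server is a plain HTTP request. This is a hedged sketch: it assumes the local API exposes a `/runsync` route on the chosen port and that the worker accepts an `{"input": {"prompt": ..., "sampling_params": ...}}` payload; adjust to whatever routes your runpod SDK version actually serves.

```python
import requests

resp = requests.post(
    "http://localhost:8000/runsync",
    json={
        "input": {
            "prompt": "Hello, world",
            "sampling_params": {"max_tokens": 32},
        }
    },
    timeout=120,
)
print(resp.status_code)
print(resp.json())
```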

I am looking to record the input and output of the vLLM worker. I could put an HTTP proxy in front and capture the traffic, or modify your handler. Rather than make...
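
If modifying the handler is acceptable, a thin wrapper is usually enough. The sketch below assumes a synchronous `handler(job) -> dict` shape with a hypothetical handler body; if the real handler streams or is async, the wrapper would need to iterate over the results instead.

```python
import functools
import json
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("vllm-io")


def record_io(fn):
    @functools.wraps(fn)
    def wrapper(job):
        # Log the incoming payload, run the real handler, log the result.
        log.info("input: %s", json.dumps(job.get("input", {}), default=str))
        result = fn(job)
        log.info("output: %s", json.dumps(result, default=str))
        return result

    return wrapper


@record_io
def handler(job):
    # Placeholder for the real vLLM generation call.
    return {"text": "..."}
```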

I am getting an error saying it cannot load the tokenizer for some models, such as the Yarn Mistral/Llama-2 models. Is there any reason why?
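
One hedged guess: some of these repos ship custom tokenizer or model code, which only loads when remote code is trusted. Reproducing the load outside the worker can confirm whether that is the failure; the model id below is just an example.

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained(
    "NousResearch/Yarn-Mistral-7b-128k",
    trust_remote_code=True,  # needed when the repo ships custom code
)
print(tok("hello")["input_ids"])
```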

Thank you for this awesome repo 💯. While building a custom revision, I noticed this typo: the wrong revision (`main`) of the model gets downloaded in `download_model.py`.
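
For context, a pinned revision normally has to be threaded all the way through to the download call. This is an illustrative sketch using `huggingface_hub`, not the repo's actual `download_model.py`, and `MODEL_REVISION` is an assumed environment-variable name.

```python
import os

from huggingface_hub import snapshot_download

# Pass the requested revision through; otherwise the default "main" is used.
snapshot_download(
    repo_id=os.environ["MODEL_NAME"],
    revision=os.environ.get("MODEL_REVISION", "main"),
)
```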

Using a model that does not exist returns HTTP status 200, but the error message is only reported in the JSON body.
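
Until the status code reflects the failure, callers have to inspect the body themselves. A hedged client-side sketch, with an illustrative endpoint and payload:

```python
import requests

resp = requests.post(
    "http://localhost:8000/runsync",
    json={"input": {"prompt": "hi"}},
    timeout=120,
)
body = resp.json()
# Treat an "error" field in the JSON as a failure even if the status is 200.
if resp.status_code != 200 or "error" in body:
    raise RuntimeError(f"request failed: {body.get('error', resp.status_code)}")
```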

Any update on when this feature will be available? Thanks