Jiaxin Shan comments

Results: 742 comments by Jiaxin Shan

Gateway returns not meaningful response when pod is running but container not ready

![Image](https://github.com/user-attachments/assets/7cc8dc92-01d2-4f74-b78c-543cf3bfb2bb) I think the problem is probably that the environment is missing the change in https://github.com/vllm-project/aibrix/pull/776, so it forwards requests to the worker pod. Note that the worker uses a different probe from the head. After...
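To illustrate the "pod is running but container not ready" distinction in the title, here is a minimal sketch that filters pods by container readiness before routing. It is not AIBrix's gateway code; it only assumes pod objects shaped like the Kubernetes pod JSON (`status.phase`, `status.containerStatuses[].ready`), and the helper names `is_pod_ready` and `routable_pods` are hypothetical.

```python
def is_pod_ready(pod: dict) -> bool:
    """Return True only if the pod is Running AND every container reports ready."""
    status = pod.get("status", {})
    if status.get("phase") != "Running":
        return False
    containers = status.get("containerStatuses", [])
    # A Running pod with no container statuses yet is treated as not ready.
    return bool(containers) and all(c.get("ready", False) for c in containers)


def routable_pods(pods: list) -> list:
    """Names of pods that are safe to forward requests to (hypothetical helper)."""
    return [p["metadata"]["name"] for p in pods if is_pod_ready(p)]


# Example data: the worker is Running, but its container has not passed its probe yet.
pods = [
    {"metadata": {"name": "head-0"},
     "status": {"phase": "Running", "containerStatuses": [{"ready": True}]}},
    {"metadata": {"name": "worker-0"},
     "status": {"phase": "Running", "containerStatuses": [{"ready": False}]}},
]
targets = routable_pods(pods)
```

With this check, `worker-0` is skipped even though its phase is `Running`, which is the case the issue title describes.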

Dynamic Multi LoRA Load \ Delete Support

@simon-mo @Yard1 Do you know the status of this PR? We are building control planes for LoRA and need this change. I'm unsure if the original author is still working...

Increase gateway request timeout from 120s to 1800s

1800s seems too high. Let's hold this change; please run more benchmarks so we can tune this number later.

StreamLoader can persist models to disk in bypass

Due to limited time, I will untrack this issue from v0.2.0.

Support DeepSeek/DeepSeek-OCR

## use logits_processors in request extra_body

```
def infer(img_base64):
    response = client.chat.completions.create(
        model=model,
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "image_url",
                        # "image_url": {"url": f"data:image/png;base64,{img_base64}"},
                        "image_url": {"url": "https://jeroen.github.io/images/testocr.png"},
                    },
                    ...
```
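The commented-out line in the snippet above passes the image as a base64 data URL instead of a remote URL. As a minimal sketch of how such a value can be built with the standard library only: the helper names `image_to_data_url` and `image_message` are hypothetical and not part of the original comment, and the message shape simply mirrors the `image_url` content part shown above.

```python
import base64


def image_to_data_url(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Return a base64 data URL for raw image bytes."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def image_message(url: str) -> dict:
    """Build one user message carrying a single image_url content part."""
    return {
        "role": "user",
        "content": [{"type": "image_url", "image_url": {"url": url}}],
    }


# Example: the 8-byte PNG signature stands in for real image data here.
png_signature = b"\x89PNG\r\n\x1a\n"
message = image_message(image_to_data_url(png_signature))
```

In practice the bytes would come from reading an image file in binary mode; the resulting message can be used wherever the remote-URL form is used.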

Support DeepSeek/DeepSeek-OCR

## specify logits_processors in engine startup command

Support DeepSeek/DeepSeek-OCR

### working version 1:

```
vllm serve deepseek-ai/DeepSeek-OCR
```

```
from openai import OpenAI
import base64
import os

os.environ["OPENAI_API_KEY"] = ""
client = OpenAI(base_url="http://localhost:8000/v1")
model = "deepseek-ai/DeepSeek-OCR"

def encode_image(image_path):
    with...
```