Support HTTPRoute for StormService
🚀 Feature Description and Motivation
```bash
curl http://localhost:8888/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-r1",
    "stream": false,
    "messages": [
      {"role": "user", "content": "1111"}
    ]
  }'
```
Gateway logs: if a routing strategy is specified, the gateway throws an error:

```
error on track request load consumption: deployment name not found on pod xllm-xpyd-roleset-j9f6r-decode-797dd47f68-0
```
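For reference, a minimal sketch of the failing case (assuming the same local endpoint and model as above, and using the `requests` library). The `routing-strategy` header is what triggers the load-tracking path that fails, presumably because the pod is owned by a StormService RoleSet rather than a Deployment:

```python
# Hedged sketch of the failing request; the endpoint, model, and header value
# are taken from the issue above and may differ in other setups.
import requests

resp = requests.post(
    "http://localhost:8888/v1/chat/completions",
    headers={
        "Content-Type": "application/json",
        # Specifying any routing strategy makes the gateway track request load,
        # which currently fails for pods not backed by a Deployment.
        "routing-strategy": "least-request",
    },
    json={
        "model": "deepseek-r1",
        "stream": False,
        "messages": [{"role": "user", "content": "1111"}],
    },
)
print(resp.status_code, resp.text)
```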
Use Case
Support StormService Routing
Proposed Solution
No response
I feel the key is still https://github.com/vllm-project/aibrix/issues/1273; we need to natively support this orchestration type.
/cc @happyandslow
/cc @omerap12 here's a related issue. HTTPRoute is not supported for stormservice yet
Thanks for pointing this out
@Jeffwan I would like to ask why it is designed this way: why does deploying a large model through a Deployment automatically create an HTTPRoute, while deploying through StormService or RayClusterFleet does not? Is there a plan to create one in the future? Thank you very much.
@xiaolin8 this is not by design; we use the AIBrix router a lot and don't rely on HTTPRoute. Yes, I remember there's an issue tracking it. If this is urgent for you, feel free to let me know and we can prioritize the feature.
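Until an HTTPRoute is created automatically for StormService, one interim option is to create it manually against the Service that fronts the StormService pods. Below is a hedged sketch using the Kubernetes Python client; the gateway reference (`aibrix-eg` in `aibrix-system`), namespace, Service name, and port are assumptions and may differ in your installation:

```python
# Hedged sketch: manually create an HTTPRoute (Gateway API v1) that routes /v1
# traffic to a StormService-backed Service. All names below are placeholders.
from kubernetes import client, config

config.load_kube_config()
api = client.CustomObjectsApi()

http_route = {
    "apiVersion": "gateway.networking.k8s.io/v1",
    "kind": "HTTPRoute",
    "metadata": {"name": "deepseek-r1-route", "namespace": "default"},
    "spec": {
        # Assumed AIBrix gateway name/namespace; adjust to your installation.
        "parentRefs": [{"name": "aibrix-eg", "namespace": "aibrix-system"}],
        "rules": [
            {
                "matches": [{"path": {"type": "PathPrefix", "value": "/v1"}}],
                # Placeholder Service name/port for the StormService pods.
                "backendRefs": [{"name": "deepseek-r1-service", "port": 8000}],
            }
        ],
    },
}

api.create_namespaced_custom_object(
    group="gateway.networking.k8s.io",
    version="v1",
    namespace="default",
    plural="httproutes",
    body=http_route,
)
```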
Yes, this is very important, because without an HTTPRoute everything relies on the routing-strategy header, which does not comply with the OpenAI interface specification and makes many LLM clients, such as ChatBox, unusable. Thank you very much.
@xiaolin8 technically, the application can append the default headers. I got your point. In a future release we will make this the default option, so you get the router benefits without additional changes; in that case your application won't need a single line of change:
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://${ENDPOINT}/v1",
    api_key="OPENAI_API_KEY",
    default_headers={"routing-strategy": "least-request"},
)
```
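With that client, every request carries the `routing-strategy` header automatically, so individual calls remain plain OpenAI API usage. A short usage sketch, reusing the model name from the issue:

```python
# No per-call changes needed: the default header is attached to every request.
completion = client.chat.completions.create(
    model="deepseek-r1",
    messages=[{"role": "user", "content": "1111"}],
    stream=False,
)
print(completion.choices[0].message.content)
```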