aibrix
Port in annotation is not picked up by gateway pod
🚀 Feature Description and Motivation
labels:
  model.aibrix.ai/name: qwen3-8B
  model.aibrix.ai/port: "30000"
  model.aibrix.ai/engine: sglang
spec:
  nodeSelector:
    kubernetes.io/hostname: 192.168.0.6
  containers:
  - name: decode
    image: kvcache-container-image-hb2-cn-beijing.cr.volces.com/aibrix/sglang:v0.4.9.post3-cu126-nixl-v0.4.1
    command: ["sh", "-c"]
    args:
    - |
      python3 -m sglang.launch_server \
        --model-path /models/Qwen3-8B \
        --served-model-name qwen3-8B \
        --host 0.0.0.0 \
        --port 30000 \
        --disaggregation-mode decode \
        --disaggregation-transfer-backend=mooncake \
        --trust-remote-code \
        --mem-fraction-static 0.8 \
        --log-level debug
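To rule out a plain mismatch between the labels and the server flags, the two can be compared mechanically. The following is a minimal Python sketch, assuming the labels and the launch command are already available as plain values. `parse_flags` and `check_labels_match_args` are hypothetical helpers written for this issue, not part of AIBrix or SGLang, and the check says nothing about how the gateway itself discovers pods.

```python
import shlex

# Values copied from the pod spec above.
LABELS = {
    "model.aibrix.ai/name": "qwen3-8B",
    "model.aibrix.ai/port": "30000",
    "model.aibrix.ai/engine": "sglang",
}
COMMAND = (
    "python3 -m sglang.launch_server --model-path /models/Qwen3-8B "
    "--served-model-name qwen3-8B --host 0.0.0.0 --port 30000 "
    "--disaggregation-mode decode --disaggregation-transfer-backend=mooncake "
    "--trust-remote-code --mem-fraction-static 0.8 --log-level debug"
)


def parse_flags(command: str) -> dict:
    """Collect `--flag value` and `--flag=value` pairs from a launch command."""
    # Join shell line continuations before tokenizing.
    tokens = shlex.split(command.replace("\\\n", " "))
    flags = {}
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok.startswith("--"):
            if "=" in tok:
                key, value = tok.split("=", 1)
                flags[key] = value
            elif i + 1 < len(tokens) and not tokens[i + 1].startswith("--"):
                flags[tok] = tokens[i + 1]
                i += 1
            else:
                flags[tok] = ""  # flag without a value, e.g. --trust-remote-code
        i += 1
    return flags


def check_labels_match_args(labels: dict, command: str) -> list:
    """Return a list of mismatches between pod labels and server flags."""
    flags = parse_flags(command)
    problems = []
    if labels.get("model.aibrix.ai/name") != flags.get("--served-model-name"):
        problems.append("model name label differs from --served-model-name")
    if labels.get("model.aibrix.ai/port") != flags.get("--port"):
        problems.append("port label differs from --port")
    return problems
```

For the values in this report, `check_labels_match_args(LABELS, COMMAND)` returns an empty list, so the labels and the flags agree with each other.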
curl -v http://${ENDPOINT}/v1/chat/completions \
  -H "routing-strategy: pd" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-8B",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "help me write a random generator in python"}
    ],
    "temperature": 0.7
  }'
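For anyone who wants to reproduce this from a script, the same request can be built with the Python standard library. This is a sketch only: `build_chat_request` is a hypothetical helper name, and the endpoint is assumed to be the `localhost:8888` address that appears in the curl output.

```python
import json
import urllib.request


def build_chat_request(endpoint: str, model: str) -> urllib.request.Request:
    """Build the same POST request as the curl command, without sending it."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "help me write a random generator in python"},
        ],
        "temperature": 0.7,
    }
    return urllib.request.Request(
        url=f"http://{endpoint}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "routing-strategy": "pd",  # same routing header as in the curl call
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` sends the request. A 400 response like the one in this report is raised as `urllib.error.HTTPError`, whose `code`, `headers`, and `read()` expose the status, headers, and body.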
* Host localhost:8888 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
* Trying [::1]:8888...
* Connected to localhost (::1) port 8888
> POST /v1/chat/completions HTTP/1.1
> Host: localhost:8888
> User-Agent: curl/8.7.1
> Accept: */*
> routing-strategy: pd
> Content-Type: application/json
> Content-Length: 232
>
* upload completely sent off: 232 bytes
< HTTP/1.1 400 Bad Request
< x-error-no-model-backends: qwen3-8Bxxx
< content-type:
< content-length: 67
< date: Mon, 20 Oct 2025 05:38:27 GMT
< connection: close
<
* Closing connection
{"error":{"code":400,"message":"model qwen3-8B does not exist"}}
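When collecting several failing responses, it may help to condense each one into a single line. The sketch below is illustrative: `summarize_gateway_error` is a hypothetical helper, and treating `x-error-no-model-backends` as a hint about missing backends is an inference from the header name in the output above, not from AIBrix documentation.

```python
import json


def summarize_gateway_error(status: int, headers: dict, body: str) -> str:
    """Condense an error response into one line for a bug report."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    try:
        message = json.loads(body)["error"]["message"]
    except (ValueError, KeyError, TypeError):
        message = body.strip()  # fall back to the raw body if it is not the expected JSON
    hint = lowered.get("x-error-no-model-backends")
    if hint is not None:
        return f"HTTP {status}: {message} (no-model-backends header: {hint})"
    return f"HTTP {status}: {message}"
```

Applied to the response above, this surfaces both the body message for `qwen3-8B` and the header value `qwen3-8Bxxx`, which differ from each other in the pasted output.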
Use Case
routing
Proposed Solution
No response
Do you have the YAML by chance? Because this is not an issue.