NIXL error: Potentially invalid KV blocks for unrecognized request

Open dczhu opened this issue 1 month ago • 0 comments

🐛 Describe the bug

During the benchmark, the prefill workers log the following error:

ERROR 11-19 10:12:21 [distributed/.../v1/nixl_connector.py:1068] Potentially invalid KV blocks for unrecognized request chatcmpl-129a69c8-cd1a-40e0-aa3e-94662fd8cc2b were retrieved by a decode worker. They may have expired.

The engine continues to work after the error is logged.
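
To gauge how many requests are affected, the prefill logs can be filtered for this message. The following is only a minimal sketch: it assumes the prefill pod logs were saved to a local file beforehand, and the function name `affected_request_ids` and the file name `prefill.log` are hypothetical, not part of AIBrix or vLLM.

```shell
# Hypothetical helper (not part of AIBrix or vLLM): print the distinct
# request IDs named in the "Potentially invalid KV blocks" message.
# $1 is an assumed path to prefill-pod logs saved to a local file.
affected_request_ids() {
  # Keep only the lines carrying the NIXL connector message,
  # cut out the request ID, and de-duplicate.
  grep -F 'Potentially invalid KV blocks for unrecognized request' "$1" \
    | sed -E 's/.*unrecognized request ([^ ]+) were retrieved.*/\1/' \
    | sort -u
}
```

For example, `affected_request_ids prefill.log | wc -l` gives the number of distinct request IDs that triggered the message, which can be compared against the 200 prompts sent by the benchmark.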

Steps to Reproduce

  1. Deploy vLLM in a 2P2D configuration (2 prefill pods, 2 decode pods) with tensor parallelism 2 (TP2) for Qwen3-32B.
  2. Allocate 1 RDMA NIC per pod.
  3. Schedule the prefill pods and the decode pods onto separate nodes.
  4. Run the benchmark:
TOKENIZER="/data01/models/Qwen3-32B"
MODEL="qwen3-32b"
HOST="${LB_EXTERNAL_IP}"
PORT="80"
REQ_RATE=1.0
INPUT_LEN=8000
OUTPUT_LEN=200

python3 benchmark_serving.py --port $PORT --host $HOST --seed $(date +%s) \
      --model $MODEL \
      --tokenizer $TOKENIZER \
      --dataset-name random --random-input-len ${INPUT_LEN} --random-output-len ${OUTPUT_LEN} \
      --num-prompts 200 --burstiness 100 --request-rate ${REQ_RATE} --metric-percentiles 95 \
      --backend openai-chat --endpoint /v1/chat/completions --routing-strategy "pd" --ignore-eos

Expected behavior

No error reported during benchmarking.

Environment

  • AIBrix 0.5.0
  • VKE

dczhu, Nov 19 '25 21:11