
vllm does not work with image URLs

Open · ashwinb opened this issue 10 months ago · 1 comment

System Info

...

Information

  • [ ] The official example scripts
  • [ ] My own modified scripts

🐛 Describe the bug

vLLM does not work when you just pass image URLs through.

See https://github.com/meta-llama/llama-stack/blob/main/llama_stack/providers/remote/inference/vllm/vllm.py#L166

If you change that call to download=False (so the image URL is forwarded to vLLM as-is instead of being downloaded by the provider first), the request fails.
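
To isolate whether the failure is in llama-stack's request conversion or in vLLM itself, you can send the same image-URL content straight to the vLLM OpenAI-compatible endpoint, bypassing llama-stack entirely. A minimal sketch using the openai Python client (the image URL is a placeholder; the base_url and model match the docker command below; vLLM ignores the API key):

from openai import OpenAI

IMAGE_URL = "https://example.com/some-image.jpg"  # placeholder; any reachable image

# Point the client at the vLLM server started below.
client = OpenAI(base_url="http://localhost:6001/v1", api_key="dummy")

resp = client.chat.completions.create(
    model="meta-llama/Llama-3.2-11B-Vision-Instruct",
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": IMAGE_URL}},
            {"type": "text", "text": "Describe this image."},
        ],
    }],
)
print(resp.choices[0].message.content)

If this direct request also times out, the problem is vLLM's server-side image fetch rather than llama-stack's conversion.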

How to test?

# run vLLM first
docker run --rm -it -e HUGGING_FACE_HUB_TOKEN=... \
  -v /home/ashwin/.cache/huggingface:/root/.cache/huggingface \
  vllm/vllm-openai:latest \
  --trust-remote-code \
  --gpu-memory-utilization 0.75 \
  --model meta-llama/Llama-3.2-11B-Vision-Instruct --enforce-eager \
  --max-model-len 4096 --max-num-seqs 16 --port 6001

# then run the llama-stack vision inference tests against it
pytest -v -s -k vllm tests/inference/test_vision_inference.py \
  --env VLLM_URL=http://localhost:6001/v1

Error logs

In the vLLM logs, you see:

INFO:     127.0.0.1:59652 - "POST /v1/chat/completions HTTP/1.1" 200 OK
ERROR 12-16 23:56:17 serving_chat.py:162] Error in loading multi-modal data
ERROR 12-16 23:56:17 serving_chat.py:162] Traceback (most recent call last):
ERROR 12-16 23:56:17 serving_chat.py:162]   File "/usr/local/lib/python3.12/dist-packages/aiohttp/client.py", line 663, in _request
ERROR 12-16 23:56:17 serving_chat.py:162]     conn = await self._connector.connect(
ERROR 12-16 23:56:17 serving_chat.py:162]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-16 23:56:17 serving_chat.py:162]   File "/usr/local/lib/python3.12/dist-packages/aiohttp/connector.py", line 563, in connect
ERROR 12-16 23:56:17 serving_chat.py:162]     proto = await self._create_connection(req, traces, timeout)
ERROR 12-16 23:56:17 serving_chat.py:162]             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-16 23:56:17 serving_chat.py:162]   File "/usr/local/lib/python3.12/dist-packages/aiohttp/connector.py", line 1032, in _create_connection
ERROR 12-16 23:56:17 serving_chat.py:162]     _, proto = await self._create_direct_connection(req, traces, timeout)
ERROR 12-16 23:56:17 serving_chat.py:162]                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-16 23:56:17 serving_chat.py:162]   File "/usr/local/lib/python3.12/dist-packages/aiohttp/connector.py", line 1335, in _create_direct_connection
ERROR 12-16 23:56:17 serving_chat.py:162]     transp, proto = await self._wrap_create_connection(
ERROR 12-16 23:56:17 serving_chat.py:162]                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-16 23:56:17 serving_chat.py:162]   File "/usr/local/lib/python3.12/dist-packages/aiohttp/connector.py", line 1091, in _wrap_create_connection
ERROR 12-16 23:56:17 serving_chat.py:162]     sock = await aiohappyeyeballs.start_connection(
ERROR 12-16 23:56:17 serving_chat.py:162]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-16 23:56:17 serving_chat.py:162]   File "/usr/local/lib/python3.12/dist-packages/aiohappyeyeballs/impl.py", line 89, in start_connection
ERROR 12-16 23:56:17 serving_chat.py:162]     sock, _, _ = await _staggered.staggered_race(
ERROR 12-16 23:56:17 serving_chat.py:162]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-16 23:56:17 serving_chat.py:162]   File "/usr/local/lib/python3.12/dist-packages/aiohappyeyeballs/_staggered.py", line 160, in staggered_race
ERROR 12-16 23:56:17 serving_chat.py:162]     done = await _wait_one(
ERROR 12-16 23:56:17 serving_chat.py:162]            ^^^^^^^^^^^^^^^^
ERROR 12-16 23:56:17 serving_chat.py:162]   File "/usr/local/lib/python3.12/dist-packages/aiohappyeyeballs/_staggered.py", line 41, in _wait_one
ERROR 12-16 23:56:17 serving_chat.py:162]     return await wait_next
ERROR 12-16 23:56:17 serving_chat.py:162]            ^^^^^^^^^^^^^^^
ERROR 12-16 23:56:17 serving_chat.py:162] asyncio.exceptions.CancelledError
ERROR 12-16 23:56:17 serving_chat.py:162]
ERROR 12-16 23:56:17 serving_chat.py:162] The above exception was the direct cause of the following exception:
ERROR 12-16 23:56:17 serving_chat.py:162]
ERROR 12-16 23:56:17 serving_chat.py:162] Traceback (most recent call last):
ERROR 12-16 23:56:17 serving_chat.py:162]   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/serving_chat.py", line 160, in create_chat_completion
ERROR 12-16 23:56:17 serving_chat.py:162]     mm_data = await mm_data_future
ERROR 12-16 23:56:17 serving_chat.py:162]               ^^^^^^^^^^^^^^^^^^^^
ERROR 12-16 23:56:17 serving_chat.py:162]   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/chat_utils.py", line 235, in all_mm_data
ERROR 12-16 23:56:17 serving_chat.py:162]     items = await asyncio.gather(*self._items)
ERROR 12-16 23:56:17 serving_chat.py:162]             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-16 23:56:17 serving_chat.py:162]   File "/usr/local/lib/python3.12/dist-packages/vllm/multimodal/utils.py", line 140, in async_get_and_parse_image
ERROR 12-16 23:56:17 serving_chat.py:162]     image = await async_fetch_image(image_url)
ERROR 12-16 23:56:17 serving_chat.py:162]             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-16 23:56:17 serving_chat.py:162]   File "/usr/local/lib/python3.12/dist-packages/vllm/multimodal/utils.py", line 62, in async_fetch_image
ERROR 12-16 23:56:17 serving_chat.py:162]     image_raw = await global_http_connection.async_get_bytes(
ERROR 12-16 23:56:17 serving_chat.py:162]                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-16 23:56:17 serving_chat.py:162]   File "/usr/local/lib/python3.12/dist-packages/vllm/connections.py", line 92, in async_get_bytes
ERROR 12-16 23:56:17 serving_chat.py:162]     async with await self.get_async_response(url, timeout=timeout) as r:
ERROR 12-16 23:56:17 serving_chat.py:162]                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-16 23:56:17 serving_chat.py:162]   File "/usr/local/lib/python3.12/dist-packages/aiohttp/client.py", line 1359, in __aenter__
ERROR 12-16 23:56:17 serving_chat.py:162]     self._resp: _RetType = await self._coro
ERROR 12-16 23:56:17 serving_chat.py:162]                            ^^^^^^^^^^^^^^^^
ERROR 12-16 23:56:17 serving_chat.py:162]   File "/usr/local/lib/python3.12/dist-packages/aiohttp/client.py", line 579, in _request
ERROR 12-16 23:56:17 serving_chat.py:162]     with timer:
ERROR 12-16 23:56:17 serving_chat.py:162]          ^^^^^
ERROR 12-16 23:56:17 serving_chat.py:162]   File "/usr/local/lib/python3.12/dist-packages/aiohttp/helpers.py", line 749, in __exit__
ERROR 12-16 23:56:17 serving_chat.py:162]     raise asyncio.TimeoutError from exc_val
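
The traceback shows the failure happens inside vLLM's own server-side image fetch (async_fetch_image), where the aiohttp GET times out. To check whether the image is reachable at all from the vLLM container's network, here is a minimal sketch that mirrors that fetch with aiohttp (run it inside the container via docker exec; the URL is a placeholder):

import asyncio
import aiohttp

IMAGE_URL = "https://example.com/some-image.jpg"  # placeholder; use the URL from the failing test

async def main() -> None:
    # Same idea as vllm.multimodal.utils.async_fetch_image: a plain GET with
    # a client-side timeout. If this also hangs inside the container, the
    # container cannot reach the URL and no VLLM_IMAGE_FETCH_TIMEOUT value will help.
    timeout = aiohttp.ClientTimeout(total=20)
    async with aiohttp.ClientSession(timeout=timeout) as session:
        async with session.get(IMAGE_URL) as resp:
            data = await resp.read()
            print(f"status={resp.status}, fetched {len(data)} bytes")

asyncio.run(main())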

Expected behavior

This should have worked, per vLLM's documentation. I also tried setting VLLM_IMAGE_FETCH_TIMEOUT=20 when starting the vLLM server, but it did not help.
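
In the meantime, keeping download=True (the current behavior) sidesteps the server-side fetch by inlining the image. A rough sketch of that idea, downloading the image client-side and sending it as a base64 data URL (illustrative only, not the actual llama-stack helper; httpx is just one possible HTTP client):

import base64

import httpx  # assumption: any HTTP client would do here

def image_url_to_data_url(image_url: str) -> str:
    # Fetch the image ourselves and inline it, so vLLM never has to make
    # an outbound request from inside its container.
    resp = httpx.get(image_url)
    resp.raise_for_status()
    mime = resp.headers.get("content-type", "image/jpeg")
    b64 = base64.b64encode(resp.content).decode("ascii")
    return f"data:{mime};base64,{b64}"

# The resulting string can be used anywhere an image URL is accepted, e.g.:
# {"type": "image_url", "image_url": {"url": image_url_to_data_url(url)}}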

ashwinb · Dec 17 '24, 08:12