text-generation-inference
Enable qwen2vl video
This PR is a work in progress that explores adding support for video inputs with Qwen2-VL. Thank you @mfarre for getting this effort started.
TODOS
- [X] support video_urls
- [X] fetch video contents in router
- [X] update protobufs to support video chunks
- [X] handle padding video token inputs
- [X] tokenize video bytes
- [X] integrate video logic with vision model (update position ids)
- [x] ensure tokenization process is correct
- [x] add tests
- [x] refactor/improve
Update
start server
text-generation-launcher \
--model-id Qwen/Qwen2-VL-7B-Instruct \
--max-batch-prefill-tokens 10000 \
--max-input-tokens 10000 \
--max-total-tokens 10001
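Loading the model can take a while, so it may help to wait until the server answers before sending the first request. The following is a minimal sketch, assuming the router exposes a `/health` route and listens on `http://127.0.0.1:3000` (the address used in the request example). `is_healthy` and `wait_until_ready` are hypothetical helper names, not part of this PR, and the sketch uses the standard library's `urllib` rather than `requests` so it has no extra dependencies.

```python
import time
import urllib.request


def is_healthy(base_url, timeout=2.0):
    """Return True if GET {base_url}/health answers with HTTP 200.

    Assumption: the server exposes a /health route.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False


def wait_until_ready(base_url="http://127.0.0.1:3000", max_wait=600.0,
                     interval=5.0, probe=is_healthy, sleep=time.sleep):
    """Poll `probe` until it succeeds or `max_wait` seconds have elapsed."""
    deadline = time.monotonic() + max_wait
    while True:
        if probe(base_url):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

Calling `wait_until_ready()` after starting the launcher returns `True` once the health check passes, or `False` if the server is still unreachable after `max_wait` seconds.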
send request
import requests
import json


def chat_completion(url="http://127.0.0.1:3000", video_url=None, prompt=None):
    messages = [{
        "role": "user",
        "content": [
            {
                "type": "video_url",
                "video_url": {
                    "url": video_url
                }
            },
            {
                "type": "text",
                "text": prompt
            }
        ]
    }]
    payload = {
        "messages": messages,
        "seed": 42,
        "max_tokens": 30
    }
    response = requests.post(
        f"{url}/v1/chat/completions",
        json=payload,
        headers={"Content-Type": "application/json"}
    )
    return response.json()


video_url = "https://test-videos.co.uk/vids/bigbuckbunny/mp4/h264/360/Big_Buck_Bunny_360_10s_1MB.mp4"
result = chat_completion(
    video_url=video_url,
    prompt="Describe this video."
)
print(json.dumps(result, indent=2))
# {
# "object": "chat.completion",
# "id": "",
# "created": 1731964042,
# "model": "Qwen/Qwen2-VL-7B-Instruct",
# "system_fingerprint": "2.4.1-dev0-native",
# "choices": [
# {
# "index": 0,
# "message": {
# "role": "assistant",
# "content": "The video showcases lush green trees with vibrant shades of green and various shades of yellow and brown, as well as moss-covered stumps and piles of moss"
# },
# "logprobs": null,
# "finish_reason": "length"
# }
# ],
# "usage": {"prompt_tokens": 9593, "completion_tokens": 30, "total_tokens": 9623}
# }
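The `usage` block shows how close this 10-second clip comes to the limits passed to the launcher: 9593 prompt tokens against `--max-input-tokens 10000`. A small sketch for checking the remaining headroom from a response follows. `token_headroom` is a hypothetical helper, and its default limits are simply the values from the launch command in this description.

```python
def token_headroom(usage, max_input_tokens=10000, max_total_tokens=10001):
    """Return how many tokens remain under each launcher limit.

    `usage` is the "usage" object from a chat completion response.
    The defaults mirror the launcher flags used in this description.
    """
    return {
        "input": max_input_tokens - usage["prompt_tokens"],
        "total": max_total_tokens - usage["total_tokens"],
    }


usage = {"prompt_tokens": 9593, "completion_tokens": 30, "total_tokens": 9623}
print(token_headroom(usage))  # {'input': 407, 'total': 378}
```

A longer or higher-resolution video would presumably need larger limits, since the prompt token count here is already within a few hundred tokens of the input cap.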
It still doesn't work @drbh