Extracting Unimodal Features

Open • sreebhattacharyya opened this issue 5 months ago • 1 comment

Hello! I am trying to use Qwen-VL to extract unimodal features for a given input image and its accompanying text query. How can this be done? I am aware that models like BLIP-2 expose a direct API (extract_features) for this purpose, but is there an equivalent way to do it with Qwen-VL?

sreebhattacharyya, Sep 26 '24 03:09