Inquiry on CogVLM's capacity for comprehension

Open PhilipAmadasun opened this issue 11 months ago • 2 comments

If provided with an image containing people with labelled bounding boxes around their faces (labelled with name, race, sex, and dominant facial emotion), can CogVLM coherently describe the people and give information about each individual that matches the label on that individual's bounding box? For example, if someone's bounding box is labelled (John, black, man, irritated), will CogVLM be able to provide information such as: "The image contains a man called John, he is black and his dominant emotion shows irritation."? Is this something CogVLM can be prompted to do?
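To make the question concrete, here is a minimal Python sketch of how the prompt and a check on the answer might look. It does not call CogVLM at all. `FaceLabel`, `build_prompt`, and `mentions_all_fields` are hypothetical names made up for illustration, and the answer string stands in for whatever the model would return.

```python
from dataclasses import dataclass, astuple


@dataclass
class FaceLabel:
    """Text written on one bounding box (hypothetical structure)."""
    name: str
    race: str
    sex: str
    emotion: str


def build_prompt() -> str:
    """Build an instruction asking the model to read the box labels."""
    return (
        "Each face in the image has a bounding box labelled "
        "(name, race, sex, dominant emotion). For every person, "
        "describe them using the text in their label."
    )


def mentions_all_fields(answer: str, label: FaceLabel) -> bool:
    """Crude check: every label field appears verbatim in the answer."""
    text = answer.lower()
    return all(field.lower() in text for field in astuple(label))


john = FaceLabel("John", "black", "man", "irritated")
# Placeholder answer, standing in for real model output.
answer = "The image contains a man called John, he is black and looks irritated."
print(build_prompt())
print(mentions_all_fields(answer, john))
```

The check is a plain substring match, so a paraphrase such as "shows irritation" would not count as matching the label "irritated". A real evaluation would need something more forgiving.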

PhilipAmadasun, Mar 04 '24 08:03