Video-LLaMA
Is Video-LLaMA capable of comprehending videos that have faces surrounded by bounding boxes (face recognition)?
Is Video-LLaMA capable of comprehending videos that have faces surrounded by bounding boxes (face recognition)?
If I ask Video-LLaMA to describe what each person in a video is doing and to identify each person by the name shown on the bounding box around their face, will it be able to do so?