VLMEvalKit
[Benchmark] Support Video MCQ with TaskMeAnything-v1-video-random as an example
Hi,
In this PR:
- I added the TaskMeAnything-v1-video-random video benchmark, which includes 2,700 video MCQ items.
- While adding the benchmark, I found that, unlike image MCQ, video datasets don't have a video_mcq.py file, which might make it hard to add other video MCQ benchmarks. Therefore, I implemented video_mcq.py following the logic of image_mcq.py.
video_mcq.py is used the same way as image_mcq.py: convert the benchmark to a TSV file and encode each MP4 video as base64. I have provided a helper function named mp4_to_base64 in vlmeval/dataset/utils/video_mcq_utils.py for the encoding step.
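For reference, the conversion step could look roughly like the sketch below. This is only an illustration, not the exact code in the PR: the body of mp4_to_base64 is my assumption of a typical implementation, write_video_mcq_tsv is a hypothetical helper, and the column names (index, video, question, A to D, answer) are assumed by analogy with the image MCQ TSV layout.

```python
import base64
import csv


def mp4_to_base64(mp4_path):
    """Read an MP4 file and return its raw bytes as a base64 string.

    Assumed implementation; the helper in video_mcq_utils.py may differ.
    """
    with open(mp4_path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")


def write_video_mcq_tsv(rows, tsv_path):
    """Hypothetical helper: write MCQ rows to a TSV with base64 videos.

    Each row is a dict with the keys video_path, question, A, B, C, D
    and answer. The column names below are assumptions for illustration.
    """
    fields = ["index", "video", "question", "A", "B", "C", "D", "answer"]
    with open(tsv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fields, delimiter="\t")
        writer.writeheader()
        for i, row in enumerate(rows):
            record = dict(row)  # copy so the caller's dict is not modified
            record["index"] = i
            # Replace the file path with the encoded video content.
            record["video"] = mp4_to_base64(record.pop("video_path"))
            writer.writerow(record)
```

A row would then be passed as something like {"video_path": "clip.mp4", "question": "...", "A": "...", "B": "...", "C": "...", "D": "...", "answer": "A"}. Note that base64 makes the data about one third larger than the original MP4 bytes, so the TSV can get large for long videos.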
TaskMeAnything-v1-video-random serves as the example benchmark for video_mcq.py. I tested it on PaliGemma (an ImageQA model) and Video-LLaVA (a VideoQA model), and it works well with both.