VideoGPT-plus
Official repository of the paper "VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding"
How to perform zero-shot QA evaluation on datasets like MSVD-QA, MSRVTT-QA, TGIF-QA, ActivityNet-QA? Could we just follow the pipeline of Video-ChatGPT?
step1 pretrain_projector_image_encoder.sh
step2 pretrain_projector_video_encoder.sh
step3 finetune_dual_encoder.sh
step4 eval/vcgbench/inference/run_ddp_inference.sh
step5 eval/vcgbench/gpt_evaluation/vcgbench_evaluate.sh

```
#!/bin/sh
export DATASET_DIR=/mnt2/ninghuayang/data/videogpt_plus_dataset
BASE_LLM_PATH=microsoft/Phi-3-mini-4k-instruct
VISION_TOWER=OpenGVLab/InternVideo2-Stage2_1B-224p-f4
IMAGE_VISION_TOWER=openai/clip-vit-large-patch14-336
PROJECTOR_TYPE=mlp2x_gelu
#PRETRAIN_VIDEO_MLP_PATH=MBZUAI/VideoGPT-plus_Phi3-mini-4k_Pretrain/mlp2x_gelu_internvideo2/mm_projector.bin
#PRETRAIN_IMAGE_MLP_PATH=MBZUAI/VideoGPT-plus_Phi3-mini-4k_Pretrain/mlp2x_gelu_clip_l14_336px/mm_projector.bin
PRETRAIN_VIDEO_MLP_PATH=results/mlp2x_gelu_internvideo2/mm_projector.bin
PRETRAIN_IMAGE_MLP_PATH=results/mlp2x_gelu_clip_l14_336px/mm_projector.bin
OUTPUT_DIR_PATH=results/videogpt_plus_finetune
deepspeed videogpt_plus/train/train.py \
--lora_enable True --lora_r 128...
```
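The five steps named in this issue could be chained with a small driver so that a failure in one stage stops the rest. The following is only a sketch under stated assumptions: the script paths are copied verbatim from the issue and are assumed to be relative to a `root` directory you supply, and `plan_commands` and `run_pipeline` are hypothetical helper names, not part of the repository.

```python
import subprocess
from pathlib import Path

# Script names copied from the issue text, in the order given there.
# Their location relative to the repository root is an assumption.
PIPELINE_STEPS = [
    "pretrain_projector_image_encoder.sh",
    "pretrain_projector_video_encoder.sh",
    "finetune_dual_encoder.sh",
    "eval/vcgbench/inference/run_ddp_inference.sh",
    "eval/vcgbench/gpt_evaluation/vcgbench_evaluate.sh",
]


def plan_commands(root="."):
    """Return one `bash <script>` command per step, in pipeline order."""
    return [["bash", str(Path(root) / step)] for step in PIPELINE_STEPS]


def run_pipeline(root=".", dry_run=True):
    """Print each step in order and, unless dry_run is set, execute it.

    Execution stops at the first step that exits with a non-zero code.
    """
    commands = plan_commands(root)
    for number, command in enumerate(commands, start=1):
        print(f"step{number}: {' '.join(command)}")
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    # Dry run by default: only prints the planned commands.
    run_pipeline(dry_run=True)
```

Running it with `dry_run=True` only prints the planned commands, which is a cheap way to confirm the order before committing GPU time.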
Hello everyone, I have been working on replicating benchmarks related to video-class Large Language Models (LLMs), and I've noticed that most of these benchmarks rely on the GPT-assistant framework. Given...
Hello, I have a question regarding the conversation capabilities of this project: 1. Does the system support multi-turn conversations? 2. Is it possible to have a natural, ongoing dialogue while...
Hey! Thanks for your great work. Do you have any plans to provide a simple demo, i.e., one that takes a video and a question as input rather than running a benchmark?
Thank you so much for sharing this amazing work! I’m wondering where I can find the dense captions for the 112k videos mentioned in the paper.
Hello there, Thank you for your remarkable work and I am really interested in looking into it. The whole installation process works smoothly until the very last command. The “python...
Wrote up code for a simple demo for VideoGPT+ inference on a sample video
When running the script, I hit this error: ImportError: cannot import name 'Phi3Model' from 'transformers'
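One common cause of this kind of ImportError is an installed `transformers` release that does not yet export the requested class. A quick way to check before launching a long job is sketched below; `has_symbol` and `report` are hypothetical helpers written for illustration, and the sketch does not assume any particular minimum version.

```python
import importlib


def has_symbol(module_name, symbol):
    """Return True if `module_name` can be imported and exposes `symbol`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    try:
        getattr(module, symbol)
    except Exception:
        # Missing attribute, or a lazy import that failed while resolving it.
        return False
    return True


def report(module_name, symbol):
    """Build a one-line status message for the given module and symbol."""
    if has_symbol(module_name, symbol):
        return f"{module_name} exposes {symbol}"
    try:
        module = importlib.import_module(module_name)
        version = getattr(module, "__version__", "unknown")
        return f"{module_name} {version} does not expose {symbol}"
    except ImportError:
        return f"{module_name} is not installed"


if __name__ == "__main__":
    print(report("transformers", "Phi3Model"))
```

If the report says the symbol is missing, comparing the printed version against the one pinned in the repository's requirements is a reasonable next step.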
Hi, I am getting this error while training the model: You are using a model of type phi3 to instantiate a model of type VideoGPT+. This is not supported...