MM-VID
MM-Vid: Advancing Video Understanding with GPT-4V(ision)
This repository contains the open source implementation of the paper "MM-Vid: Advancing Video Understanding with GPT-4V(ision)".
Overview
This project aims to advance video understanding by leveraging GPT-4V(ision). The implementation follows the methodology and experiments described in the paper and provides a framework covering four stages: scene detection, video clipping, speech recognition, and generation of a coherent video description.
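To make the flow between these stages concrete, here is a minimal sketch of how scene boundaries and speech-recognition output could be combined into one text prompt per clip. It is an illustration only, not the repository's actual code: the names `Clip`, `Utterance`, `clips_from_boundaries`, `speech_for_clip`, and `build_clip_prompts` are hypothetical, the timestamps are made up, and the scene detector, speech recognizer, and GPT-4V call are assumed to exist elsewhere.

```python
from dataclasses import dataclass


@dataclass
class Clip:
    start: float  # clip start time in seconds
    end: float    # clip end time in seconds


@dataclass
class Utterance:
    start: float  # utterance start time in seconds
    end: float    # utterance end time in seconds
    text: str     # recognized speech


def clips_from_boundaries(boundaries, duration):
    """Turn scene-cut timestamps into consecutive clips covering [0, duration]."""
    cuts = [b for b in sorted(boundaries) if 0.0 < b < duration]
    edges = [0.0] + cuts + [duration]
    return [Clip(s, e) for s, e in zip(edges, edges[1:])]


def speech_for_clip(clip, utterances):
    """Join the utterances whose midpoint falls inside the clip."""
    parts = [
        u.text
        for u in utterances
        if clip.start <= (u.start + u.end) / 2 < clip.end
    ]
    return " ".join(parts)


def build_clip_prompts(boundaries, duration, utterances):
    """Build one text prompt per clip (hypothetical prompt wording)."""
    prompts = []
    for i, clip in enumerate(clips_from_boundaries(boundaries, duration)):
        speech = speech_for_clip(clip, utterances) or "(no speech)"
        prompts.append(
            f"Clip {i} [{clip.start:.1f}s-{clip.end:.1f}s]. "
            f"Transcript: {speech}. Describe what happens in this clip."
        )
    return prompts


# Example with made-up scene cuts and transcript segments.
example_utterances = [
    Utterance(0.5, 2.0, "Hello"),
    Utterance(5.0, 6.0, "world"),
]
for prompt in build_clip_prompts([4.0, 9.5], 12.0, example_utterances):
    print(prompt)
```

In a full pipeline, each prompt would be sent to the vision model together with frames sampled from the matching clip, and the per-clip answers would then be merged into a single description of the video.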
Installation
Clone the repository and install the required dependencies:
git clone https://github.com/yongliang-wu/MM-VID.git
cd MM-VID
pip install -r requirements.txt
Then run the code:
python main.py
TODO
Supplying external information as input is not yet supported.