Official Repository of paper VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding

Results 17 VideoGPT-plus issues

Hi team, Nice work! Can I request the intermediate descriptions for vcg-plus_112k generated by [this file](https://github.com/mbzuai-oryx/VideoGPT-plus/blob/main/annotation_pipeline/3_dense_video_description.py)? Thanks in advance!

enhancement

Hello, Thank you for sharing your excellent research and code. I am currently pretraining an image encoder using 8 A100 GPUs. The estimated time of arrival (ETA) is about 6...

Do you have a plan to release the original "Detailed Video Descriptions"?

[h264 @ 0x16543c00] Missing reference picture, default is 65562
[h264 @ 0x16543c00] mmco: unref short failure
[h264 @ 0x16543c00] mmco: unref short failure
[h264 @ 0x16543c00] Missing reference picture, default...

raise DECORDError(err_str)
decord._ffi.base.DECORDError: [05:19:05] /github/workspace/src/video/ffmpeg/threaded_decoder.cc:292: [05:19:05] /github/workspace/src/video/ffmpeg/threaded_decoder.cc:218: Check failed: avcodec_send_packet(dec_ctx_.get(), pkt.get()) >= 0 (-11 vs. 0)
Thread worker: Error sending packet.

Hi, thanks for your awesome work! I want to know why you train two separate models for the two benchmarks (VGG and MV). Why not use all the data to train a single model? Looking...

I have a question about the construction of the dataset. Does the keyframe extraction in the paper keep only one frame per scene after scene detection?
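
To make the question concrete, here is a minimal sketch of what "one frame per scene" selection could look like. It only illustrates what is being asked and is not taken from the repository's annotation pipeline. The function name `one_keyframe_per_scene`, the `(start, end)` scene format, and the choice of the middle frame are all assumptions.

```python
from typing import List, Tuple


def one_keyframe_per_scene(scenes: List[Tuple[int, int]]) -> List[int]:
    """Return one frame index per scene (hypothetical helper).

    `scenes` holds (start, end) frame indices, with `end` exclusive,
    as a scene detector might report them. The middle frame of each
    scene is picked; this choice is an assumption, not the paper's rule.
    """
    keyframes = []
    for start, end in scenes:
        if end <= start:
            continue  # skip empty or malformed scenes
        keyframes.append(start + (end - start) // 2)
    return keyframes


# Example scene boundaries (made up for illustration).
scenes = [(0, 120), (120, 300), (300, 310)]
print(one_keyframe_per_scene(scenes))  # [60, 210, 305]
```

If the pipeline instead samples several frames per scene, the answer to the question would be "no", and the selection step would return a list of indices per scene.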