
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.

78 Ask-Anything issues

Our reproduced model's performance is 4-5 pp below the released model's on MVBench, and we suspect a flash_attn version mismatch. The released model uses flash_attn==1.0.4, but installing flash_attn==1.0.4 on our machine fails, while flash_attn==2.4.2 installs without problems. Since flash_attn==2.4.2 is a complete rewrite relative to 1.0.4, we would like to know whether upgrading flash_attn can affect model performance, and whether your team has trained and evaluated the model with flash_attn==2.4.2.

Hello authors, has the online demo for VideoChat2 been taken down? Will it be made available again?

Hello authors. After stage-3 training of videochat2 (with training data drawn entirely from the datasets listed in the paper), the trained model produces heavily repetitive sentences when describing videos. The prompt is: "Describe the following video clip in detail." A typical answer: "The video clip shows a woman wearing a black shirt and black pants standing in a dark room. She is holding...
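Repetitive generations like this are often mitigated at decode time by blocking repeated n-grams (Hugging Face `generate` exposes this as `no_repeat_ngram_size`, alongside `repetition_penalty`). A minimal sketch of the underlying check, with a hypothetical function name:

```python
def has_repeated_ngram(token_ids, n=4):
    """Return True if any n-gram of length n occurs more than once
    in token_ids. Decoders that ban repeated n-grams zero out the
    probability of any token that would complete an n-gram already
    emitted earlier in the sequence."""
    seen = set()
    for i in range(len(token_ids) - n + 1):
        ngram = tuple(token_ids[i:i + n])
        if ngram in seen:
            return True
        seen.add(ngram)
    return False
```

This only illustrates the criterion; whether the repetition here stems from decoding settings or from the stage-3 training data is exactly what the issue is asking about.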

Hi, dear all: I noticed a sequential for loop in the `videochat2_it.py` forward function: https://github.com/OpenGVLab/Ask-Anything/blob/078540aaebfbe1ad9a109020a73b0ce173b355ef/video_chat2/models/videochat2_it.py#L240-L288 Since most of the tensor computation is parallelizable, is there a good way to run this loop in parallel?
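A common way to eliminate such per-sample loops is to pad the variable-length sequences into a single batch tensor plus a boolean mask, so downstream ops run once over the whole batch. A hedged sketch in NumPy with a hypothetical helper name (not the actual `videochat2_it.py` code, which interleaves query tokens with text embeddings per sample):

```python
import numpy as np

def pad_to_batch(seqs, pad_value=0.0):
    """Stack variable-length (len_i, dim) feature arrays into one
    (batch, max_len, dim) array plus a boolean mask that marks the
    real (non-padded) positions."""
    max_len = max(s.shape[0] for s in seqs)
    dim = seqs[0].shape[1]
    batch = np.full((len(seqs), max_len, dim), pad_value, dtype=np.float32)
    mask = np.zeros((len(seqs), max_len), dtype=bool)
    for i, s in enumerate(seqs):
        batch[i, : s.shape[0]] = s
        mask[i, : s.shape[0]] = True
    return batch, mask
```

In PyTorch the equivalent is `torch.nn.utils.rnn.pad_sequence` followed by building an attention mask, so the padded positions are ignored by the model rather than contributing to the loss.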

I got the following error message when I input a 2-minute-long video with the default hyperparameter settings (beam search number = 1, temperature = 1, video segments = 8) and...

Hello authors. In VideoChat2 training, stage 2 trains the parameters of the Visual Encoder and Q-Former, so their weights change. In stage 3, is the input vit_blip_model the stage-2 model with those updated parameters, or the original vit_blip_model?

Hi, I'm currently attempting to run the video_chat2 model on a multi-GPU setup consisting of 8 Nvidia Titan Xp GPUs, each with 12 GiB of memory. I'm using the mvbench.ipynb notebook...

I am planning to fine-tune the VideoChat2 model with custom instruction data to enhance its performance on downstream tasks. I have a couple of questions regarding the pre-training data and...

Excuse me. DATA.md gives the address for downloading the video files, but it does not explain how the different video datasets should be processed. Or can you provide me...