Ask-Anything
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
Hello! Thanks for your excellent work on the VideoChat2 model and code sharing. I have a few questions regarding the "stage3" training phase of the model and hope you could...
I put all the videos in a list and iterate over it, generating a caption for each one in turn. But after a dozen or so videos, the program hangs at the point where the caption answer is produced. Why does this happen? Many thanks!
According to the code, it seems that causal masking is also applied to the visual queries in Stages 2 and 3. Is there a reason for this implementation?
Hello, I would like to ask whether the score of each token is returned in stage 3, as in the Video LLAVA code.
Thanks for your excellent work. I am curious if there are any instructions for fine-tuning video-llava on my own dataset?
Here is my stage3 config:

```python
from configs.instruction_data import *

# ========================= data ==========================
train_corpus = "videochat2_instruction"
train_file = "${available_corpus[${train_corpus}]}"  # for lazy evaluation
# import pdb;pdb.set_trace()
# train_file = available_corpus[train_corpus]
test_file = dict()...
```
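For context on the `"${available_corpus[${train_corpus}]}"` line in that config: the string is stored as-is and only substituted later, once the referenced names exist ("lazy evaluation"). The sketch below is a hypothetical, simplified resolver illustrating that pattern, not the actual Ask-Anything config loader; the `available_corpus` contents and the `resolve` helper are made up for illustration.

```python
import re

# Illustrative data, mimicking the names used in the config snippet.
available_corpus = {
    "videochat2_instruction": ["anno/video_instruction.json", "data/videos/"],
}
train_corpus = "videochat2_instruction"

def resolve(expr, scope):
    """Hypothetical lazy-eval helper: repeatedly replace the innermost
    ${...} with the evaluated value from `scope` until none remain."""
    pattern = re.compile(r"\$\{([^${}]+)\}")  # innermost interpolation only
    while True:
        m = pattern.search(expr)
        if m is None:
            return expr
        value = eval(m.group(1), {}, scope)  # inner expr has no ${...} left
        if m.span() == (0, len(expr)):
            return value  # whole string was one interpolation; keep its type
        # Splice the value back in (repr so the outer expression stays valid).
        expr = expr[:m.start()] + repr(value) + expr[m.end():]

# "${train_corpus}" resolves first, then the outer lookup is evaluated.
train_file = resolve(
    "${available_corpus[${train_corpus}]}",
    {"available_corpus": available_corpus, "train_corpus": train_corpus},
)
```

Because the inner `${train_corpus}` is substituted before the outer `available_corpus[...]` lookup is evaluated, `train_file` ends up bound to the corpus entry itself rather than the template string.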
Your tables [here](https://github.com/OpenGVLab/Ask-Anything/blob/main/video_chat2/README.md#parrot-videochat2) explain very well how to fine-tune the model step by step. You also provide some of the checkpoints along the way, but I cannot find the final...
Hello! First of all, thank you for your great work on the videochat2 model. I have a question about the training part in stage3, particularly in line 274 of the...
Hi, I have tested the VideoChat2 model on my server and found that the test results are different from the paper. My results are listed as follows: {"Action Sequence": 66.0,...