
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.

78 Ask-Anything issues

![image](https://github.com/OpenGVLab/Ask-Anything/assets/53007066/02d792ca-a6fb-44c7-a10b-8593cca85b12)

Hi, I cannot visit this page: https://github.com/OpenGVLab/Ask-Anything/blob/main/video_chat/video_chat2/MVBench.md. Thank you for your support and fantastic work!

The old deepcoda link is dead. The official LLaMA2 HF repo is working: https://huggingface.co/meta-llama/Llama-2-13b-hf/tree/main?clone=true

Error log:

```
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/peft/peft_model.py", line 288, in __getattr__
    return super().__getattr__(name)  # defer to nn.Module's logic
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1265, in __getattr__
    raise AttributeError("'{}' object has...
```

Hi, I followed the instructions to download all the weights for running inference with VideoChat. However, I see the following errors:

```
Load VideoChat from: /home/ytang/workspace/modules/Ask-Anything/video_chat/model/videochat_7b.pth
_IncompatibleKeys(missing_keys=['query_tokens', 'visual_encoder.cls_token', 'visual_encoder.pos_embed', 'visual_encoder.patch_embed.proj.weight', 'visual_encoder.patch_embed.proj.bias', ...
```
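For background, `_IncompatibleKeys` is just the named tuple PyTorch returns from `load_state_dict(strict=False)`, listing the keys the checkpoint did not provide. A minimal self-contained sketch (the model here is a stand-in, not VideoChat):

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 4)                # stand-in for the real model
ckpt = {"weight": torch.zeros(4, 4)}   # checkpoint that lacks the bias key
msg = model.load_state_dict(ckpt, strict=False)
print(msg.missing_keys)                # -> ['bias']
```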

Hello, could you please release the stage-3 checkpoint for zero-shot NextQA, which the paper obtains by performing stage-3 instruction tuning without the NextQA dataset?

Can I train with freeze_mhra=True (in config_7b_stage1.py)? In other words, can I completely freeze the visual encoder and still train a model that works? Thanks. A sketch of what such a flag typically controls follows below.
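For reference, a hypothetical sketch of what a flag like `freeze_mhra=True` typically gates (not the repo's actual code): setting `requires_grad=False` on the visual-encoder parameters so they receive no gradient updates.

```python
import torch.nn as nn

def freeze_module(module: nn.Module) -> None:
    """Freeze all parameters of a submodule (hypothetical helper)."""
    for p in module.parameters():
        p.requires_grad = False   # exclude from gradient updates
    module.eval()                 # also fix dropout / norm statistics

# usage (attribute name is an assumption):
# freeze_module(model.visual_encoder)
```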

Unbeatable... :-( It's everywhere, I'm laughing and crying.

Firstly, thanks for your interesting work. For MiniGPT-4, can it be realized directly using video embeddings? Something like:

```python
query_tokens = self.query_tokens.expand(image_embeds.shape[0], -1, -1)
query_output = self.Qformer.bert(
    query_embeds=query_tokens,
    encoder_hidden_states=image_embeds,
    encoder_attention_mask=image_atts,
    ...
```
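If helpful, here is a hedged sketch (shapes and names are assumptions, not the repo's API) of how per-frame patch embeddings could be flattened into one token sequence before a BLIP-2-style Q-Former cross-attends over them, extending the image snippet above to video:

```python
import torch

B, T, N, C = 2, 8, 256, 768                 # batch, frames, patches, hidden (assumed)
frame_embeds = torch.randn(B, T, N, C)      # per-frame ViT features
video_embeds = frame_embeds.reshape(B, T * N, C)  # flatten time into tokens
video_atts = torch.ones(video_embeds.shape[:-1], dtype=torch.long)
# then, exactly as in the image case above:
#   query_tokens = self.query_tokens.expand(video_embeds.shape[0], -1, -1)
#   query_output = self.Qformer.bert(
#       query_embeds=query_tokens,
#       encoder_hidden_states=video_embeds,
#       encoder_attention_mask=video_atts,
#       return_dict=True,
#   )
```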

Hi there! First of all, let me say: this is cutting-edge stuff, amazing. I wanted to ask, how can we do this on live video? And what should the expected...
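One possible starting point, assuming OpenCV for capture (the sampling rate and clip length below are arbitrary assumptions): grab frames from a webcam, keep every k-th one, and hand the resulting clip to the model.

```python
import cv2

cap = cv2.VideoCapture(0)        # default camera
frames, k, i = [], 8, 0          # keep every 8th frame (assumption)
while len(frames) < 16:          # collect a 16-frame clip
    ok, frame = cap.read()
    if not ok:
        break
    if i % k == 0:
        frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    i += 1
cap.release()
# frames can now be resized/normalized and passed to the video model
```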

enhancement

Sadly, I cannot get StableLM to work on a 1070 with 8 GB VRAM and 36 GB RAM. It was sad to compile everything on Windows just to see it crash, but hey. Here's a...

enhancement