Video-LLaVA
Repository - transformers config mismatch
I get the following error when running your newly updated repository:
AttributeError: 'LlavaConfig' object has no attribute 'mm_use_im_start_end'. Did you mean: 'mm_use_x_start_end'?
Note that the config.json file on Hugging Face does not have 'mm_use_im_start_end': https://huggingface.co/LanguageBind/Video-LLaVA-7B/blob/main/config.json.
It is unclear whether, under your setup, you intend to use the video start/end tokens or only the video tokens.
Also note that the same function uses the now-deprecated DEFAULT_X_START_TOKEN:
https://github.com/PKU-YuanGroup/Video-LLaVA/blob/e93f4927eaa926ed8450b481fde95c994ed23d2d/videollava/eval/video/run_inference_video_qa.py#L49-L53
Best, Orr
Sorry, we fixed that. Could you try it again? We do not use the video start/end tokens.
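For anyone hitting the same AttributeError on an older checkout, here is a minimal sketch of reading the flag defensively. It is not the fix that was committed to the repository. The helper name uses_start_end_tokens is hypothetical, and a SimpleNamespace stands in for the loaded LlavaConfig so the snippet runs without the model. The two attribute names are the ones that appear in the error message above.

```python
from types import SimpleNamespace


def uses_start_end_tokens(config, names=("mm_use_x_start_end", "mm_use_im_start_end")):
    """Return the first start/end flag found on `config`, or False if none exists.

    Hypothetical helper: it only avoids the AttributeError when a config
    carries one attribute name but not the other.
    """
    for name in names:
        if hasattr(config, name):
            return bool(getattr(config, name))
    return False


# Stand-in for the real config object (assumption: the real one is a LlavaConfig
# exposing mm_use_x_start_end, as the error message suggests).
config = SimpleNamespace(mm_use_x_start_end=False)
print(uses_start_end_tokens(config))
```

With the flag off, as described in the reply above, the start/end token branch is simply skipped.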
Yeah, I made the same edit on my local repo. When do you use the _act eval inference file? Also, I am getting a module import error for videollava, which I am trying to resolve at the moment. Have you tried running instruction tuning with the current repository?
When evaluating on the ActivityNet dataset, we use the _act.py script. Run
pip install -e .
to install videollava.
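A quick way to check whether the install took effect is to ask Python whether the package can be found. This is only a sketch: the helper name is_installed is hypothetical, and it assumes the pip command is run from the repository root in the same environment that runs the scripts.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module with this name can be located."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("videollava"):
    print("videollava is importable")
else:
    # Assumption: the editable install has not been run in this environment.
    print("videollava not found; run `pip install -e .` from the repository root")
```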
Yes, I have tested the training scripts.