Could you provide evaluation codes for NExT-QA, STAR and TVQA on video_chat2?

Open CUCldyyyyy opened this issue 1 year ago • 10 comments

Hey, your work is really impressive! Could you provide the evaluation code for NExT-QA, STAR and TVQA? It seems some changes must be made to the original mvbench.ipynb file. I'd appreciate it if you could. Thanks again!

Dec 22 '23 09:12 CUCldyyyyy

For NExT-QA, STAR and TVQA, we simply adapt the code in mvbench.ipynb. You need to prepare the corresponding dataset and use the same testing prompt.

Dec 24 '23 13:12 Andy1621

You can follow SeViLA to prepare the dataset, and change the code to load the JSON.

Dec 24 '23 13:12 Andy1621
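
For readers adapting the notebook, here is a minimal sketch of loading a SeViLA-style JSON file and building the multiple-choice prompt seen later in this thread. The field names (`question`, `a0` to `a4`, `answer`) and both helper functions are assumptions for illustration, not the maintainers' code; check them against the JSON you actually prepared. The single-space separators follow the flattened log in this thread and may differ from the notebook's own whitespace.

```python
import json

LETTERS = "ABCDE"


def load_samples(json_path):
    """Load a list of QA samples from a JSON file (layout is assumed)."""
    with open(json_path, "r", encoding="utf-8") as f:
        return json.load(f)


def build_prompt(sample, num_options=5):
    """Turn one sample into (prompt, ground-truth string).

    Assumes hypothetical keys: 'question', 'a0'..'a4', and an integer 'answer'.
    """
    options = [sample[f"a{i}"] for i in range(num_options)]
    option_text = " ".join(
        f"({LETTERS[i]}) {opt}." for i, opt in enumerate(options)
    )
    prompt = (
        f"Question: {sample['question']} "
        f"Options: {option_text} Only give the best option."
    )
    answer_idx = int(sample["answer"])
    ground_truth = f"({LETTERS[answer_idx]}) {options[answer_idx]}."
    return prompt, ground_truth
```

The returned ground-truth string has the same `(B) ...` shape as the `GT:` line in the log below, so it can be compared against the model output in the same way the multiple-choice notebook does.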

I see, thanks for your reply.

Dec 26 '23 14:12 CUCldyyyyy

Hey! I ran into a problem with my inference code. Could you tell me the possible reason?

Question: why does the owl fly back to the man in green and land on the arm of the lady in white? Options: (A) defend itself. (B) green man instructed owl. (C) greeting lady. (D) escape man. (E) find food. Only give the best option. ###Assistant: Best option:( (bolds君 — †</s> GT: (B) green man instructed owl. Part Acc: 0.00% Total Acc: 0.00%

Dec 28 '23 03:12 CUCldyyyyy

The text generated after "Best option" is garbled, and the error is: IndexError: piece id is out of range.

Dec 28 '23 03:12 CUCldyyyyy
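
To see why garbled output like the above scores 0.00%, here is a rough sketch of option-letter scoring. It is an assumption about how such a check could work, not the notebook's actual function: `extract_option` and `accuracy` are hypothetical helpers. Since the answer prompt in the log ends with `Best option:(`, the model output may start either with `B)` or with `(B)`, so the pattern accepts both.

```python
import re


def extract_option(text):
    """Return the leading option letter (A-E), or None if there is none."""
    match = re.match(r"\s*\(?([A-E])\)", text)
    return match.group(1) if match else None


def accuracy(preds, gts):
    """Percentage of predictions whose option letter matches the ground truth."""
    if not gts:
        return 0.0
    correct = 0
    for pred, gt in zip(preds, gts):
        pred_letter = extract_option(pred)
        # Unparseable predictions (e.g. garbled tokens) count as wrong.
        if pred_letter is not None and pred_letter == extract_option(gt):
            correct += 1
    return 100.0 * correct / len(gts)
```

With this kind of check, a prediction that does not begin with a valid option letter can never match the ground truth, which is consistent with the accuracy shown in the log.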

You might be using the wrong version of Vicuna-v0. Please check https://github.com/OpenGVLab/Ask-Anything/issues/81

Jan 02 '24 07:01 Andy1621

Thank you for your response! I have resolved the issue. Additionally, I noticed that the suffixes in the fine-tuning weights for the three released stages include '7b'. When I directly use 'vicuna-13b-v0', I encounter a dimension mismatch error during the model weight loading (4096 vs. 5120). How can this be resolved? Do I need to modify the source code dimensions, or is it due to the absence of a 13b version in the currently released fine-tuning weights?

Jan 02 '24 10:01 CUCldyyyyy

Yes! The model needs to be retrained with the new LLM. Currently, we have not released a 13B model because its improvement is marginal.

Jan 03 '24 03:01 Andy1621
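
For context, 4096 and 5120 are the hidden sizes of the 7B and 13B LLaMA-family models, so weights trained against the 7B LLM cannot be loaded into a 13B one. A small sketch for listing which parameters disagree before loading is given here; `find_shape_mismatches` and the parameter names in the usage note are hypothetical, and the function works on plain name-to-shape dictionaries rather than on any particular framework object.

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Compare two {parameter_name: shape} mappings.

    Returns {name: (checkpoint_shape, model_shape)} for every parameter
    that exists in both mappings but with different shapes.
    """
    mismatches = {}
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and tuple(model_shape) != tuple(ckpt_shape):
            mismatches[name] = (tuple(ckpt_shape), tuple(model_shape))
    return mismatches
```

In PyTorch, the two mappings could be built with something like `{k: tuple(v.shape) for k, v in state_dict.items()}` for both the checkpoint and the instantiated model. A projection layer reported as `(4096, 768)` versus `(5120, 768)` would point to the 7B/13B mismatch described above.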

Hey! I need to run inference on MSVD-QA and am wondering how to change the original mvbench.ipynb file to fit this open-ended dataset. The notebook seems to be designed for multiple-choice tasks, and the prompt needs to be revised for the open-ended task. Could you provide a way to reproduce the results on MSVD-QA? Thanks a lot!

Jan 11 '24 13:01 CUCldyyyyy

Please check the code in Video_ChatGPT.

Jan 12 '24 15:01 Andy1621
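
As a quick local sanity check before setting up the Video_ChatGPT evaluation, an open-ended prompt plus a loose string match could look like the sketch below. This is a simplified stand-in and not the Video_ChatGPT protocol, so its numbers are not comparable to reported results. The prompt wording and all three helper names are assumptions.

```python
import re
import string


def build_open_ended_prompt(question):
    """Hypothetical open-ended prompt: no options list, short free-form answer."""
    return f"Question: {question} Answer the question with a short phrase."


def normalize(text):
    """Lowercase, strip punctuation, and collapse whitespace."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text).strip()


def loose_match(pred, gt):
    """True if the normalized ground truth appears as whole words in the prediction."""
    gt_norm = normalize(gt)
    if not gt_norm:
        return False
    pattern = rf"\b{re.escape(gt_norm)}\b"
    return re.search(pattern, normalize(pred)) is not None
```

A loose match like this misses synonyms and paraphrases, so it is only useful for spotting obviously broken inference runs.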
