Ask-Anything
Can you provide scripts for generating question & candidates using ChatGPT?
Fine-grained Pose (NTU RGBD), Scene Transition (MovieNet), Unexpected Action (FunQA), Egocentric Navigation (VLN-CE): the datasets for these tasks don't have QA annotations, and it seems that you generated the annotations yourselves with the aid of ChatGPT. Could you provide the related scripts you used for this?
Good question!
- Fine-grained Pose: We manually combine similar poses and randomly generate the candidates, like `drop(5), pick up(6), sit down(8), stand up(9), hopping(26), jump up(27), squat down(80)` (see the sketch after this list).
- Scene Transition: We take the correct answer from the scene annotation, then use ChatGPT to generate the distractor candidates with a prompt like `Based on 'From the courtroom to the prison.', please create three similar sentences with different places`.
- Unexpected Action: We use a prompt similar to the one in the original paper to generate the options, as follows:

```python
"You are now an assistant for data augmentation. You have extensive experience in video understanding and have mastered this skill. I will provide you with a 'question' and 'answer' regarding a counter-intuitive video.\n" + \
"Your task is to help me understand the content of this paragraph and generate one English question-answer pair from it. The generated question should be closely related to the provided answer.\n" + \
"The format will be multiple choice, where each question has four options - one correct answer and three distractors.\n" + \
f'Question: "{question.strip()}"\n' + \
f'Answer: "{answer.strip()}"\n' + \
"To avoid cheating, the lengths of the correct answer and other distractors MUST be similar.\n" + \
"You need to ONLY return the generated QA in JSON like {'question': '', 'options': [], 'answer': ''}"
```
We then check the options' lengths; if the length difference is large, we regenerate them (see the sketch after this list).
- Egocentric Navigation: We randomly generate the candidates from `move forward, stop, turn left and move forward, turn right and move forward`.
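A minimal sketch of this pipeline (not the exact script; the helper names, the 1.5x length-ratio threshold, the retry count, and the legacy `openai` client are placeholders):

```python
import ast
import random

import openai  # legacy pre-1.0 client; reads OPENAI_API_KEY from the environment

# Label pools taken from the lists above.
POSE_LABELS = ["drop", "pick up", "sit down", "stand up",
               "hopping", "jump up", "squat down"]
NAV_LABELS = ["move forward", "stop",
              "turn left and move forward", "turn right and move forward"]


def random_candidates(answer, label_pool, num_options=4):
    """Sample distractors from the label pool and shuffle the answer in."""
    distractors = random.sample([l for l in label_pool if l != answer],
                                num_options - 1)
    options = distractors + [answer]
    random.shuffle(options)
    return options


def lengths_ok(options, max_ratio=1.5):
    """Reject option sets where one option is much longer than the others."""
    lens = [len(o) for o in options]
    return max(lens) <= max_ratio * min(lens)


def generate_qa(prompt, max_retries=5):
    """Ask ChatGPT for a QA dict, regenerating while option lengths differ too much."""
    for _ in range(max_retries):
        resp = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}],
        )
        # The prompt requests {'question': '', 'options': [], 'answer': ''};
        # ast.literal_eval accepts the single-quoted dict literal ChatGPT tends to emit.
        qa = ast.literal_eval(resp["choices"][0]["message"]["content"])
        if lengths_ok(qa["options"]):
            return qa
    raise RuntimeError("could not generate options with similar lengths")
```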
@Andy1621 I can't find the question annotations or the options you mentioned. I went through the datasets, but the question annotations are not in the original datasets. Fine-grained Pose at least seems to have a limited set of questions (like "Which one of these descriptions correctly matches the actions in the video?"), but the others (MovieNet, VLN-CE) don't. Also, there's no annotation for NTU RGBD at all. Did you annotate the data by watching the videos yourself?
For the questions, we generate them with ChatGPT~
You can check our appendix for more details.
@Andy1621 Could you provide the prompts for MovieNet and VLN-CE?
I didn't save the specific prompt... I remember that I asked ChatGPT to generate some basic questions.
For example, in scene_transition:
"video": "Top006_08310.mp4",
"question": "Which choice matches the scene changes in the video?",
"candidates": [
"From the kitchen to the dining room.",
"From the staircase to the gangway.",
"From the bedroom to the bathroom.",
"From the classroom to the library."
],
"answer": "From the staircase to the gangway."
I wonder where the answer "From the staircase to the gangway." came from. There's no related data in the original MovieNet dataset.
In addition, it seems that MovieNet currently does not provide video data due to copyright issues. Where did you get videos like Top006_08310.mp4?
Good question! It's a citation mistake made when preparing the paper. We actually use the videos from MoVQA; we will fix it later.
I see. It seems that dataset hasn't been released yet. Is there any plan to release it? Or could you provide me with the dataset?
Since I'm not the author, you can email the authors for more details.
In the Action Sequence task from the STAR dataset, the paper states that it directly adopts the QA of the original dataset. However, the annotation is quite different from the original one. In the MVBench annotation, more than half of the questions start with "What happened after ~?", yet I can't find corresponding questions starting with such a phrase in the STAR annotations. Can you clarify this discrepancy?
Please check the `Sequence` data in STAR. We do not use the QA pairs that are about objects.
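A sketch of that filtering, assuming the released STAR annotation JSON, where each entry's `question_id` is prefixed with its question type (e.g. `Sequence_T1_...`); the object-exclusion rule below is illustrative, not the exact code:

```python
import json

def load_sequence_qa(star_json_path):
    """Keep only STAR `Sequence` questions about actions, e.g. 'What happened after ...?'.

    Object-centric Sequence templates (e.g. 'Which object did the person ...?')
    are dropped; the precise exclusion rule is an assumption.
    """
    with open(star_json_path) as f:
        entries = json.load(f)
    kept = []
    for e in entries:
        if not e["question_id"].startswith("Sequence"):
            continue
        if e["question"].startswith(("What happened after", "What happened before")):
            kept.append(e)
    return kept
```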