Ask-Anything
Can you provide scripts for generating question & candidates using ChatGPT?
Fine-grained Pose (NTU RGBD), Scene Transition (MovieNet), Unexpected Action (FunQA), Egocentric Navigation (VLN-CE): the datasets for these tasks don't have QA annotations, and it seems that you generated the annotations yourselves with the aid of ChatGPT. Could you provide the related scripts you used for this?
Good question!
- Fine-grained Pose: We manually combine similar poses and randomly generate the candidates, like `drop(5), pick up(6), sit down(8), stand up(9), hopping(26), jump up(27), squat down(80)` (see the sketch after this list).
- Scene Transition: We take the correct answer from the scene annotation, then use ChatGPT to generate the distractor candidates with a prompt like `Based on 'From the courtroom to the prison.', please create three similar sentences with different places`.
- Unexpected Action: We use a prompt similar to the one in the original paper to generate the options, as follows:

```python
"You are now an assistant for data augmentation. You have extensive experience in video understanding and have mastered this skill. I will provide you with a 'question' and 'answer' regarding a counter-intuitive video.\n" + \
"Your task is to help me understand the content of this paragraph and generate one English question-answer pair from it. The generated question should be closely related to the provided answer.\n" + \
"The format will be multiple choice, where each question has four options - one correct answer and three distractors.\n" + \
f'Question: "{question.strip()}"\n' + \
f'Answer: "{answer.strip()}"\n' + \
"To avoid cheating, the lengths of the correct answer and other distractors MUST be similar.\n" + \
"You need to ONLY return the generated QA in JSON like {'question': '', 'options': [], 'answer': ''}"
```
We then check the options' lengths; if the length difference is large, we regenerate them (see the sketch after this list).
- Egocentric Navigation: We randomly generate the candidates from `move forward, stop, turn left and move forward, turn right and move forward`.
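A minimal sketch of this pipeline (not the exact script; the helper names, the 1.5x length-ratio threshold, the retry count, and the legacy `openai` client are placeholders):

```python
import ast
import random

import openai  # legacy pre-1.0 client; reads OPENAI_API_KEY from the environment

# Label pools taken from the lists above.
POSE_LABELS = ["drop", "pick up", "sit down", "stand up",
               "hopping", "jump up", "squat down"]
NAV_LABELS = ["move forward", "stop",
              "turn left and move forward", "turn right and move forward"]


def random_candidates(answer, label_pool, num_options=4):
    """Sample distractors from the label pool and shuffle the answer in."""
    distractors = random.sample([l for l in label_pool if l != answer],
                                num_options - 1)
    options = distractors + [answer]
    random.shuffle(options)
    return options


def lengths_ok(options, max_ratio=1.5):
    """Reject option sets where one option is much longer than the others."""
    lens = [len(o) for o in options]
    return max(lens) <= max_ratio * min(lens)


def generate_qa(prompt, max_retries=5):
    """Ask ChatGPT for a QA dict, regenerating while option lengths differ too much."""
    for _ in range(max_retries):
        resp = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}],
        )
        # The prompt requests {'question': '', 'options': [], 'answer': ''};
        # ast.literal_eval accepts the single-quoted dict literal ChatGPT tends to emit.
        qa = ast.literal_eval(resp["choices"][0]["message"]["content"])
        if lengths_ok(qa["options"]):
            return qa
    raise RuntimeError("could not generate options with similar lengths")
```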
@Andy1621 I can't find the question annotations or the options you mentioned. I went through the datasets, but the question annotations are not in the original datasets. Fine-grained Pose at least seems to have a limited set of questions (like "Which one of these descriptions correctly matches the actions in the video?"), but the others (MovieNet, VLN-CE) don't. Also, there's no annotation for NTU RGBD at all. Did you annotate the data by watching the videos yourself?
For the questions, we generate them with ChatGPT~
You can check our appendix for more details.
@Andy1621 Could you provide the prompts for MovieNet and VLN-CE?
I didn't save the specific prompt... I remember that I asked ChatGPT to generate some basic questions.
For example, in scene_transition:
"video": "Top006_08310.mp4",
"question": "Which choice matches the scene changes in the video?",
"candidates": [
"From the kitchen to the dining room.",
"From the staircase to the gangway.",
"From the bedroom to the bathroom.",
"From the classroom to the library."
],
"answer": "From the staircase to the gangway."
I wonder where the answer "From the staircase to the gangway." came from. There's no related data in the original MovieNet dataset.
In addition, it seems that MovieNet currently does not provide video data due to copyright issues. Where did you get videos like Top006_08310.mp4?
Good question! It's a citation mistake made when preparing the paper. We actually use the videos from MoVQA; we will fix it later.
I see. It seems that dataset hasn't been released yet. Is there any plan to release it? Or could you provide me with the dataset?
Since I'm not the author, you can email the authors for more details.
In the Action Sequence task from the STAR dataset, the paper states that it directly adopts the QA of the original dataset. However, the annotation is quite different from the original one. In the MVBench annotation, more than half of the questions start with "What happened after ~?", yet I can't find corresponding questions starting with such a phrase in the STAR annotations. Can you clarify this discrepancy?
Please check the `Sequence` data in STAR. We do not use the QA pairs that are about objects.
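A sketch of that filtering, assuming the released STAR annotation JSON, where each entry's `question_id` is prefixed with its question type (e.g. `Sequence_T1_...`); the object-exclusion rule below is illustrative, not the exact code:

```python
import json

def load_sequence_qa(star_json_path):
    """Keep only STAR `Sequence` questions about actions, e.g. 'What happened after ...?'.

    Object-centric Sequence templates (e.g. 'Which object did the person ...?')
    are dropped; the precise exclusion rule is an assumption.
    """
    with open(star_json_path) as f:
        entries = json.load(f)
    kept = []
    for e in entries:
        if not e["question_id"].startswith("Sequence"):
            continue
        if e["question"].startswith(("What happened after", "What happened before")):
            kept.append(e)
    return kept
```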