gtad
What does uNet_test.npy do?
Many thanks to the author for the excellent open source code. I have learned that uNet_test.npy holds the classification scores for each test video, but what is its specific function in post-processing?
Hello @mrlihellohorld, please check the post-processing code that uses this file:
https://github.com/frostinassiky/gtad/blob/6deb5b1bc6883b48bd22e0cc593069643c953e3d/gtad_postprocess.py#L106
The video classification scores help to assign class labels and reweight the G-TAD prediction.
Thank you very much for your reply. I am puzzled by the sentence "The video classification scores help to assign class labels". According to this code https://github.com/frostinassiky/gtad/blob/6deb5b1bc6883b48bd22e0cc593069643c953e3d/gtad_postprocess.py#L117 , my understanding is that the tag of the whole video is assigned as the tag of tmp_proposal, right?
Sorry for the confusing sentence. Your understanding is correct.
1. To decide the class label (or tag), we need to find the top-k classification scores.
2. From the indexes of the top-k scores, we can decide the video tag via the thumos_class variable.
Please let me know if you have more questions. :)
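For readers following along, the two steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the `class_names` map, the `label_proposals` helper, and the example numbers are all made up here, and the real script reads the scores from uNet_test.npy and maps indexes through thumos_class.

```python
import numpy as np

# Hypothetical label map for illustration; the real mapping is the
# thumos_class variable in gtad_postprocess.py.
class_names = {0: "BaseballPitch", 1: "BasketballDunk", 2: "Billiards", 3: "CleanAndJerk"}


def label_proposals(video_scores, proposals, topk=2):
    """Attach the top-k video-level classes to every proposal of one video.

    video_scores: per-class scores for the whole untrimmed video.
    proposals: list of (start, end, confidence) tuples.
    """
    video_scores = np.asarray(video_scores, dtype=float)
    # Step 1: indexes of the top-k classification scores, best first.
    top_idx = np.argsort(video_scores)[::-1][:topk]
    detections = []
    for start, end, conf in proposals:
        for idx in top_idx:
            detections.append({
                "segment": [start, end],
                # Step 2: the index selects the class name (the video tag).
                "label": class_names[int(idx)],
                # Reweight the proposal confidence by the video-level score.
                "score": float(conf * video_scores[idx]),
            })
    return detections


video_scores = [0.05, 0.70, 0.20, 0.05]           # made-up scores for one video
proposals = [(1.0, 3.5, 0.9), (10.0, 12.0, 0.4)]  # made-up proposals
detections = label_proposals(video_scores, proposals, topk=2)
for d in detections:
    print(d)
```

Note that every proposal in the video receives the same top-k labels; only the combined score differs between proposals.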
Thank you very much for your patient answer. Based on your answer, I have described my understanding below and hope you can correct me, thanks.
a. "From the indexes of the top-k scores, we can decide the video tag by thumos_class variable." If I understand you correctly, this is the classification of tmp_proposal (the video snippet), so does "video" here mean "snippet video"?
b. In my own dataset, each video has multiple actions. Using this dataset, I first train a video classification model to get video-level scores, and then classify each snippet according to the video scores. Do I understand that right? If so, is it reasonable to classify a snippet using the scores of the whole video (which includes multiple actions)? I don't understand this part very well.
I am looking forward to your reply. Thanks!
Hello @mrlihellohorld,
The video means the untrimmed video that contains the snippet.
If your videos contain multiple actions that belong to different classes, a reasonable setting is to classify each tmp_proposal via a separate branch. A new classification score (e.g. this line) can be added to G-TAD to achieve this.
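To make the "separate branch" idea a bit more concrete, here is a rough sketch of what per-proposal classification could look like at post-processing time. It assumes such a branch already exists and outputs one row of class logits per proposal; G-TAD does not provide this out of the box, and the `classify_proposals` helper and the numbers below are purely hypothetical.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def classify_proposals(proposal_logits, proposal_conf):
    """Give each proposal its own label instead of the video-level one.

    proposal_logits: (num_proposals, num_classes) output of a hypothetical
        classification branch.
    proposal_conf: (num_proposals,) proposal confidence scores.
    """
    probs = softmax(np.asarray(proposal_logits, dtype=float))
    labels = probs.argmax(axis=1)  # best class index per proposal
    # Combine proposal confidence with the per-proposal class probability.
    scores = np.asarray(proposal_conf, dtype=float) * probs.max(axis=1)
    return labels, scores


logits = [[4.0, 0.0, 0.0], [0.0, 0.0, 3.0]]  # made-up logits for two proposals
conf = [0.9, 0.5]                            # made-up proposal confidences
labels, scores = classify_proposals(logits, conf)
print(labels, scores)
```

With this setup two proposals in the same untrimmed video can end up with different classes, which is the case the video-level scores cannot handle.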
I don't quite understand what you said. Could you please explain it more clearly? Thank you.
Sorry, I didn't explain this clearly. If you are very interested in G-TAD, we can schedule a time and chat about it. My WeChat ID is FrostXuMengmeng.
I am curious: did anyone actually implement action class prediction from another branch in G-TAD?