Some Questions about the Training Process
Hi Yunpeng, I am new to video recognition tasks. I ran the code and have some questions about the whole procedure.
-
For training, do you randomly sample 16 frames from the whole video for classification? And can the 16 frames be different each time for the same video?
-
When I tried to run train_hmdb51, there were many log messages like 'frame[30] is error, use backup item XXX.avi'. What does this mean? Does it mean that there are errors in my video data? (I downloaded it from the official website.)
-
It seems that train_hmdb51 runs both training and an evaluation after each epoch. So why do we need a separate evaluation script like evaluate_video.py for testing?
Thanks a lot for your help!
Hi @VectorYoung ,
-
Yes, it uses random sampling.
-
It means the data loader could not correctly extract "frame 30" from that ".avi" file. This is caused either by a corrupted video file or simply by the current version of the data loader not handling that particular file well. When such an error is raised, the data loader loads a backup video as the current training sample so that the program can keep going. The backup is randomly selected from previously sampled video clips that loaded successfully. For the HMDB51 dataset, I personally first convert the whole dataset with "ffmpeg -c:v mpeg4" (keeping the original resolution), and this seems to help the data loader load all videos without any warning or error.
-
The testing/evaluation strategy is different. During training, the reported accuracy corresponds to clip-level prediction: the program randomly samples one short clip, makes a prediction for that clip, and treats it as the prediction for the entire video. In contrast, "evaluate_video.py" samples multiple clips, averages their results, and uses the aggregated result as the prediction for the entire video, which is much more accurate. However, the better result comes at a very high computational cost, which is not affordable during training in my case.
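To make the clip-level vs. video-level difference concrete, here is a minimal sketch of averaging several clip predictions into one video prediction. It is not taken from "evaluate_video.py": the function names are hypothetical, and averaging softmax probabilities (rather than raw scores) is an assumption made for illustration.

```python
import math


def softmax(logits):
    """Turn one clip's raw class scores into probabilities."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def video_level_prediction(clip_logits):
    """Average per-clip probabilities and return (label, mean_probs).

    clip_logits: one list of class scores per sampled clip.
    Averaging probabilities is an assumption; the real script may
    aggregate differently.
    """
    clip_probs = [softmax(row) for row in clip_logits]
    num_clips = len(clip_probs)
    num_classes = len(clip_probs[0])
    mean_probs = [
        sum(probs[c] for probs in clip_probs) / num_clips
        for c in range(num_classes)
    ]
    label = max(range(num_classes), key=mean_probs.__getitem__)
    return label, mean_probs


# Three clips from the same video: the first clip alone votes for class 0,
# but the aggregate over all clips picks class 1.
clips = [[2.0, 1.0, 0.0], [0.0, 3.0, 0.0], [0.0, 2.5, 0.0]]
label, probs = video_level_prediction(clips)
```

With a single clip, the result reduces to the clip-level prediction used during training, which is why the training-time accuracy is noisier.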
Thanks for trying our code and sorry for the late reply.
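For reference, the random sampling and the "use backup item" behaviour described above can be sketched roughly as follows. This is an illustration only, not the repository's data loader: `sample_clip_indices`, `FallbackLoader`, and the `read_frames` callable are hypothetical names, and sampling 16 consecutive frames is an assumption (the actual sampler may use a stride or a different scheme).

```python
import logging
import random


def sample_clip_indices(num_frames, clip_len=16, rng=random):
    """Pick a random window of clip_len consecutive frame indices (assumed scheme)."""
    if num_frames < clip_len:
        raise ValueError("video has fewer frames than the clip length")
    start = rng.randrange(num_frames - clip_len + 1)
    return list(range(start, start + clip_len))


class FallbackLoader:
    """Load a random clip; on failure, reuse a clip that loaded successfully before."""

    def __init__(self, read_frames, rng=None):
        # read_frames(path, indices) is a hypothetical decoder that returns
        # the requested frames or raises IOError when a frame cannot be read.
        self.read_frames = read_frames
        self.rng = rng or random.Random()
        self.backups = []  # (path, indices) pairs that decoded without error

    def load(self, path, num_frames, clip_len=16):
        indices = sample_clip_indices(num_frames, clip_len, self.rng)
        try:
            frames = self.read_frames(path, indices)
        except IOError as err:
            if not self.backups:
                raise  # nothing to fall back to yet
            backup_path, backup_indices = self.rng.choice(self.backups)
            logging.warning("%s failed (%s), using backup %s", path, err, backup_path)
            return self.read_frames(backup_path, backup_indices)
        self.backups.append((path, indices))
        return frames
```

Because the window is drawn again on every call, the same video generally yields a different clip in each epoch. A real loader would also cap the size of the backup list, which this sketch omits.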
Hi @cypw, thanks a lot for your reply. I am trying to train on Kinetics 400 and found that reading from the original .mp4 videos is very slow. I saw that you have a script to convert them to .avi and tried it, but some videos failed to convert. Even more videos were converted to .avi but have no frames, or show "frame[0] is error" in the training log. Did you encounter the same issue? I am trying to find out how to properly process and read the data. Thanks a lot for your help.