keras-video-classifier
Regarding reshaping the input data for action recognition
Hi. Thank you for your great work.
I am working on Human Action Recognition and I am using the UCF101 dataset. To simplify the task a bit, I am using the top-20 actions of the dataset. So far, I have used a pre-trained VGG16 network to extract features from the individual frames extracted from the videos. The final shape I got from the VGG16 network is (20501, 7, 7, 512) (for the train set). I now want to pass these extracted features to an LSTM-based network, but I am a bit confused about how to reshape them.
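To make the question concrete, here is a minimal sketch of the reshape I am considering (SEQ_LEN and the grouping into clips are my own assumptions, and in my real data the frames of one video would need to stay contiguous):

```python
import numpy as np

# Features extracted with the pre-trained VGG16: (num_frames, 7, 7, 512)
features = np.zeros((20501, 7, 7, 512), dtype="float32")  # placeholder for my actual array

SEQ_LEN = 20  # assumed number of frames (time steps) per clip

# Flatten the 7x7x512 spatial grid so each frame becomes one vector
frames = features.reshape(features.shape[0], -1)  # (20501, 25088)

# Group consecutive frames into fixed-length sequences, dropping the remainder:
# (num_clips, SEQ_LEN, 25088) -- assumes frames of the same video are contiguous
num_clips = frames.shape[0] // SEQ_LEN
clips = frames[: num_clips * SEQ_LEN].reshape(num_clips, SEQ_LEN, -1)
```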
How many time steps should I pass in, and how many features per time step?
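In other words, would an input shape like the following be the right way to feed these features into an LSTM? (the layer sizes here are just placeholders I picked, not something from this repo):

```python
from tensorflow.keras import layers, models

SEQ_LEN = 20      # assumed time steps per clip, as in the sketch above
NUM_CLASSES = 20  # top-20 UCF101 actions

model = models.Sequential([
    # input: SEQ_LEN time steps, each with 7*7*512 = 25088 features
    layers.LSTM(256, input_shape=(SEQ_LEN, 7 * 7 * 512)),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.summary()
```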
Thank you in advance for your help and time :)