DGAM-Weakly-Supervised-Action-Localization
NUM_INPUT_FRAMES
Hi, thank you very much for sharing your code! Quick question about I3D features.
I notice that config.DATASET.NUM_INPUT_FRAMES is set to 16. When I use the released code at https://github.com/piergiaj/pytorch-i3d to extract I3D features, the I3D network takes consecutive 8-frame chunks as input for both streams, so the resulting features are sampled every 8 frames. Is there any processing step to obtain features sampled every 16 frames?
Hi, thanks for the interest.
The I3D features we use are extracted every 16 frames. The official repo is https://github.com/deepmind/kinetics-i3d
But if you are using 8-frame features, then I think you can safely change the config.DATASET.NUM_INPUT_FRAMES to 8.
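To illustrate why the value should match the extraction stride, here is a minimal sketch. It assumes that NUM_INPUT_FRAMES is the number of non-overlapping video frames covered by each feature vector and that it is used to map feature indices back to timestamps. That reading is an assumption, not something confirmed in this thread. The helper names and the 25 fps frame rate are hypothetical and are not taken from the DGAM code.

```python
def snippet_to_seconds(snippet_index, frames_per_snippet, fps):
    """Return the (start, end) time in seconds covered by one feature vector.

    Assumes non-overlapping snippets of `frames_per_snippet` frames each
    (hypothetical helper, not part of the DGAM repository).
    """
    start_frame = snippet_index * frames_per_snippet
    end_frame = start_frame + frames_per_snippet
    return start_frame / fps, end_frame / fps


def num_snippets(total_frames, frames_per_snippet):
    """Number of complete snippets that fit in a video of `total_frames` frames."""
    return total_frames // frames_per_snippet


if __name__ == "__main__":
    fps = 25  # assumed frame rate, for illustration only
    for frames_per_snippet in (16, 8):
        start, end = snippet_to_seconds(10, frames_per_snippet, fps)
        count = num_snippets(250, frames_per_snippet)
        print(f"{frames_per_snippet} frames/snippet: "
              f"snippet 10 covers {start:.2f}s to {end:.2f}s, "
              f"{count} snippets in a 250-frame video")
```

Under these assumptions, the same feature index points to a different time span depending on the stride, so features extracted every 8 frames would be placed at the wrong timestamps if the config still said 16.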