pytorch-i3d
Normalisation steps
@piergiaj: Thanks for sharing the PyTorch implementation. It seems that in your code the only preprocessing step performed is center_crop (224px). Don't the images also need to be mean-subtracted with imagenet_mean, etc.?
Also, can one extract features for 16 frames instead of 64 frames, and does that reduce performance?
best, Vivek
@piergiaj : To be precise, my issue is that in your code I see only a center crop, while normalisation is not handled. Some other people are having trouble with the i3d model too. Thanks anyway.
Based on the implementation details in that paper, the only normalization done is to rescale everything to [-1,1]. I do handle that in
https://github.com/piergiaj/pytorch-i3d/blob/05783d11f9632b25fe3d50395a9c9bb51f848d6d/charades_dataset.py#L37
and
https://github.com/piergiaj/pytorch-i3d/blob/05783d11f9632b25fe3d50395a9c9bb51f848d6d/charades_dataset.py#L54-L55
I'm not sure if using imagenet means and standard deviations would provide different performance.
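To make the two options concrete, here is a minimal NumPy sketch, not the repository's exact code. `rescale_to_unit_range` illustrates the [-1,1] rescaling described above, and `imagenet_normalize` illustrates the ImageNet mean/std alternative raised in the question. Both function names are hypothetical, the frame layout (T, H, W, C) with uint8 RGB values is an assumption, and the mean/std constants are the commonly used ImageNet channel statistics.

```python
import numpy as np

# Commonly used ImageNet channel statistics (RGB order, for inputs in [0, 1]).
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def rescale_to_unit_range(frames):
    """Map uint8 pixel values in [0, 255] linearly to [-1, 1]."""
    return (frames.astype(np.float32) / 255.0) * 2.0 - 1.0


def imagenet_normalize(frames):
    """Alternative under discussion: per-channel mean/std normalisation.

    Assumes the last axis holds the RGB channels so the constants broadcast.
    """
    scaled = frames.astype(np.float32) / 255.0
    return (scaled - IMAGENET_MEAN) / IMAGENET_STD


# Dummy clip: 4 frames of 8x8 RGB (assumed layout: T, H, W, C).
frames = np.random.randint(0, 256, size=(4, 8, 8, 3), dtype=np.uint8)
rescaled = rescale_to_unit_range(frames)
normalized = imagenet_normalize(frames)
```

The two functions produce differently scaled inputs, so features extracted with one are not directly comparable to features extracted with the other.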
Yes, you can extract features for different numbers of frames; however, performance does drop if you use fewer frames. I don't have the exact numbers at the moment, but I can confirm that they dropped when I used fewer.
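As a rough illustration of feeding a shorter clip, the sketch below takes a centred window of consecutive frames and moves the channels first. The helper names `center_clip` and `to_channels_first` are hypothetical, and both the input layout (T, H, W, C) and the (C, T, H, W) output layout are assumptions rather than something confirmed in this thread.

```python
import numpy as np


def center_clip(video, num_frames):
    """Return `num_frames` consecutive frames from the middle of `video`.

    `video` is assumed to have shape (T, H, W, C).
    """
    total = video.shape[0]
    if total < num_frames:
        raise ValueError(f"video has {total} frames, need {num_frames}")
    start = (total - num_frames) // 2
    return video[start:start + num_frames]


def to_channels_first(clip):
    """Reorder (T, H, W, C) to (C, T, H, W)."""
    return np.ascontiguousarray(clip.transpose(3, 0, 1, 2))


# Dummy video: 100 frames of 8x8 RGB, already rescaled to float32.
video = np.zeros((100, 8, 8, 3), dtype=np.float32)
clip16 = to_channels_first(center_clip(video, 16))
clip64 = to_channels_first(center_clip(video, 64))
```

Only the temporal length differs between `clip16` and `clip64`; whether the shorter clip is good enough would have to be measured on the task at hand.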
Let me know if you have any other questions.
@piergiaj : Thanks for the explanation.
I3D is an extension of an ImageNet-based model, so I wonder why the means and stds are not taken into account. This should definitely influence model training and performance. But anyway, thanks! :)
Yes, for flow I know this version is correct because the model is fine-tuned with this normalization.
For RGB, it isn't clear. It isn't mentioned in the paper or in the authors' implementation, and I forgot to ask the authors about it. I will say that if you fine-tune a model using this normalization, it won't matter much, as the weights will adjust. However, it may matter for Kinetics pre-trained models. When I have some free time, I will check this.
@piergiaj : I just wanted to extract features, which is why I would like to know the right steps. Also, the link I posted above makes me a bit more concerned about it. I'll update you if I find the right setup steps.
@lixzhang fyi
@vivoutlaw I wonder whether you've been able to determine the best image normalisation for this model?