kinetics-i3d
train from scratch on ucf101 dataset
We tried to train the I3D model on UCF101 from scratch, but it converges much more slowly, with a final validation accuracy of around 60%. Can you offer some suggestions on training the I3D model without the ImageNet-pretrained weights?
Hi,
I think 50-60% accuracy is to be expected when training I3D from scratch on RGB on UCF101. If you do the same on flow, it should get ~80%. When averaging both, we got 88% in the last version of the Quo Vadis paper.
In summary, I think your training setup should be fine.
Best,
Joao
How do you train I3D on optical flow with the ImageNet-pretrained model? Can you offer some details of training on the UCF101 dataset?
Also, what should the convergence speed be when training on optical flow compared with RGB with ImageNet pretraining? @joaoluiscarreira
The way we did it was to inflate the weights of the ImageNet model into 3D, then train the model normally from there, without freezing batch norm. I think you can find code online for training the model if you search on Google. I seem to remember that the flow model converges faster, but this was a long time ago.
Best,
Joao
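For reference, the inflation Joao describes can be sketched in a few lines of numpy. This is a sketch of the standard I3D bootstrapping idea, not the repo's actual code: a 2D ImageNet filter (TF layout `(H, W, C_in, C_out)`) is tiled `T` times along a new time axis and rescaled by `1/T`, so a static "boring video" yields the same activations as the 2D filter on one frame. The shapes here are illustrative.

```python
import numpy as np

def inflate_conv_weights(w2d, time_dim):
    """Inflate 2D conv weights (H, W, C_in, C_out) into 3D weights
    (T, H, W, C_in, C_out) by tiling along time and dividing by T,
    so the summed temporal response matches the original 2D filter."""
    w3d = np.tile(w2d[np.newaxis], (time_dim, 1, 1, 1, 1))
    return w3d / time_dim

# Example: a 7x7 RGB filter bank with 64 outputs, inflated over 7 frames.
w2d = np.random.randn(7, 7, 3, 64).astype(np.float32)
w3d = inflate_conv_weights(w2d, time_dim=7)
assert w3d.shape == (7, 7, 7, 3, 64)
# Summing the inflated kernel over time recovers the original 2D weights.
assert np.allclose(w3d.sum(axis=0), w2d, atol=1e-5)
```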
However, the first conv layer has 2 input channels for flow data compared with 3 for RGB. How do you deal with the difference? @joaoluiscarreira. I really appreciate your help.
I think we just discarded the weights for one of the input channels in that first layer, before inflating.
Best,
Joao
Actually, I went back to check what exactly we did, and for that particular layer we averaged the original weights over the 3 input channels, then copied the result twice -- so the initial weights are the same for both flow input dimensions. But I think it did not make much of a difference compared with the other option.
Joao
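The channel adaptation Joao describes (average over RGB, copy twice) can be sketched like this. The function name and TF-style weight layout `(H, W, C_in, C_out)` are illustrative assumptions, not the repo's API:

```python
import numpy as np

def rgb_to_flow_weights(w_rgb):
    """Adapt first-layer conv weights from 3 RGB input channels to 2 flow
    channels: average over the RGB channel axis, then copy the mean twice."""
    # w_rgb has TF layout (H, W, 3, C_out); axis 2 is the input channel.
    mean = w_rgb.mean(axis=2, keepdims=True)
    return np.tile(mean, (1, 1, 2, 1))

w_rgb = np.random.randn(7, 7, 3, 64).astype(np.float32)
w_flow = rgb_to_flow_weights(w_rgb)
assert w_flow.shape == (7, 7, 2, 64)
# Both flow input dimensions start from identical weights.
assert np.allclose(w_flow[:, :, 0], w_flow[:, :, 1])
```

This runs before inflation; the 2-channel 2D kernel is then inflated into 3D the same way as the RGB one.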
We suffered severe overfitting when training I3D on optical flow, similar to training on RGB data without ImageNet pretraining. The test accuracy is only 50%. Have you run into such problems? @joaoluiscarreira
As mentioned earlier in the thread, training from scratch on flow got close to 80%. You could try testing with batch statistics to see if there's some batch norm moving average problem.
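To illustrate what "testing with batch statistics" would diagnose, here is a small numpy sketch (not the repo's code): if the batch-norm moving averages are stale, normalizing a test batch with them leaves the activations badly shifted, while normalizing with the batch's own statistics standardizes them properly. The numbers are made up for illustration.

```python
import numpy as np

def batchnorm(x, mean, var, eps=1e-5):
    """Plain batch-norm normalization (no learned scale/shift)."""
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=(256, 64))  # a hypothetical test batch

# Statistics computed from the batch itself...
batch_out = batchnorm(x, x.mean(axis=0), x.var(axis=0))
# ...versus stale moving averages that never caught up during training.
stale_out = batchnorm(x, np.zeros(64), np.ones(64))

# Batch statistics standardize the output; the stale averages leave it
# far from zero mean, which would wreck test-time accuracy.
assert abs(batch_out.mean()) < 0.01
assert abs(stale_out.mean() - 5.0) < 0.2
```

If accuracy recovers when you evaluate with batch statistics instead of the moving averages, the problem is the moving-average update (e.g. momentum or too few update steps), not the learned weights.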
@leviswind Hi, were you able to train I3D on UCF101 successfully? I want to use I3D on UCF101. How can I fine-tune the I3D model on UCF101, and where is the training code? Could you give me some advice? Thanks. Best wishes.