Video-Classification-2-Stream-CNN Where is 2 stream itself?

Open sudonto opened this issue 6 years ago • 4 comments

Hi @wadhwasahil, @stillbreeze,

Going by the project's title, I expected to see a model with 2 streams (2 inputs), but I only see the spatial and temporal models as separate networks (i.e. you did not merge them into 1 big model). Is it an unfinished project, or is this your intention?

Thank you.

Sep 15 '18 15:09 sudonto

The repo is a re-implementation of the 2 stream CNN paper by Simonyan and Zisserman (https://papers.nips.cc/paper/5353-two-stream-convolutional-networks-for-action-recognition-in-videos.pdf).

The paper trains the two streams individually and fuses them late, at the softmax scores: either by averaging the scores of the two streams or by training an SVM on the normalized softmax scores used as features. So there's no merging into a bigger model.
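To make the averaging variant concrete, here is a minimal sketch in plain Python. It is not code from this repo: the function names (`softmax`, `fuse_by_averaging`, `predict`) and the example logits are made up for illustration, and it assumes each stream outputs one vector of class logits per video.

```python
import math


def softmax(logits):
    """Turn raw class scores into probabilities (numerically stable)."""
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_by_averaging(spatial_logits, temporal_logits):
    """Late fusion: average the softmax scores of two separately trained streams."""
    spatial = softmax(spatial_logits)
    temporal = softmax(temporal_logits)
    return [(s + t) / 2.0 for s, t in zip(spatial, temporal)]


def predict(fused_scores):
    """Return the index of the class with the highest fused score."""
    return max(range(len(fused_scores)), key=fused_scores.__getitem__)


# Hypothetical logits for one video with 3 classes (illustrative values only).
spatial_logits = [2.0, 1.0, 0.1]    # the spatial stream alone prefers class 0
temporal_logits = [0.5, 2.5, 0.3]   # the temporal stream alone prefers class 1

fused = fuse_by_averaging(spatial_logits, temporal_logits)
predicted_class = predict(fused)
```

Class accuracy would then be the fraction of test videos whose `predicted_class` matches the ground-truth label. The two networks never share weights or gradients here; they only meet at the score level, which is why this is called late fusion.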

Sep 15 '18 18:09 stillbreeze

Ah, so I had misunderstood the term "late fusion" until now. In which part of the code do you take the softmax outputs and average them? Also, is it possible to train the spatial and temporal networks jointly and still achieve the same result?

Sep 16 '18 08:09 sudonto

@sudonto Yes, you can train them jointly, but due to a lack of resources we had to train them separately.

Sep 17 '18 07:09 wadhwasahil

Thank you for the answer. So, in which part of the code do you compute the average of the softmax scores to get the class accuracy? Sorry for asking so much.

Sep 17 '18 07:09 sudonto
