pcshih

Results 10 issues of pcshih

I can only reach about 76% accuracy on the UCF-101 split 1 test set, and the model seems to be overfitting. How can I fix the overfitting problem? ![loss](https://user-images.githubusercontent.com/48461528/72399005-d9a40200-377f-11ea-9ac4-568eb83ecd56.JPG) ![acc](https://user-images.githubusercontent.com/48461528/72399006-d9a40200-377f-11ea-9a35-2430bb6269a6.JPG)

https://github.com/KaiyangZhou/pytorch-vsumm-reinforce/blob/fdd03be93f090278424af789c120531e49aefa40/main.py#L164 Why does TVSum use avg while SumMe uses max? Thank you very much.

As mentioned in the paper, the training and test sets should be 80% and 20% of the data. But at https://github.com/weirme/Video_Summary_using_FCSN/blob/96b40851b7805afd1f1fc69f2beb5143d5727b4e/data_loader.py#L25 should it be `train_dataset, test_dataset = torch.utils.data.random_split(dataset, [int(len(dataset)*0.8), int(len(dataset)*0.2)])`? Thank you.
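One caveat with the proposed call: `torch.utils.data.random_split` raises a `ValueError` when the lengths do not sum to `len(dataset)`, and truncating both fractions with `int()` can drop a sample. Below is a minimal sketch of computing lengths that always add up. The helper name `split_lengths` and the `train_fraction` parameter are hypothetical, not part of the repository.

```python
def split_lengths(n_samples, train_fraction=0.8):
    """Return (train_len, test_len) that always sum to n_samples.

    Hypothetical helper: the train length is truncated, and the test
    split takes whatever remains, so no sample is lost to rounding.
    """
    train_len = int(n_samples * train_fraction)
    test_len = n_samples - train_len
    return train_len, test_len


# Truncating both parts independently can lose a sample:
n = 13
naive = (int(n * 0.8), int(n * 0.2))  # (10, 2), sums to 12, not 13
safe = split_lengths(n)               # (10, 3), sums to 13
print(naive, safe)
```

The resulting pair could then be passed as the lengths argument, for example `random_split(dataset, list(split_lengths(len(dataset))))`.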

https://github.com/weirme/Video_Summary_using_FCSN/blob/0895cccbb2a488369b1bfc7d2c087b3050250898/make_dataset.py#L70 What does this function do?

As shown in [FCSN](http://openaccess.thecvf.com/content_ECCV_2018/papers/Mrigank_Rochan_Video_Summarization_Using_ECCV_2018_paper.pdf) Table 1, they use Section 1.3 of [this paper](http://www-scf.usc.edu/~weilunc/paper/ECCV_VS_supp.pdf) to convert frame-level scores to keyframes. But you use [this](https://github.com/weirme/Video_Summary_using_FCSN/blob/5ee2ca690a35c3078715def8420a8b37863973f0/make_dataset.py#L71) method to get keyframes, which seems not identical to the...

https://github.com/weirme/Video_Summary_using_FCSN/blob/0895cccbb2a488369b1bfc7d2c087b3050250898/make_dataset.py#L53-L56 Because GoogLeNet is used only for feature extraction, it should be in eval mode. [Remember that you must call model.eval() to set dropout and batch normalization layers to evaluation mode...

https://github.com/weirme/Video_Summary_using_FCSN/blob/0895cccbb2a488369b1bfc7d2c087b3050250898/make_dataset.py#L49 These values are for the ImageNet dataset. Do they also fit the dataset used here?

The original frame feature shape is [[320,1024]](https://github.com/weirme/Video_Summary_using_FCSN), but the code at https://github.com/weirme/Video_Summary_using_FCSN/blob/96b40851b7805afd1f1fc69f2beb5143d5727b4e/data_loader.py#L18 reshapes it to [1024,320] directly. Should it use transpose instead of reshape? Thank you.
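To illustrate why the two operations differ, here is a small sketch using NumPy with a toy [3, 2] array standing in for the [320, 1024] features (the array and its values are made up for illustration). `torch.Tensor.reshape` follows the same row-major semantics.

```python
import numpy as np

# Toy stand-in for the [320, 1024] feature matrix, shrunk to [3, 2]:
# rows are frames, columns are feature channels.
features = np.arange(6).reshape(3, 2)   # [[0, 1], [2, 3], [4, 5]]

# reshape keeps the row-major element order and only regroups it,
# so values from different frames and channels get mixed in each row.
reshaped = features.reshape(2, 3)       # [[0, 1, 2], [3, 4, 5]]

# transpose swaps the axes: row i now holds channel i across all frames.
transposed = features.T                 # [[0, 2, 4], [1, 3, 5]]

print(reshaped)
print(transposed)
```

Both results have shape (2, 3), so a shape check alone would not reveal the difference.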

https://github.com/weirme/Video_Summary_using_FCSN/blob/0895cccbb2a488369b1bfc7d2c087b3050250898/eval.py#L24 The [start:end] slice excludes the end element. Should it be `pred_value = np.array([pred_score[cp[0]:(cp[1]+1)].mean() for cp in cps])`? The same question applies to https://github.com/weirme/Video_Summary_using_FCSN/blob/0895cccbb2a488369b1bfc7d2c087b3050250898/eval.py#L29
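A small sketch of the difference, assuming each row of `cps` stores an inclusive `[start, end]` pair of frame indices (an assumption about the change-point format; the scores and segments below are made up):

```python
import numpy as np

# Made-up per-frame scores and two segments with assumed inclusive ends.
pred_score = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
cps = np.array([[0, 2], [3, 5]])

# Slicing with cp[1] as the stop index drops the last frame of each segment.
exclusive = np.array([pred_score[cp[0]:cp[1]].mean() for cp in cps])

# Adding 1 to the stop index includes the segment's final frame.
inclusive = np.array([pred_score[cp[0]:(cp[1] + 1)].mean() for cp in cps])

print(exclusive)  # means over frames 0-1 and 3-4
print(inclusive)  # means over frames 0-2 and 3-5
```

If the change points instead store exclusive ends, the original slice would already be correct, so the fix depends on how `cps` is defined in the dataset.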