
implementation of get_oracle_summary function

Open pcshih opened this issue 5 years ago • 28 comments

https://github.com/weirme/Video_Summary_using_FCSN/blob/0895cccbb2a488369b1bfc7d2c087b3050250898/make_dataset.py#L70

What is the meaning of this function?

pcshih avatar Aug 06 '19 08:08 pcshih

This function generates a single oracle summary from the summaries of the 20 users in the dataset.

weirme avatar Aug 06 '19 10:08 weirme

Which paragraph of the FCSN paper, or of another paper, is the implementation based on?

pcshih avatar Aug 06 '19 12:08 pcshih

Chapter 3.1 of this paper: Diverse sequential subset selection for supervised video summarization. In my implementation, the greedy algorithm selects the frame marked by the most users each time.

weirme avatar Aug 07 '19 01:08 weirme

After reading Chapter 3.1, I still cannot understand the process. Given 3 human summaries with 5 frames:

A: [1,0,1,1,0]
B: [0,0,1,0,0]
C: [0,0,0,1,0]

How do I get the final summary?

First: count how many times each frame was selected -> [1,0,2,2,0]
Second: I have no idea...

pcshih avatar Aug 07 '19 02:08 pcshih

In my implementation, I initialize the oracle summary as [0, 0, 0, 0, 0] and then pick the most-selected frame (here the third), so the oracle summary becomes [0, 0, 1, 0, 0]. I then check whether the F-score between the oracle summary and the user summaries increases after adding this frame. If it does, I continue with the next frame; otherwise the process ends. But this is just my implementation. I didn't find a specific description of the greedy algorithm used in the paper, so I'm not sure whether the algorithm is like this.
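The greedy procedure described above can be sketched roughly as follows. This is an illustrative reconstruction, not the code in make_dataset.py: the names `f_score` and `greedy_oracle_summary` are hypothetical, and averaging the F-score over all users and breaking ties by frame index are assumptions.

```python
def f_score(pred, target):
    """F-score between two binary frame-selection lists of equal length."""
    overlap = sum(p * t for p, t in zip(pred, target))
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred)
    recall = overlap / sum(target)
    return 2 * precision * recall / (precision + recall)


def greedy_oracle_summary(user_summaries):
    """Greedily add the most-voted frames while the mean F-score improves.

    Assumption: the stopping criterion is the mean F-score against all users.
    """
    n_frames = len(user_summaries[0])
    # Number of users who selected each frame.
    votes = [sum(col) for col in zip(*user_summaries)]
    # Most-voted frames first; sorted() is stable, so ties keep index order.
    order = sorted(range(n_frames), key=lambda i: -votes[i])

    oracle = [0] * n_frames
    best = 0.0
    for idx in order:
        if votes[idx] == 0:
            break  # no user selected the remaining frames
        oracle[idx] = 1
        score = sum(f_score(oracle, u) for u in user_summaries) / len(user_summaries)
        if score <= best:
            oracle[idx] = 0  # undo the frame that did not help, then stop
            break
        best = score
    return oracle, best


# The three-user example from this thread.
users = [[1, 0, 1, 1, 0], [0, 0, 1, 0, 0], [0, 0, 0, 1, 0]]
oracle, best = greedy_oracle_summary(users)
print(oracle, round(best, 3))  # [0, 0, 1, 1, 0] 0.711
```

On that example the third and fourth frames are added, and the first frame is rejected because it lowers the mean F-score from about 0.711 to about 0.667.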

weirme avatar Aug 07 '19 03:08 weirme

Where does the FCSN paper mention that they use "Diverse sequential subset selection for supervised video summarization" to generate a summary from the users' summaries?

pcshih avatar Aug 07 '19 03:08 pcshih

This method is mentioned in the supplementary material of the paper "Video Summarization with Long Short-term Memory".

weirme avatar Aug 07 '19 04:08 weirme

After reading the paragraph, I implemented it.

https://github.com/pcshih/pytorch-FCSN/blob/7d4f874f6c71d5b279b6e26a6ee4882460230fc9/make_dataset.py#L84

Is my understanding identical to yours?

But the performance is quite bad...

pcshih avatar Aug 07 '19 06:08 pcshih

Have you printed the final F-score between the generated oracle summary and the user summaries?

weirme avatar Aug 07 '19 06:08 weirme

Did you mean the parameter "best_fscore"?

pcshih avatar Aug 07 '19 07:08 pcshih

[Images: best_fscore_1, best_fscore_2]

It seems slightly different.

pcshih avatar Aug 07 '19 07:08 pcshih

https://github.com/KaiyangZhou/pytorch-vsumm-reinforce/blob/fdd03be93f090278424af789c120531e49aefa40/main.py#L164

I found that evaluation on TVSum uses the average (avg) F-score over users, but SumMe uses the maximum (max). After I changed SumMe to max, my result got better.

But I do not know why this method is used...

[Image: FCSN_1D_summe_eval_max]

pcshih avatar Aug 07 '19 07:08 pcshih

Could you share the TVSum videos on your Google Drive? TVSum needs authorization...

pcshih avatar Aug 07 '19 08:08 pcshih

> https://github.com/KaiyangZhou/pytorch-vsumm-reinforce/blob/fdd03be93f090278424af789c120531e49aefa40/main.py#L164
>
> I found that evaluation on TVSum uses the average (avg) F-score over users, but SumMe uses the maximum (max). After I changed SumMe to max, my result got better.
>
> But I do not know why this method is used...
>
> [Image: FCSN_1D_summe_eval_max]

Is this result on SumMe? It seems close to the one in the paper!

weirme avatar Aug 07 '19 09:08 weirme

> Could you share the TVSum videos on your Google Drive? TVSum needs authorization...

Wait a moment, I'm now uploading it...

weirme avatar Aug 07 '19 09:08 weirme

> > https://github.com/KaiyangZhou/pytorch-vsumm-reinforce/blob/fdd03be93f090278424af789c120531e49aefa40/main.py#L164
> >
> > I found that evaluation on TVSum uses the average (avg) F-score over users, but SumMe uses the maximum (max). After I changed SumMe to max, my result got better. But I do not know why this method is used...
>
> Is this result on SumMe? It seems close to the one in the paper!

Yes, it is SumMe.

pcshih avatar Aug 07 '19 09:08 pcshih

> > Could you share the TVSum videos on your Google Drive? TVSum needs authorization...
>
> Wait a moment, I'm now uploading it...

Thank you

pcshih avatar Aug 07 '19 09:08 pcshih

Here is the link.

weirme avatar Aug 07 '19 10:08 weirme

Got it. Thank you very much. Did you figure out this part? https://github.com/KaiyangZhou/pytorch-vsumm-reinforce/blob/fdd03be93f090278424af789c120531e49aefa40/main.py#L164

pcshih avatar Aug 07 '19 11:08 pcshih

Maybe it is a default setting in evaluation? I also think it's strange... I also noticed that the key frames selected for SumMe videos differ greatly from user to user: the F-score between the generated oracle summary and the user summaries is only about 50% on SumMe, but about 70% on TVSum. In that case, getting a summary close to every user seems difficult. Could this be a reason to use max?
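To make the avg/max distinction concrete, here is a minimal sketch under stated assumptions. `aggregate_f_score` is a hypothetical helper written for this thread, not the function at the linked line, and the toy annotations are made up to show what happens when users disagree.

```python
def f_score(pred, target):
    """F-score between two binary frame-selection lists of equal length."""
    overlap = sum(p * t for p, t in zip(pred, target))
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred)
    recall = overlap / sum(target)
    return 2 * precision * recall / (precision + recall)


def aggregate_f_score(machine_summary, user_summaries, eval_metric="avg"):
    """Score one machine summary against every user, then aggregate.

    eval_metric="avg" takes the mean over users; "max" keeps the best match.
    """
    scores = [f_score(machine_summary, u) for u in user_summaries]
    if eval_metric == "avg":
        return sum(scores) / len(scores)
    if eval_metric == "max":
        return max(scores)
    raise ValueError(f"unknown eval_metric: {eval_metric!r}")


# Toy case (made up): three users who picked completely different frames.
users = [
    [1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 1, 1],
]
machine = [1, 1, 0, 0, 0, 0]  # matches the first user exactly

print(aggregate_f_score(machine, users, "avg"))  # one third: penalized for the other two users
print(aggregate_f_score(machine, users, "max"))  # 1.0: credited for the best-matching user
```

When annotators disagree this strongly, no single summary can score well on average, which is consistent with the guess above about why max might be preferred for SumMe.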

weirme avatar Aug 07 '19 12:08 weirme

I agree with your opinion. Let's take this evaluation method for granted. I am also implementing this paper, whose architecture is based on FCSN, but there are some problems...

pcshih avatar Aug 07 '19 13:08 pcshih

I have not read this paper yet; its architecture looks complicated.

weirme avatar Aug 07 '19 13:08 weirme

Do you have any idea about the unsupervised version of FCSN?

pcshih avatar Aug 08 '19 06:08 pcshih

No... I skipped that part when reading the paper...

weirme avatar Aug 08 '19 10:08 weirme

Shall we implement that part?

pcshih avatar Aug 08 '19 10:08 pcshih

I will try to implement it after reading that part, but there may be some problems because my computer at home doesn't have an NVIDIA GPU :sweat_smile::sweat_smile:

weirme avatar Aug 08 '19 13:08 weirme

I am counting on you.

pcshih avatar Aug 08 '19 13:08 pcshih

> Here is the link.

Thanks for this.

Pager07 avatar Oct 19 '20 14:10 Pager07