
Generated Summary

Open Hannibal046 opened this issue 3 years ago • 6 comments

Hello, thanks for your great work! I was wondering whether you could share the generated summaries used for the ROUGE calculation, or the model checkpoint? Thanks so much!

Hannibal046 avatar Dec 14 '21 05:12 Hannibal046

The following are our generated summaries for each dataset. Each zip file contains five sets of summaries generated by our model with different random seeds.
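If it helps anyone scoring these outputs, here is a rough sketch of averaging ROUGE over the five seed folders. It uses the `rouge-score` package rather than pyrouge, and the folder layout and file names are only assumptions, not the repository's actual structure:

```python
# Hypothetical sketch: score one seed's outputs against the references,
# then average ROUGE-1 F1 over the five seeds.
import os
from rouge_score import rouge_scorer  # pip install rouge-score

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)

def score_seed(pred_dir, ref_dir):
    """Average ROUGE F1 over (prediction, reference) file pairs sharing the same name."""
    totals = {"rouge1": 0.0, "rouge2": 0.0, "rougeL": 0.0}
    names = sorted(os.listdir(pred_dir))
    for name in names:
        with open(os.path.join(pred_dir, name)) as f:
            pred = f.read()
        with open(os.path.join(ref_dir, name)) as f:
            ref = f.read()
        scores = scorer.score(ref, pred)  # score(target, prediction)
        for key in totals:
            totals[key] += scores[key].fmeasure
    return {key: value / len(names) for key, value in totals.items()}

seed_dirs = ["seed_1", "seed_2", "seed_3", "seed_4", "seed_5"]  # assumed names
per_seed = [score_seed(d, "references") for d in seed_dirs]
mean_rouge1 = sum(s["rouge1"] for s in per_seed) / len(per_seed)
print(f"Mean ROUGE-1 F1 over {len(per_seed)} seeds: {mean_rouge1:.4f}")
```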

kenchan0226 avatar Dec 14 '21 06:12 kenchan0226

Hi, thanks so much for the fast response! After looking through home_min_4_multi_view_multi_task_basic_DecStateIn_2enc_residual_2multi_hop_g0.8_c0.1_i0.1_gdp0.0_check1k_joint_stop_h512_seed_520.20191004-111942, I have a question about the generated summaries: 110 of them contain more than one line. Is the \n symbol generated automatically by the model itself?
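For reference, a minimal sketch of the check above, assuming one summary per .txt file in the unzipped folder (the actual layout of the release may differ):

```python
# Count generated summaries that contain more than one line.
import os

out_dir = "home_min_4_multi_view_multi_task_basic_DecStateIn_2enc_residual_2multi_hop_g0.8_c0.1_i0.1_gdp0.0_check1k_joint_stop_h512_seed_520.20191004-111942"
multi_line = 0
for name in sorted(os.listdir(out_dir)):
    with open(os.path.join(out_dir, name)) as f:
        text = f.read().strip()
    if "\n" in text:  # summary spans more than one line
        multi_line += 1
print(f"{multi_line} summaries contain more than one line")
```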

Hannibal046 avatar Dec 14 '21 07:12 Hannibal046

Yes, the \n is automatically generated by the model itself.

kenchan0226 avatar Dec 14 '21 08:12 kenchan0226

So you include \n in your vocabulary and create the ground truth with '\n'.join(summary_ls) during training? (Since the summary format in the dataset is a list of strings.)

Hannibal046 avatar Dec 14 '21 09:12 Hannibal046

A reference summary in the datasets is simply stored as a string, and some of the reference summaries contain the \n token; we did not manually add any \n tokens to the reference summaries.
The \n token is in our vocabulary because we take the 50k most frequent words in the training set as our vocabulary. During inference, we split the predicted summary (and the ground-truth summary) into a list of sentences, since the pyrouge script requires that format.
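A minimal sketch of those two steps, under assumed input formats (not the exact code from the repository):

```python
# (1) Vocabulary = the 50k most frequent tokens in the training set, so "\n"
#     enters it whenever it is frequent enough; nothing is added by hand.
# (2) Before running pyrouge, each summary is split into a list of sentences
#     (here simply on "\n").
from collections import Counter

def build_vocab(training_texts, size=50000):
    """training_texts: iterable of pre-tokenized strings where tokens
    (including a standalone "\n") are separated by single spaces (assumed format)."""
    counter = Counter()
    for text in training_texts:
        counter.update(text.split(" "))
    return [token for token, _ in counter.most_common(size)]

def to_sentence_list(summary):
    """Split a predicted or reference summary into sentences for pyrouge."""
    return [line.strip() for line in summary.split("\n") if line.strip()]
```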

kenchan0226 avatar Dec 14 '21 09:12 kenchan0226

Ok, thanks so much for your detailed explanation!

Hannibal046 avatar Dec 14 '21 12:12 Hannibal046