About News Dataset

Open JiamingUWU opened this issue 4 years ago • 2 comments

In your paper, the EMNLP2017 WMT News dataset contains 268,586 sentences, but there are many datasets listed at http://www.statmt.org/wmt17/ and I can't tell which one was used in the experiments. I'd appreciate it if you could provide some details.

JiamingUWU, Nov 30 '20 06:11

We use the news data from https://github.com/pclucas14/GansFallingShort/tree/master/real_data_experiments/data/news

guoyinwang, Dec 03 '20 07:12

Sorry to disturb you. How is the dataset divided into training, dev, and test sets?

JiamingUWU, Dec 19 '20 05:12