nlg-dataset topic
RNNLG
RNNLG is an open-source benchmark toolkit for Natural Language Generation (NLG) in spoken dialogue system application domains. It is released by Tsung-Hsien (Shawn) Wen from Cambridge Dialogue Systems...
CommonGen
A Constrained Text Generation Challenge Towards Generative Commonsense Reasoning
RE-NLG-Dataset
T-REx: A Large Scale Alignment of Natural Language with Knowledge Base Triples
ChessCommentaryGeneration
Harsh Jhamtani*, Varun Gangal*, Eduard Hovy, Graham Neubig, Taylor Berg-Kirkpatrick. Learning to Generate Move-by-Move Commentary for Chess Games from Large-Scale Social Forum Data. ACL 2018
spot-the-diff
EMNLP 2018. Learning to Describe Differences Between Pairs of Similar Images. Harsh Jhamtani, Taylor Berg-Kirkpatrick.
OrangeSum
The French summarization dataset introduced in "BARThez: a Skilled Pretrained French Sequence-to-Sequence Model".
Datasets
Datasets with text data for use in NLP, text analysis, information extraction, and ML research.
recipe-personalization
EMNLP 2019: Generating Personalized Recipes from Historical User Preferences
bold
Dataset associated with the paper "BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation".