Repo for the question-in-context rewriting baseline presented in Elgohary et al., "Can You Unpack That? Learning to Rewrite Questions-in-Context" (EMNLP 2019).
CANARD Rewriting Models
This repo maintains scripts for training models on the question-in-context rewriting task introduced in:
Ahmed Elgohary, Denis Peskov, and Jordan Boyd-Graber. 2019. Can You Unpack That? Learning to Rewrite Questions-in-Context. In Empirical Methods in Natural Language Processing (EMNLP).
The CANARD dataset can be downloaded from the dataset page.
Pointer-generator sequence-to-sequence model
To run the model:
- Install spaCy.
- Clone and install OpenNMT-py.
- Download the GloVe 840B.300d embeddings.
- Run `./preprocess.sh` to generate the sequence-to-sequence format of the dataset.
- Run `./ONMT_Pipeline_GloVE.sh` to train and evaluate the model.
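For reference, the sequence-to-sequence format pairs each flattened dialogue context with its standalone rewrite. Below is a minimal sketch of that flattening, assuming the `History`/`Question`/`Rewrite` field names of the CANARD JSON release and an illustrative `|||` separator token; the actual `preprocess.sh` may tokenize and delimit differently.

```python
# Sketch of the seq2seq source/target format for one CANARD example.
# Field names follow the CANARD JSON release; the "|||" separator is
# an illustrative assumption, not necessarily what preprocess.sh uses.
SEP = " ||| "

def to_seq2seq(example):
    """Flatten one CANARD example into a (source, target) line pair.

    The source concatenates the dialogue history with the in-context
    question; the target is the context-independent rewrite.
    """
    src = SEP.join(example["History"] + [example["Question"]])
    tgt = example["Rewrite"]
    return src, tgt

example = {
    "History": ["Frank Zappa", "Disbandment"],
    "Question": "What group disbanded?",
    "Rewrite": "What group of Frank Zappa disbanded?",
}
src, tgt = to_seq2seq(example)
```

Each `(src, tgt)` pair is then written to parallel source/target files in the layout OpenNMT-py expects.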
A trained model can be downloaded using this link. The model achieves a 51.54 BLEU score on the dev set and 50.00 on the test set.
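The BLEU numbers above are corpus-level scores. As a rough illustration of how such a score is computed (real evaluations should use a standard implementation such as sacrebleu), here is a minimal pure-Python corpus BLEU with uniform 4-gram weights, a brevity penalty, and no smoothing:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU (0-100), uniform weights, no smoothing."""
    matches = [0] * max_n   # clipped n-gram matches, per order
    totals = [0] * max_n    # hypothesis n-gram counts, per order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_ng, r_ng = ngrams(h, n), ngrams(r, n)
            matches[n - 1] += sum((h_ng & r_ng).values())  # clipped counts
            totals[n - 1] += sum(h_ng.values())
    if min(totals) == 0 or min(matches) == 0:
        return 0.0  # unsmoothed BLEU is zero if any order has no match
    log_prec = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    bp = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100.0 * bp * math.exp(log_prec)
```

A perfect match scores 100.0; extra or missing n-grams lower the precision terms, and hypotheses shorter than the references are further penalized by the brevity penalty.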