clarification_question_generation_pytorch

Code and data for the paper: Answer-based Adversarial Training for Generating Clarification Questions

Repository information

This repository contains data and code for the paper below:

Answer-based Adversarial Training for Generating Clarification Questions
Sudha Rao and Hal Daumé III
Proceedings of NAACL-HLT 2019

Downloading data

  • Download the embeddings and save them into the repository folder
  • Download the data, unzip the two folders inside, and copy them into the repository folder

Training models on StackExchange dataset

  • To train an MLE model, run src/

  • To train a Max-Utility model, follow these three steps:

    • run src/

    • run src/

    • run src/

  • To train a GAN-Utility model, follow these three steps (note: you can skip the first two steps if you have already run them for the Max-Utility model):

    • run src/

    • run src/

    • run src/

Training models on Amazon (Home & Kitchen) dataset

  • To train an MLE model, run src/

  • To train a Max-Utility model, follow these three steps:

    • run src/

    • run src/

    • run src/

  • To train a GAN-Utility model, follow these three steps (note: you can skip the first two steps if you have already run them for the Max-Utility model):

    • run src/

    • run src/

    • run src/

Generating outputs using trained models

  • Run the following scripts to generate outputs for models trained on the StackExchange dataset:

    • For MLE model, run src/

    • For Max-Utility model, run src/

    • For GAN-Utility model, run src/

  • Run the following scripts to generate outputs for models trained on the Amazon dataset:

    • For MLE model, run src/

    • For Max-Utility model, run src/

    • For GAN-Utility model, run src/

Evaluating generated outputs

  • For the StackExchange dataset, human-annotated references were collected for only a subset of the test set. We therefore first create a version of the predictions file restricted to the instances that have references by running the following: src/evaluation/

  • For the Amazon dataset, we have references for all instances in the test set.

  • We remove <UNK> tokens from the generated outputs by deleting them from the predictions file.

  • For BLEU score, run src/evaluation/

  • For METEOR score, run src/evaluation/

  • For Diversity score, run src/evaluation/ <predictions_file>
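
The <UNK> removal and the diversity measurement described above can be sketched in a few lines of Python. This is only an illustration, not the repository's evaluation code: it assumes the predictions file holds one whitespace-tokenized question per line, and it uses a distinct-n ratio (unique n-grams divided by total n-grams), which is one common way to measure diversity and may differ from what the script under src/evaluation/ computes. The names strip_unk, distinct_n, and clean_predictions are hypothetical.

```python
from collections import Counter
from typing import Iterable, List

UNK = "<UNK>"  # token spelling taken from the README; real files may differ


def strip_unk(line: str, unk: str = UNK) -> str:
    """Drop every unknown-word token from one whitespace-tokenized line."""
    return " ".join(tok for tok in line.split() if tok != unk)


def distinct_n(lines: Iterable[str], n: int) -> float:
    """Unique n-grams divided by total n-grams over all lines (0.0 if none).

    Assumption: a generic distinct-n measure, not necessarily the
    repository's definition of the Diversity score.
    """
    counts: Counter = Counter()
    for line in lines:
        toks = line.split()
        counts.update(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))
    total = sum(counts.values())
    return len(counts) / total if total else 0.0


def clean_predictions(in_path: str, out_path: str) -> List[str]:
    """Read a predictions file, strip <UNK> tokens, write the cleaned copy."""
    with open(in_path, encoding="utf-8") as f:
        cleaned = [strip_unk(line) for line in f]
    with open(out_path, "w", encoding="utf-8") as f:
        for line in cleaned:
            f.write(line + "\n")
    return cleaned
```

A typical use under these assumptions would be to call clean_predictions on the generated outputs first, then pass the returned lines to distinct_n with n set to 1, 2, or 3.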