EMNLP'19: "Structuring Latent Spaces for Stylized Response Generation"

StyleFusion

Code and data for the EMNLP'19 paper Structuring Latent Spaces for Stylized Response Generation.

Designed to build a stylized dialogue response generator, StyleFusion jointly learns from a conversational dataset and text in other formats (e.g., a non-parallel, non-conversational stylized text dataset). In our EMNLP 2019 paper, we demonstrated its use to generate responses in the style of Sherlock Holmes and arXiv. StyleFusion is a generalized version of our previous work, SpaceFusion.

More documents:

Dataset

In our paper, we trained the model using the following three datasets.

  • Reddit: the conversational dataset (base_conv), which can be generated using this script.
  • Sherlock Holmes: one of the style datasets (bias_nonc), available here.
  • arXiv: another style corpus (bias_nonc), which can be obtained by following the instructions here.
  • A toy dataset is provided as an example that follows the format described above.
  • We also provide the test data here.

See here for more details and instructions.

Usage

  • to train a model: python src/main.py train
  • to interact with a trained model: python src/main.py cmd --restore=[path_to_model_file]
  • to use the provided style classifiers:
    • interactive demo: python src/classifier.py [fld_clf], where [fld_clf] is the folder where the classifier model exists, e.g., classifier/Reddit_vs_arXiv/neural
    • evaluate a tsv file: python src/classifier.py [fld_clf] [path_to_be_evaluated]
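
If you want to script these steps, the commands listed above can be assembled programmatically. The following is only a minimal sketch and is not part of the repo: the helper names (train_cmd, interact_cmd, classifier_cmd) are hypothetical, and path/to/model_file and path/to/responses.tsv are placeholder paths you would replace with your own. It builds and prints the command lines without running them.

```python
import shlex

# Script locations as given in the usage list above.
MAIN = "src/main.py"
CLASSIFIER = "src/classifier.py"


def train_cmd():
    """Command to train a model."""
    return ["python", MAIN, "train"]


def interact_cmd(model_path):
    """Command to interact with a trained model restored from model_path."""
    return ["python", MAIN, "cmd", f"--restore={model_path}"]


def classifier_cmd(fld_clf, tsv_path=None):
    """Command for a style classifier stored in fld_clf.

    Without tsv_path this is the interactive demo; with tsv_path it
    evaluates that tsv file.
    """
    cmd = ["python", CLASSIFIER, fld_clf]
    if tsv_path is not None:
        cmd.append(tsv_path)
    return cmd


if __name__ == "__main__":
    # Placeholder paths (assumptions); only the classifier folder is
    # taken from the example in the usage list.
    commands = [
        train_cmd(),
        interact_cmd("path/to/model_file"),
        classifier_cmd("classifier/Reddit_vs_arXiv/neural"),
        classifier_cmd("classifier/Reddit_vs_arXiv/neural", "path/to/responses.tsv"),
    ]
    for cmd in commands:
        # shlex.join quotes arguments safely for display (Python 3.8+).
        print(shlex.join(cmd))
```

Each helper returns an argument list, so it can be passed directly to subprocess.run from the repo root if you prefer to launch the command from Python.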

Citation

Please cite our EMNLP paper if this repo is useful to your work :)

@article{gao2019stylefusion,
  title={Structuring Latent Spaces for Stylized Response Generation},
  author={Gao, Xiang and Zhang, Yizhe and Lee, Sungjin and Galley, Michel and Brockett, Chris and Gao, Jianfeng and Dolan, Bill},
  journal={EMNLP 2019},
  year={2019}
}