# Unsupervised Paraphrase Generation
Unsupervised Paraphrase Generation using Pre-trained Language Model (paper)
We use Hugging Face Transformers 3.0.2, PyTorch 1.6.0, and Python 3.8, and support only the GPU version.
Please check `environment.yml` for details.
## How to run
- Environment setup

  ```shell
  conda env create -f environment.yml
  conda activate huggingface
  ```
- Download dataset

  Please check `data/README.md` for how to download and preprocess the dataset.
- Training (fine-tune GPT-2)

  ```shell
  bash run_train.sh {GPU_ID} {TAG}
  ```

  For example:

  ```shell
  bash run_train.sh "0,1,2,3" "training_with_4_GPUs"
  ```
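As a rough sketch of what the `{GPU_ID}` argument likely does (the variable names here are assumptions for illustration, not taken from `run_train.sh`): a comma-separated device list is typically exported as `CUDA_VISIBLE_DEVICES`, and the GPU count can be derived from it.

```shell
# Hypothetical handling of a comma-separated GPU list;
# the actual run_train.sh may differ.
GPU_ID="0,1,2,3"                       # first positional argument
export CUDA_VISIBLE_DEVICES="$GPU_ID"  # restrict which devices are visible
# count devices: number of commas plus one
NUM_GPUS=$(( $(printf '%s' "$GPU_ID" | tr -cd ',' | wc -c) + 1 ))
echo "Using $NUM_GPUS GPUs: $GPU_ID"
```

With the example arguments above, this selects four devices for data-parallel training.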
- Evaluation

  ```shell
  bash run_eval.sh {GPU_ID} {TAG} {CHECKPOINT_DIR}
  ```

  For example:

  ```shell
  bash run_eval.sh 0 "training_with_4_GPUs" "checkpoints/gpt2-medium_training_with_4_GPUs_2020-08-28_12:34:56/checkpoint-3000/"
  ```
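The checkpoint path in the example appears to follow a `{model}_{TAG}_{timestamp}` naming pattern. A sketch of how such a path could be assembled (the variable names are illustrative assumptions, not taken from the training script):

```shell
# Illustrative reconstruction of the checkpoint directory layout;
# the actual scripts may format it differently.
MODEL="gpt2-medium"
TAG="training_with_4_GPUs"
RUN_STAMP=$(date "+%Y-%m-%d_%H:%M:%S")  # e.g. 2020-08-28_12:34:56
STEP=3000
CKPT_DIR="checkpoints/${MODEL}_${TAG}_${RUN_STAMP}/checkpoint-${STEP}/"
echo "$CKPT_DIR"
```

This makes it easy to locate the right checkpoint for evaluation when several runs share the same tag.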
## Notice
- I have not reproduced the paper's results yet. Please feel free to contribute to the reproduction by creating an issue or a pull request.
- Only the experiment on the QQP dataset is implemented at the moment. Experiments on SST-2 will be added later.