# Refactoring-Summarization
Code for our paper: "RefSum: Refactoring Neural Summarization", NAACL 2021.

We present a model, Refactor, which can be used either as a base system or a meta system for text summarization.
## Outline

1. How to Install
2. How to Run
3. Off-the-shelf Refactoring
4. Data
5. Results
## 1. How to Install

### Requirements
- python3
- `conda create --name env --file spec-file.txt`
- `pip3 install -r requirements.txt`
### Description of Codes
- `main.py` -> training and evaluation procedure
- `model.py` -> Refactor model
- `data_utils.py` -> dataloader
- `utils.py` -> utility functions
- `demo.py` -> off-the-shelf refactoring
## 2. How to Run

### Hyper-parameter Setting
You may specify the hyper-parameters in `main.py`.

### Train
```
python main.py --cuda --gpuid [list of gpuid] -l
```

### Fine-tune
```
python main.py --cuda --gpuid [list of gpuid] -l --model_pt [model path]
```

### Evaluate
```
python main.py --cuda --gpuid [single gpu] -e --model_pt [model path] --model_name [model name]
```
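For example, a concrete run might look like the following; the GPU ids, checkpoint path, and model name are placeholders, and we assume the gpuid list is passed space-separated:

```
# train from scratch on four GPUs (hypothetical ids)
python main.py --cuda --gpuid 0 1 2 3 -l

# evaluate a saved checkpoint on a single GPU (hypothetical path and name)
python main.py --cuda --gpuid 0 -e --model_pt cache/model.pt --model_name Refactor
```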
## 3. Off-the-shelf Refactoring

You may use our model with your own data by running

```
python demo.py DATA_PATH MODEL_PATH RESULT_PATH
```

`DATA_PATH` is the path of your data, which should be a file in which each line is a JSON object of the form `{"article": str, "summary": str, "candidates": [str]}`.

`RESULT_PATH` is the path of the output file, in which each line is a candidate summary.
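As a reference, here is a minimal sketch of how an input file in this format could be prepared from Python; the file name and example strings are hypothetical, not part of the released code:

```python
import json

# Each line of DATA_PATH must be a standalone JSON object with an
# "article", a "summary", and a list of "candidates" to refactor.
examples = [
    {
        "article": "Full source document text ...",
        "summary": "Reference summary text ...",
        "candidates": [
            "Candidate summary from system A ...",
            "Candidate summary from system B ...",
        ],
    },
]

# Write one JSON object per line (JSON Lines).
with open("demo_input.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```

After `demo.py` finishes, `RESULT_PATH` can be read back line by line, with the i-th line holding the summary selected for the i-th input example.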
## 4. Data

We use four datasets for our experiments:

- CNN/DailyMail -> https://github.com/abisee/cnn-dailymail
- XSum -> https://github.com/EdinburghNLP/XSum
- PubMed -> https://github.com/armancohan/long-summarization
- WikiHow -> https://github.com/mahnazkoupaee/WikiHow-Dataset

You can find the processed data for all of our experiments here. After downloading, you should put the data in the `./data` directory.
| Dataset | Experiment | Link |
|---|---|---|
| CNNDM | Pre-train | Download |
| CNNDM | BART Reranking | Download |
| CNNDM | GSum Reranking | Download |
| CNNDM | Two-system Combination (System-level) | Download |
| CNNDM | Two-system Combination (Sentence-level) | Download |
| CNNDM | Three-system Combination (System-level) | Download |
| XSum | Pre-train | Download |
| XSum | PEGASUS Reranking | Download |
| PubMed | Pre-train | Download |
| PubMed | BART Reranking | Download |
| WikiHow | Pre-train | Download |
| WikiHow | BART Reranking | Download |
## 5. Results

### CNNDM

#### Reranking BART

| System | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| BART | 44.26 | 21.12 | 41.16 |
| Refactor | 45.15 | 21.70 | 42.00 |

#### Reranking GSum

| System | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| GSum | 45.93 | 22.30 | 42.68 |
| Refactor | 46.18 | 22.36 | 42.91 |

#### System Combination (BART and pre-trained Refactor)

| System | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| BART | 44.26 | 21.12 | 41.16 |
| pre-trained Refactor | 44.13 | 20.51 | 40.29 |
| Summary-Level Combination | 45.04 | 21.61 | 41.72 |
| Sentence-Level Combination | 44.93 | 21.48 | 41.42 |

#### System Combination (BART, pre-trained Refactor, and GSum)

| System | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| BART | 44.26 | 21.12 | 41.16 |
| pre-trained Refactor | 44.13 | 20.51 | 40.29 |
| GSum | 45.93 | 22.30 | 42.68 |
| Summary-Level Combination | 46.12 | 22.46 | 42.92 |

### XSum

#### Reranking PEGASUS

| System | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| PEGASUS | 47.12 | 24.46 | 39.04 |
| Refactor | 47.45 | 24.55 | 39.41 |

### PubMed

#### Reranking BART

| System | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| BART | 43.42 | 15.32 | 39.21 |
| Refactor | 43.72 | 15.41 | 39.51 |

### WikiHow

#### Reranking BART

| System | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| BART | 41.98 | 18.09 | 40.53 |
| Refactor | 42.12 | 18.13 | 40.66 |