
Refactoring-Summarization

Code for our paper: "RefSum: Refactoring Neural Summarization", NAACL 2021.

We present a model, Refactor, which can be used either as a base system or a meta system for text summarization.

Outline

1. How to Install

Requirements

  • python3
  • conda create --name env --file spec-file.txt
  • pip3 install -r requirements.txt

Description of Codes

  • main.py -> training and evaluation procedure
  • model.py -> Refactor model
  • data_utils.py -> dataloader
  • utils.py -> utility functions
  • demo.py -> off-the-shelf refactoring

2. How to Run

Hyper-parameter Setting

You may specify the hyper-parameters in main.py.

Train

python main.py --cuda --gpuid [list of gpuid] -l

Fine-tune

python main.py --cuda --gpuid [list of gpuid] -l --model_pt [model path]

Evaluate

python main.py --cuda --gpuid [single gpu] -e --model_pt [model path] --model_name [model name]

3. Off-the-shelf Refactoring

You may use our model with your own data by running

python demo.py DATA_PATH MODEL_PATH RESULT_PATH

DATA_PATH is the path to your data, which should be a file in which each line is a JSON object of the form: {"article": str, "summary": str, "candidates": [str]}.

MODEL_PATH is the path to the trained Refactor model checkpoint.

RESULT_PATH is the path of the output file, in which each line is a selected candidate summary.
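The input file can be produced with the standard library alone. The sketch below writes one record per line in the format described above; the file name and example texts are illustrative, not part of the repository.

```python
import json
import tempfile
from pathlib import Path

# One JSON object per line, with "article", "summary", and "candidates" keys.
# The texts here are toy examples, not data from the paper.
examples = [
    {
        "article": "The quick brown fox jumps over the lazy dog. It then runs away.",
        "summary": "A fox jumps over a dog.",
        "candidates": [
            "A quick fox jumps over a lazy dog.",
            "The fox runs away after jumping.",
        ],
    }
]

# Hypothetical DATA_PATH; pass your own path to demo.py.
data_path = Path(tempfile.gettempdir()) / "refactor_input.jsonl"
with data_path.open("w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# Sanity-check: every line round-trips to a dict with the required keys.
with data_path.open() as f:
    for line in f:
        record = json.loads(line)
        assert {"article", "summary", "candidates"} <= record.keys()
```

The same json.loads loop can be used to read RESULT_PATH back, since the output is also one entry per line.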

4. Data

We use four datasets for our experiments.

  • CNN/DailyMail -> https://github.com/abisee/cnn-dailymail
  • XSum -> https://github.com/EdinburghNLP/XSum
  • PubMed -> https://github.com/armancohan/long-summarization
  • WikiHow -> https://github.com/mahnazkoupaee/WikiHow-Dataset

You can find the processed data for all of our experiments here. After downloading, you should put the data in the ./data directory.

Dataset    Experiment                                 Link
CNNDM      Pre-train                                  Download
CNNDM      BART Reranking                             Download
CNNDM      GSum Reranking                             Download
CNNDM      Two-system Combination (System-level)      Download
CNNDM      Two-system Combination (Sentence-level)    Download
CNNDM      Three-system Combination (System-level)    Download
XSum       Pre-train                                  Download
XSum       PEGASUS Reranking                          Download
PubMed     Pre-train                                  Download
PubMed     BART Reranking                             Download
WikiHow    Pre-train                                  Download
WikiHow    BART Reranking                             Download

5. Results

CNNDM

Reranking BART

System      ROUGE-1    ROUGE-2    ROUGE-L
BART        44.26      21.12      41.16
Refactor    45.15      21.70      42.00

Reranking GSum

System      ROUGE-1    ROUGE-2    ROUGE-L
GSum        45.93      22.30      42.68
Refactor    46.18      22.36      42.91

System-Combination (BART and pre-trained Refactor)

System                        ROUGE-1    ROUGE-2    ROUGE-L
BART                          44.26      21.12      41.16
pre-trained Refactor          44.13      20.51      40.29
Summary-Level Combination     45.04      21.61      41.72
Sentence-Level Combination    44.93      21.48      41.42

System-Combination (BART, pre-trained Refactor and GSum)

System                       ROUGE-1    ROUGE-2    ROUGE-L
BART                         44.26      21.12      41.16
pre-trained Refactor         44.13      20.51      40.29
GSum                         45.93      22.30      42.68
Summary-Level Combination    46.12      22.46      42.92

XSum

Reranking PEGASUS

System      ROUGE-1    ROUGE-2    ROUGE-L
PEGASUS     47.12      24.46      39.04
Refactor    47.45      24.55      39.41

PubMed

Reranking BART

System      ROUGE-1    ROUGE-2    ROUGE-L
BART        43.42      15.32      39.21
Refactor    43.72      15.41      39.51

WikiHow

Reranking BART

System      ROUGE-1    ROUGE-2    ROUGE-L
BART        41.98      18.09      40.53
Refactor    42.12      18.13      40.66