
An implementation of the Globally Normalized Reader

Globally Normalized Reader

This repository contains the code used in the following paper:

Jonathan Raiman and John Miller. Globally Normalized Reader. Empirical Methods in Natural Language Processing (EMNLP), 2017.

If you use the dataset/code in your research, please cite the above paper:

    author={Raiman, Jonathan and Miller, John},
    booktitle={Empirical Methods in Natural Language Processing (EMNLP)},
    title={Globally Normalized Reader},

Note: This repository is a reimplementation of the original used for the above paper. The original used a batch size of 32 and synchronous-SGD across multiple GPUs. However, this code currently only runs on a single GPU and will split a batch that runs out of memory into several smaller batches. For this reason, the code does not exactly reproduce the results in that paper (but it should be <2% off). Work is underway to rectify this issue.
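The batch-splitting behaviour mentioned in the note can be pictured roughly as follows. This is a minimal sketch, not the repository's code: `run_with_splitting`, `step_fn`, and `weighted_mean` are hypothetical names, and Python's built-in `MemoryError` stands in for the out-of-memory signal (with TensorFlow one would more likely catch `tf.errors.ResourceExhaustedError`).

```python
def run_with_splitting(step_fn, batch, oom_error=MemoryError):
    """Run step_fn on batch; on an out-of-memory error, halve the batch and retry.

    Returns a list of (result, num_examples) pairs, one per sub-batch that ran.
    """
    try:
        return [(step_fn(batch), len(batch))]
    except oom_error:
        if len(batch) <= 1:
            raise  # a single example does not fit; nothing left to split
        mid = len(batch) // 2
        return (run_with_splitting(step_fn, batch[:mid], oom_error)
                + run_with_splitting(step_fn, batch[mid:], oom_error))


def weighted_mean(results):
    """Combine per-sub-batch mean values into the mean over the full batch."""
    total = sum(n for _, n in results)
    return sum(value * n for value, n in results) / total
```

Weighting each sub-batch by its size keeps the combined value equal to the mean over the whole batch, even when the recursion produces sub-batches of unequal length.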

Usage (TensorFlow)


The following libraries must be installed and available:

  • CUDA 8.0.61 or higher, with appropriate drivers installed for the available GPUs.
  • CuDNN v6.0 or higher.

Make sure you know where the aforementioned libraries are located on your system; you will need to adjust the paths you use to point to them.
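One way to find out where these libraries live is to search the usual library directories for their shared-object files. The sketch below is only illustrative: `find_library_files` is a hypothetical helper, the `lib<name>.so*` naming is the Linux convention, and the two fallback directories are common install locations rather than anything this repository prescribes.

```python
import glob
import os


def find_library_files(lib_name, search_dirs):
    """Return paths in search_dirs matching lib<lib_name>.so* (Linux naming)."""
    matches = []
    for directory in search_dirs:
        pattern = os.path.join(directory, "lib{}.so*".format(lib_name))
        matches.extend(sorted(glob.glob(pattern)))
    return matches


if __name__ == "__main__":
    # LD_LIBRARY_PATH is consulted by the dynamic linker on Linux; the extra
    # directories below are assumed defaults and may not exist on your system.
    dirs = [d for d in os.environ.get("LD_LIBRARY_PATH", "").split(os.pathsep) if d]
    dirs += ["/usr/local/cuda/lib64", "/usr/lib/x86_64-linux-gnu"]
    for name in ("cudart", "cudnn"):
        found = find_library_files(name, dirs)
        print(name, "->", found if found else "not found")
```

The directories it reports are the ones to use when adjusting the paths in the setup steps.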


  1. Set up your environment variables

    # Copy this into ~/.zshrc or ~/.bashrc for regular use.

    If you are not running on the SVAIL cluster, you will need to change these variables.

  2. Create your virtual environment:

    python3.6 -m venv env

    Python 3.6 must be on your command-line PATH; the environment variables from step 1 set this up automatically.

  3. Activate your virtual environment:

    # You will need to do this every time you use the GNR
    source env/bin/activate
  4. Install numpy separately from the other packages:

    pip install numpy
  5. Install all dependencies from requirements.txt:

    pip install -r requirements.txt
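After these steps, a quick sanity check can confirm that the interpreter, the virtual environment, and numpy are in the expected state. The helper names below are hypothetical and the script is not part of the repository; it is a small sketch that relies only on the standard library.

```python
import sys


def in_virtualenv():
    """True when the interpreter is running inside a venv-created environment."""
    # venv points sys.prefix at the environment and keeps the original
    # installation in sys.base_prefix.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def python_is_at_least(major, minor):
    """True when the running interpreter is at least version major.minor."""
    return sys.version_info[:2] >= (major, minor)


def numpy_version():
    """Return the installed numpy version string, or None if numpy is missing."""
    try:
        import numpy
    except ImportError:
        return None
    return numpy.__version__


if __name__ == "__main__":
    print("Python >= 3.6:", python_is_at_least(3, 6))
    print("inside virtual environment:", in_virtualenv())
    print("numpy:", numpy_version() or "not installed")
```

Run it with the environment activated; if the second line prints `False`, the `source env/bin/activate` step was probably skipped in the current shell.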


Before training the Globally Normalized Reader, you need to download and featurize the dataset.

  1. Download all the necessary data:

    cd data && ./ && cd ..
    wget -O $GLOVE_PATH
  2. Featurize all of the data:

    python --datadir data --outdir featurized  --glove-path $GLOVE_PATH
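The internals of the featurization script are not shown here, but it may help to see what the file behind `$GLOVE_PATH` contains. Assuming the standard GloVe plain-text format (one token per line, followed by its space-separated vector components), a reader could look like the sketch below. `load_glove` is a hypothetical helper, not the repository's function, and it assumes tokens themselves contain no spaces.

```python
import io


def load_glove(lines, vocab=None):
    """Parse GloVe text-format lines into a {word: [float, ...]} dict.

    lines: any iterable of strings, e.g. an open file.
    vocab: optional set of words to keep; others are skipped to save memory.
    """
    vectors = {}
    for line in lines:
        parts = line.rstrip("\n").split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word, values = parts[0], parts[1:]
        if vocab is not None and word not in vocab:
            continue
        vectors[word] = [float(v) for v in values]
    return vectors


# Tiny in-memory stand-in for a real GloVe file.
sample = io.StringIO("the 0.1 -0.2 0.3\ncat 0.4 0.5 -0.6\n")
embeddings = load_glove(sample, vocab={"cat"})
print(embeddings)
```

For a real file, pass an open file handle instead of the `StringIO` object. Restricting the load to the dataset's vocabulary keeps memory use far below that of the full embedding table.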


Training

  1. Create a new model:

    python create --name default --vocab-path featurized/
  2. Train the model:

    python train --name default --data featurized/


Evaluation

  1. Evaluate the model:

    python predict --name default --data data/dev.json --vocab-path featurized/ --output predictions.txt

Usage (PaddlePaddle)

  1. Install the latest GPU-compatible PaddlePaddle Docker image, as directed on the PaddlePaddle website.
  2. To print the model configuration as text, use
  3. To train the model, use
  4. To run inference with the model, use