nb-clean

Clean Jupyter notebooks of outputs, metadata, and empty cells, with Git integration

nb-clean cleans Jupyter notebooks of cell execution counts, metadata, outputs, and (optionally) empty cells, preparing them for committing to version control. It provides a Git filter to automatically clean notebooks before they're staged, and can also be used with other version control systems, as a command line tool, and as a Python library. It can also check whether a notebook is clean, which is useful as a step in continuous integration pipelines.

Warning: nb-clean 2.0.0 introduced a new command line interface to make cleaning notebooks in place easier. If you upgrade from a previous release, you'll need to migrate to the new interface as described under Migrating to nb-clean 2.

Installation

To install the latest release from PyPI, use pip:

python3 -m pip install nb-clean

Alternatively, in Python projects using Poetry or Pipenv for dependency management, add nb-clean as a development dependency with poetry add --dev nb-clean or pipenv install --dev nb-clean. nb-clean requires Python 3.7 or later.

Usage

Cleaning

To add a filter to an existing Git repository to automatically clean notebooks when they're staged, run the following from the working tree:

nb-clean add-filter

This will configure a filter to remove cell execution counts, metadata, and outputs. To also remove empty cells, use:

nb-clean add-filter --remove-empty-cells

To preserve cell metadata, for example the metadata required by tools like papermill, use:

nb-clean add-filter --preserve-cell-metadata

nb-clean will configure a filter in the Git repository in which it is run, and won't mutate your global or system Git configuration. To remove the filter, run:

nb-clean remove-filter

Outside of a Git filter, you can also clean a Jupyter notebook directly with:

nb-clean clean notebook.ipynb

This cleans the notebook in place. You can also pass the notebook content on standard input, in which case the cleaned notebook is written to standard output:

nb-clean clean < original.ipynb > cleaned.ipynb

To also remove empty cells, add the -e/--remove-empty-cells flag. To preserve cell metadata, add the -m/--preserve-cell-metadata flag.
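
To make concrete what "cleaning" means here, the following is a minimal sketch of the transformation applied to a notebook's JSON, using only the Python standard library. It is not nb-clean's implementation: the function name strip_notebook and its parameters are hypothetical, it assumes the nbformat 4 layout, and it only touches the fields named above (execution counts, outputs, cell metadata, and empty cells).

```python
import copy
import json


def strip_notebook(notebook, remove_empty_cells=False, preserve_cell_metadata=False):
    """Return a cleaned copy of an nbformat 4 notebook dict (illustrative only)."""
    cleaned = copy.deepcopy(notebook)
    cells = cleaned.get("cells", [])
    if remove_empty_cells:
        # A cell counts as empty here if its source has no non-whitespace text.
        cells = [c for c in cells if "".join(c.get("source", "")).strip()]
    for cell in cells:
        if not preserve_cell_metadata:
            cell["metadata"] = {}
        if cell.get("cell_type") == "code":
            # Only code cells carry execution counts and outputs.
            cell["execution_count"] = None
            cell["outputs"] = []
    cleaned["cells"] = cells
    return cleaned


# Hypothetical two-cell notebook: one executed code cell and one empty cell.
example = {
    "cells": [
        {"cell_type": "code", "execution_count": 3, "metadata": {"tags": ["x"]},
         "outputs": [{"output_type": "stream", "name": "stdout", "text": ["1\n"]}],
         "source": ["print(1)"]},
        {"cell_type": "code", "execution_count": None, "metadata": {},
         "outputs": [], "source": []},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}

print(json.dumps(strip_notebook(example, remove_empty_cells=True), indent=1))
```

The two keyword arguments mirror the -e/--remove-empty-cells and -m/--preserve-cell-metadata flags; for real notebooks, use nb-clean itself rather than this sketch.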

Checking

You can check if a notebook is clean with:

nb-clean check notebook.ipynb

or by passing the notebook contents on standard input:

nb-clean check < notebook.ipynb

To also check for empty cells, add the -e/--remove-empty-cells flag. To ignore cell metadata, add the -m/--preserve-cell-metadata flag.

nb-clean will exit with status code 0 if the notebook is clean, and status code 1 if it is not. nb-clean will also print details of cell execution counts, metadata, outputs, and empty cells it finds.
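
Because the result is reported through the exit status, a continuous integration step can loop over notebooks and fail if any of them is not clean. The sketch below shows one way to wire this up; it is not part of nb-clean. It calls nb-clean check through subprocess, and the helper names find_unclean and main, as well as the run parameter (present only so a fake can be substituted in tests), are hypothetical.

```python
import subprocess
import sys


def find_unclean(paths, run=subprocess.run):
    """Return the notebooks for which `nb-clean check` exits non-zero."""
    unclean = []
    for path in paths:
        # Exit status 0 means clean, 1 means not clean (as described above).
        result = run(["nb-clean", "check", str(path)])
        if result.returncode != 0:
            unclean.append(path)
    return unclean


def main(argv, run=subprocess.run):
    """Report unclean notebooks on stderr and return a CI-friendly exit code."""
    dirty = find_unclean(argv, run=run)
    for path in dirty:
        print(f"not clean: {path}", file=sys.stderr)
    return 1 if dirty else 0
```

In a script, this would typically be invoked as sys.exit(main(sys.argv[1:])), so that the CI job fails whenever at least one notebook is not clean.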

Migrating to nb-clean 2

The following table maps from the command line interface of nb-clean 1.6.0 to that of nb-clean 2.0.0.

| Description | nb-clean 1.6.0 | nb-clean 2.0.0 |
| --- | --- | --- |
| Clean notebook | nb-clean clean -i/--input notebook.ipynb \| sponge notebook.ipynb | nb-clean clean notebook.ipynb |
| Clean notebook (remove empty cells) | nb-clean clean -i/--input notebook.ipynb -e/--remove-empty | nb-clean clean -e/--remove-empty-cells notebook.ipynb |
| Clean notebook (preserve cell metadata) | nb-clean clean -i/--input notebook.ipynb -m/--preserve-metadata | nb-clean clean -m/--preserve-cell-metadata notebook.ipynb |
| Check notebook | nb-clean check -i/--input notebook.ipynb | nb-clean check notebook.ipynb |
| Check notebook (remove empty cells) | nb-clean check -i/--input notebook.ipynb -e/--remove-empty | nb-clean check -e/--remove-empty-cells notebook.ipynb |
| Check notebook (preserve cell metadata) | nb-clean check -i/--input notebook.ipynb -m/--preserve-metadata | nb-clean check -m/--preserve-cell-metadata notebook.ipynb |
| Add Git filter to clean notebooks | nb-clean configure-git | nb-clean add-filter |
| Remove Git filter | nb-clean unconfigure-git | nb-clean remove-filter |
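
If old invocations are scattered across scripts, the renames in the table can be applied mechanically. The following is a rough sketch under the assumption that only the renames listed in the table matter; the helper migrate_args and the two lookup dictionaries are hypothetical and not part of nb-clean.

```python
# Renames taken from the table above; nothing else is translated.
SUBCOMMANDS = {"configure-git": "add-filter", "unconfigure-git": "remove-filter"}
FLAGS = {
    "--remove-empty": "--remove-empty-cells",
    "--preserve-metadata": "--preserve-cell-metadata",
}


def migrate_args(args):
    """Translate nb-clean 1.6.0 arguments (without the program name) to 2.0.0."""
    options, notebooks = [], []
    tokens = iter(args)
    for token in tokens:
        if token in ("-i", "--input"):
            # The input path becomes a positional argument in 2.0.0.
            notebooks.append(next(tokens))
        elif token in SUBCOMMANDS:
            options.append(SUBCOMMANDS[token])
        else:
            # Short flags (-e, -m) and unknown tokens pass through unchanged.
            options.append(FLAGS.get(token, token))
    return options + notebooks


print(migrate_args(["clean", "-i", "notebook.ipynb", "--remove-empty"]))
```

The helper only rewrites the argument list. A trailing | sponge notebook.ipynb from a 1.6.0 pipeline still has to be dropped by hand, since nb-clean clean notebook.ipynb in 2.0.0 cleans the notebook in place.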

Copyright

Copyright © 2017-2022 Scott Stevenson.

nb-clean is distributed under the terms of the ISC licence.