hackable boilerplate for PyTorch Lightning driven deep learning research

Lightning Pod

Overview

Lightning Pod is a template Python environment, tooling, and system architecture for artificial intelligence and machine learning projects that use PyTorch and PyTorch Lightning. Lightning Pod also offers an example Plotly Dash app.

Core AI/ML Ecosystem

These are the base frameworks. Many other tools (numpy, pyarrow, etc.) are pulled in automatically as dependencies of these core packages.

  • pytorch
  • pytorch-lightning
  • torchmetrics
  • weights and biases
  • optuna
  • hydra
  • plotly
  • dash

Testing and Code Quality

  • PyTest
  • coverage
  • MyPy
  • Black
  • isort
  • pre-commit

Packaging

  • setuptools
  • build
  • twine
  • poetry

CI/CD

  • CircleCI
  • Deepsource
  • GitHub Actions
  • Mergify

Core Code

lightning_pod.core contains code for LightningModule and the Trainer.

lightning_pod.pipeline contains code for data preprocessing, building a Torch Dataset, and LightningDataModule.

If you only need to process data and implement an algorithm from a paper or pseudocode, you can focus on lightning_pod.core and lightning_pod.pipeline and ignore the rest of the code, so long as you follow the basic class and function naming conventions I've provided.

Altering the naming conventions will cause the flow to break. Be sure to refactor correctly.

Using the Template

The intent is that users fork this repo, set that fork as a template, create a new repo from their template, and finally clone the newly created repo.

It is recommended to keep your fork of lightning-pod free of changes and synced with the lightning-pod source repo, as this ensures new features become available immediately after release.

Creating an Environment

Base dependencies can be viewed in pyproject.toml.

Instructions for creating a new environment are shown below.

poetry

Install Poetry if you do not already have it installed.

cd {{ path to clone }}
poetry install
# if desired, install extras
poetry shell
pip install -r requirements/extras.txt
{{ set interpreter in IDE }}

conda

Install miniconda if you do not already have it installed.

For M-series macOS users, it is recommended to use the Miniconda3 macOS Apple M1 64-bit bash installation.

cd {{ path to clone }}
conda env create -f environment.yml
conda activate lightning-ai
pip install -e .
# if desired, install extras
pip install -r requirements/extras.txt
{{ set interpreter in IDE }}

venv

venv does not need to be installed; it is part of the Python standard library.

cd {{ path to clone }}
python3 -m venv venv/
# to activate on Windows
venv\Scripts\activate.bat
# to activate on macOS and Unix
source venv/bin/activate
# install lightning-pod
pip install -e .
# if desired, install extras
pip install -r requirements/extras.txt
{{ set interpreter in IDE }}
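
After activating, it can be useful to confirm that the interpreter selected in the IDE really is the one from the new environment. The snippet below is a minimal sketch of such a check; the function name in_virtual_environment is an assumption for illustration and is not part of lightning-pod. It applies to venv-style environments (including those Poetry creates); a conda environment ships its own base interpreter, so this check would report False there.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("virtual environment active:", in_virtual_environment())
```

Running this with the interpreter configured in the IDE shows which Python executable is in use and whether a venv is active.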

Command Line Interface

A CLI, pod, is provided to assist with certain project tasks and to interact with Trainer. The commands for pod and their effects are shown below.

pod

pod teardown will destroy any existing data splits, saved predictions, logs, profiler output, checkpoints, and persisted ONNX models.

pod trainer run runs the Trainer.

pod bug-report creates a bug report to submit issues on GitHub for Lightning. The report is printed to the terminal and also generated as a markdown file for easy submission.
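
As a rough illustration of the "print to terminal and save as markdown" idea, the sketch below builds a small report from standard library calls only. It is not the implementation behind pod bug-report; the function names build_bug_report and write_bug_report, the report layout, and the fields collected are all assumptions made for this example.

```python
import platform
import sys
from pathlib import Path


def build_bug_report(description):
    """Return a small markdown report with basic environment details."""
    lines = [
        "# Bug Report",
        "",
        "## Description",
        "",
        description,
        "",
        "## Environment",
        "",
        f"- Python: {platform.python_version()}",
        f"- Platform: {platform.platform()}",
        f"- Executable: {sys.executable}",
    ]
    return "\n".join(lines) + "\n"


def write_bug_report(description, path):
    """Print the report to the terminal and save it as a markdown file."""
    report = build_bug_report(description)
    print(report)
    Path(path).write_text(report, encoding="utf-8")
    return report
```

Calling write_bug_report("Trainer fails on start", "bug_report.md") would print the report and leave a bug_report.md file in the working directory, ready to paste into an issue.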

pod seed will remove boilerplate to allow users to begin their own projects.

Files removed by pod seed:

  • cached MNIST data found in data/cache/LitDataSet
  • training splits found in data/training_split
  • saved predictions found in data/predictions
  • PyTorch Profiler logs found in logs/profiler
  • TensorBoard logs found in logs/logger
  • model checkpoints found in models/checkpoints
  • persisted ONNX model found in models/onnx
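
The cleanup described by the list above can be pictured as deleting a fixed set of directories relative to the project root. The following is a minimal sketch under that assumption, not the code that pod actually runs; the helper remove_artifacts and the constant ARTIFACT_DIRS are hypothetical names, and only the directory paths come from the list.

```python
import shutil
from pathlib import Path

# Directory paths taken from the list above; everything else is hypothetical.
ARTIFACT_DIRS = [
    "data/cache/LitDataSet",
    "data/training_split",
    "data/predictions",
    "logs/profiler",
    "logs/logger",
    "models/checkpoints",
    "models/onnx",
]


def remove_artifacts(project_root):
    """Delete each artifact directory under project_root, if present.

    Returns the relative paths that were actually removed.
    """
    root = Path(project_root)
    removed = []
    for relative in ARTIFACT_DIRS:
        target = root / relative
        # Skip anything that does not exist so repeated runs are harmless.
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(relative)
    return removed
```

Because missing directories are skipped, calling the helper twice in a row is safe: the second call simply returns an empty list.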

The flow for creating new checkpoints and an ONNX model from the provided encoder-decoder looks like:

pod teardown
pod trainer run

Once the new Trainer run has finished, the app can be viewed by running the following in the terminal:

lightning run app app.py

Deep Learning

Grant Sanderson, also known as 3blue1brown on YouTube, has provided a very useful, high-level introduction to neural networks. Grant's other videos are also useful for computer and data science, and mathematics in general.

NYU's Alfredo Canziani has created a YouTube series for his lectures on deep learning. Additionally, Professor Canziani was kind enough to make his course materials public on GitHub.

The book Dive into Deep Learning, created by a team of Amazon engineers, is available for free.

DeepMind has shared several lecture series created for UCL on YouTube.

OpenAI has created Spinning Up in Deep RL, an introductory series in reinforcement learning and deep learning.