lightning-lab
hackable boilerplate for PyTorch Lightning driven deep learning research
Overview
Lightning Pod provides a template Python environment, tooling, and system architecture for artificial intelligence and machine learning projects built on PyTorch and PyTorch Lightning. Lightning Pod also includes an example Plotly Dash app.
Core AI/ML Ecosystem
These are the base frameworks. Many other tools (NumPy, PyArrow, etc.) are installed automatically as dependencies of the core frameworks.
- pytorch
- pytorch-lightning
- torchmetrics
- weights and biases
- optuna
- hydra
- plotly
- dash
Testing and Code Quality
- PyTest
- coverage
- MyPy
- Black
- isort
- pre-commit
Packaging
- setuptools
- build
- twine
- poetry
CI/CD
- CircleCI
- Deepsource
- GitHub Actions
- Mergify
Core Code
lightning_pod.core contains the code for the LightningModule and the Trainer.
lightning_pod.pipeline contains the code for data preprocessing, building a PyTorch Dataset, and the LightningDataModule.
If you only need to process data and implement an algorithm from a paper or pseudocode, you can focus on lightning_pod.core and lightning_pod.pipeline and ignore the rest of the code, so long as you follow the basic class and function naming conventions I've provided.
Altering the naming conventions will cause the flow to break, so be sure to refactor correctly.
Using the Template
The intent is that users fork this repo, set the fork as a template, create a new repo from that template, and finally clone the newly created repo.
It is recommended to keep your fork of lightning-pod free of changes and synced with the lightning-pod source repo, as this ensures new features become available immediately after release.
Creating an Environment
Base dependencies can be viewed in pyproject.toml.
Instructions for creating a new environment are shown below.
poetry
Install Poetry if you do not already have it installed.
cd {{ path to clone }}
poetry install
# if desired, install extras
poetry shell
pip install -r requirements/extras.txt
{{ set interpreter in IDE }}
conda
Install miniconda if you do not already have it installed.
M-series macOS users: it is recommended to use the
Miniconda3 macOS Apple M1 64-bit bash installation.
cd {{ path to clone }}
conda env create -f environment.yml
conda activate lightning-ai
pip install -e .
# if desired, install extras
pip install -r requirements/extras.txt
{{ set interpreter in IDE }}
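For reference, a minimal environment.yml of the kind these instructions assume might look like the sketch below; this is an illustrative assumption, and the actual file in the repo defines the full dependency set.

```yaml
# Illustrative sketch only -- the real environment.yml in the repo is authoritative.
name: lightning-ai        # matches the "conda activate lightning-ai" step above
channels:
  - conda-forge
dependencies:
  - python=3.10
  - pip
  - pip:
      - pytorch-lightning  # pulls in torch and its dependencies
```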
venv
venv does not need to be installed; it is part of the Python standard library.
cd {{ path to clone }}
python3 -m venv venv/
# to activate on windows
venv\Scripts\activate.bat
# to activate on macos and Unix
source venv/bin/activate
# install lightning-pod
pip install -e .
# if desired, install extras
pip install -r requirements/extras.txt
{{ set interpreter in IDE }}
Command Line Interface
A CLI, pod, is provided to assist with certain project tasks and to interact with the Trainer. The pod commands and their effects are shown below.
pod
pod teardown will destroy any existing data splits, saved predictions, logs, profilers, checkpoints, and the persisted ONNX model.
pod trainer run runs the Trainer.
pod bug-report creates a bug report for submitting issues to the Lightning GitHub repo. The report is printed to the terminal and generated as a Markdown file for easy submission.
pod seed will remove boilerplate to allow users to begin their own projects.
Files removed by pod seed:
- cached MNIST data found in data/cache/LitDataSet
- training splits found in data/training_split
- saved predictions found in data/predictions
- PyTorch Profiler logs found in logs/profiler
- TensorBoard logs found in logs/logger
- model checkpoints found in models/checkpoints
- persisted ONNX model found in models/onnx
The flow for creating new checkpoints and an ONNX model from the provided encoder-decoder looks like:
pod teardown
pod trainer run
Once the Trainer run has finished, the app can be viewed by running the following in the terminal:
lightning run app app.py
Deep Learning
Grant Sanderson, also known as 3blue1brown on YouTube, has provided a very useful high-level introduction to neural networks. Grant's other videos are also useful for computer science, data science, and mathematics in general.
NYU's Alfredo Canziani has created a YouTube Series for his lectures on deep learning. Additionally, Professor Canziani was kind enough to make his course materials public on GitHub.
The book Dive into Deep Learning, created by a team of Amazon engineers, is available for free.
DeepMind has shared several lecture series created for UCL on YouTube.
OpenAI has created Spinning Up in Deep RL, an introductory series in reinforcement learning and deep learning.