allennlp-manager
[WORK IN PROGRESS] allennlp-manager
Your manager for AllenNLP experiments.
Table of contents
- Motivation
- Road map
- Dependencies
- Installation
- Quick start
- Configuration
- Contributing
Motivation
The goal of this project is to build a customizable CLI and dashboard for running, queueing, tracking, and comparing experiments.
This was inspired by other open source projects such as the resource manager slurm and visualization toolkit TensorBoard, as well as commercial software such as Weights & Biases and Foundations Atlas.
slurm and TensorBoard are both excellent tools, but they fall short for NLP researchers in a number of ways. For example, slurm is difficult to set up and use - especially on your own desktop or server - unless you're an experienced sysadmin, and TensorBoard has limited functionality for searching, organizing, tagging, and comparing models, which doesn't scale well when you have hundreds or even thousands of experiments. And while the commercial options are fairly easy to use and come with a solid set of features, they were built as generic tools and therefore don't "understand" all of AllenNLP's features. They are also not customizable or extensible.
allennlp-manager aims to leverage all of the convenient pieces of AllenNLP to provide you with a dashboard that lets you
- quickly search through all of your experiments based on properties like model type, training / validation set, or arbitrary tags,
- visualize the metrics from training runs of an experiment,
- compare experiments in a number of ways, such as looking at a git diff of configuration files,
- and easily extend it by adding your own interactive pages.
In addition to the dashboard, there is a multi-purpose CLI with commands for serving the dashboard, updating to the latest version, and programmatically submitting training runs.
Road map
For the first release I intend to have all of the features described above implemented except for, possibly, the slurm-like resource manager and job queueing system, as that may become quite complex. To keep up with the progress, check out the Initial Release project board.
Dependencies
AllenNLP and Python 3.6 or 3.7.
Installation
pip install 'git+https://github.com/epwalsh/allennlp-manager.git#egg=mallennlp'
Quick start
Create a new project named my-project:
mallennlp new my-project && cd my-project
Then edit the Project.toml file to your liking and start the server:
mallennlp serve
Configuration
A project is customized through the Project.toml file in the root directory of the project. There is a [project] section for general options such as the log level (which applies to both the CLI and the dashboard) and a [server] section for dashboard-specific options such as the host port to bind to.
For convenience, you can open the configuration file quickly with the command mallennlp edit.
Advanced configuration
Adding custom pages
Dashboard pages are just registered subclasses of mallennlp.dashboard.page.Page, which is an AllenNLP Registrable. Therefore you can easily add more pages to the dashboard by registering your own Page implementations. The registered name of a page corresponds to its URL route. For example, the home page is registered under the name "/" and the system info page is registered under the name "/sys-info". At a bare minimum, a custom Page just needs to implement Page.get_elements(self), which renders the layout of the page. This can return anything that Dash can render, such as basic types as well as any Dash components (such as HTML Components or Core Components). For more information check out the Dash Tutorial.
Here's how you would add a page that just says "Hello, World!" in the body:
# hello_world/__init__.py

from mallennlp.dashboard.page import Page


@Page.register("/hello-world")
class HelloWorld(Page):
    requires_login = True
    navlink_name = "Hello, World!"

    def get_elements(self):
        return ["Hello, World!"]
You can put the hello_world module in the root of your project directory, or just make sure it's in your PYTHONPATH. Then add imports = ['hello_world'] under the [server] section of the Project.toml configuration file. Now you should see a link "Hello, World!" to your page in the dropdown menu.
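The link between a registered name and a URL route can be pictured with a small, self-contained registry. This is only a toy sketch of the general decorator-registry pattern: ToyPage, its _registry dictionary, and by_route are hypothetical names invented for illustration, not mallennlp's or AllenNLP's actual implementation.

```python
# Toy stand-in for a route -> class registry.
# ToyPage, _registry, and by_route are hypothetical; this is NOT mallennlp code.
from typing import Callable, Dict, List, Type


class ToyPage:
    """Hypothetical base class that keeps a route -> subclass registry."""

    _registry: Dict[str, Type["ToyPage"]] = {}

    @classmethod
    def register(cls, route: str) -> Callable[[Type["ToyPage"]], Type["ToyPage"]]:
        """Return a class decorator that stores the class under `route`."""

        def decorator(subclass: Type["ToyPage"]) -> Type["ToyPage"]:
            if route in cls._registry:
                raise ValueError(f"route {route!r} is already registered")
            cls._registry[route] = subclass
            return subclass

        return decorator

    @classmethod
    def by_route(cls, route: str) -> Type["ToyPage"]:
        """Look up the page class registered for a route."""
        return cls._registry[route]

    def get_elements(self) -> List[str]:
        raise NotImplementedError


@ToyPage.register("/hello-world")
class HelloWorld(ToyPage):
    def get_elements(self) -> List[str]:
        return ["Hello, World!"]


# A server could resolve an incoming route to a page class like this.
page_cls = ToyPage.by_route("/hello-world")
print(page_cls().get_elements())
```

In a pattern like this, registration only happens once the module that defines the class has been imported, which is presumably why the module has to be listed under imports in the configuration file.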
Interactive custom pages
Page instances have two attributes, an arbitrary SessionState object (self.s) and a Params object (self.p) that holds any typed URL parameters for the page, if they have been defined. By default the SessionState and Params objects don't have any attributes. Overriding these with a custom SessionState or Params object looks like this:
from mallennlp.dashboard.page import Page
from mallennlp.services.serde import serde


@Page.register("/hello-world")
class HelloWorld(Page):
    @serde
    class SessionState:
        name: str = "World!"

    @serde
    class Params:
        initial_message: str = "Hello, World!"

    # ... snip ...
Both SessionState and Params need to be serializable, which is ensured by the @serde decorator. The decorator is really just a wrapper around attr.s.
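To make "serializable" concrete, here is a minimal sketch of a state class being round-tripped through JSON. It deliberately swaps in the standard library's dataclasses (Python 3.7+) for attr.s so that it runs without extra dependencies; it is not the @serde decorator, and the JSON round trip is an assumption about how such state might be stored, not a description of what mallennlp does.

```python
# Stand-in sketch: stdlib dataclasses are used in place of attr.s / @serde.
# The JSON round trip is an illustrative assumption, not mallennlp behaviour.
import json
from dataclasses import asdict, dataclass


@dataclass
class SessionState:
    name: str = "World!"


@dataclass
class Params:
    initial_message: str = "Hello, World!"


# Serialize the state to a string and rebuild an equal object from it.
state = SessionState(name="Ada")
payload = json.dumps(asdict(state))
restored = SessionState(**json.loads(payload))

print(payload)
print(restored == state)
```

Typed fields with default values are what make this kind of round trip straightforward, which is the same shape the SessionState and Params classes above have.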
Your page then becomes interactive when you implement a callback method for any input components that were created in Page.get_elements. Page callbacks are defined by decorating a Page method with @Page.callback. Under the hood, callbacks are just Dash callbacks with some magic behind the scenes that makes the function into an instance method of your page.
Combining these concepts, we can extend our HelloWorld page to make it interactive:
# hello_world/__init__.py
#
# The page will render a different initial message based on the URL parameter
# 'initial_message' and then update the message when the user types into the text input
# and uses the buttons.

from dash.exceptions import PreventUpdate
from dash.dependencies import Input, Output, State
import dash_bootstrap_components as dbc
import dash_html_components as html

from mallennlp.dashboard.page import Page
from mallennlp.services.serde import serde


@Page.register("/hello-world")
class HelloWorld(Page):
    requires_login = True
    navlink_name = "Hello, World!"

    @serde
    class SessionState:
        name: str = "World!"

    @serde
    class Params:
        initial_message: str = "Hello, World!"

    def get_elements(self):
        return [
            dbc.Input(
                placeholder="Enter your name", type="text", id="hello-name-input"
            ),
            html.Br(),
            dbc.Button("Save", id="hello-name-save", color="primary"),
            html.Br(),
            dbc.Button("Say hello", id="hello-name-trigger-output", color="primary"),
            html.Br(),
            html.Div(id="hello-name-output", children=self.p.initial_message),
        ]

    @Page.callback(
        [],
        [Input("hello-name-save", "n_clicks")],
        [State("hello-name-input", "value")],
        mutating=True,  # callback mutates the state.
    )
    def save_name(self, n_clicks, value):
        if not n_clicks or not value:
            raise PreventUpdate
        self.s.name = value  # update SessionState

    @Page.callback(
        [Output("hello-name-output", "children")],
        [Input("hello-name-trigger-output", "n_clicks")],
        mutating=False,  # callback doesn't mutate state.
    )
    def render_hello_output(self, n_clicks):
        if not n_clicks:
            raise PreventUpdate
        return f"Hello, {self.s.name}!"
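For readers curious how a decorator can turn plain methods into callbacks bound to a page instance, the following toy sketch shows one possible approach: the decorator attaches metadata to the method, and the handler is later looked up by name and called on an instance. The callback function, the _callback_meta attribute, and ToyCallbackPage are all hypothetical and do not reflect how @Page.callback or Dash are actually implemented.

```python
# Toy sketch of a method-level callback decorator.
# callback, _callback_meta, and ToyCallbackPage are hypothetical names;
# this is NOT how mallennlp or Dash implement callbacks.
from typing import Any, Callable, Dict, List


def callback(outputs: List[str], inputs: List[str], mutating: bool = False) -> Callable:
    """Attach (hypothetical) callback metadata to a method."""

    def decorator(method: Callable) -> Callable:
        method._callback_meta = {
            "outputs": outputs,
            "inputs": inputs,
            "mutating": mutating,
        }
        return method

    return decorator


class ToyCallbackPage:
    def __init__(self) -> None:
        self.state: Dict[str, Any] = {"name": "World"}

    @classmethod
    def callbacks(cls) -> Dict[str, Dict[str, Any]]:
        """Collect the metadata of every decorated method on the class."""
        return {
            name: attr._callback_meta
            for name, attr in vars(cls).items()
            if hasattr(attr, "_callback_meta")
        }

    @callback([], ["hello-name-save"], mutating=True)
    def save_name(self, value: str) -> None:
        self.state["name"] = value

    @callback(["hello-name-output"], ["hello-name-trigger-output"])
    def render_hello_output(self) -> str:
        return f"Hello, {self.state['name']}!"


page = ToyCallbackPage()
# A framework would look up each handler by name and call it bound to the page.
getattr(page, "save_name")("Ada")
print(getattr(page, "render_hello_output")())
print(sorted(ToyCallbackPage.callbacks()))
```

Keeping a flag such as mutating in the metadata lets the surrounding framework decide whether state needs to be saved after a handler runs, without the handler itself having to know how that is done.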
Command completion
Since the CLI is implemented using Click, setting up completion for Bash or Zsh is easy. For example, you can just add
eval "$(_MALLENNLP_COMPLETE=source mallennlp)"
to your .bashrc. Note, however, that this runs mallennlp every time a shell starts, which may add a couple of seconds to startup. It is better to use Click's activation script approach instead: generate the completion script once, save it to a file, and source that file from your .bashrc.
For potential contributors
I chose to implement this project entirely in Python to make it as easy as possible for anyone to contribute, since if you are using AllenNLP you must already be familiar with Python. The dashboard is built with Plotly Dash, which is kind of like Python's version of Shiny if you're familiar with R.
The continuous integration for allennlp-manager is a lot like that of AllenNLP. Unit tests are run with pytest, code is type-checked with mypy, linted with flake8, and formatted with black. You can run all of the CI steps locally with make test.
If this is your first time contributing to a project on GitHub, please see this Gist for an example workflow.