[WORK IN PROGRESS] allennlp-manager

Your manager for AllenNLP experiments.

Motivation
Road map :car:
Dependencies
Installation
Quick start :rocket:
Configuration
Contributing

Motivation

The goal of this project is to build a customizable CLI and dashboard for running, queueing, tracking, and comparing experiments.

This was inspired by other open source projects such as the resource manager slurm and visualization toolkit TensorBoard, as well as commercial software such as Weights & Biases and Foundations Atlas.

slurm and TensorBoard are both excellent tools, but they fall short for NLP researchers in a number of ways. For example, slurm is difficult to set up and use - especially on your own desktop or server - unless you're an experienced sys admin, and TensorBoard has limited functionality for searching, organizing, tagging, and comparing models. This doesn't scale well when you have hundreds or even thousands of experiments. And while the commercial options are fairly easy to use and come with a solid set of features, they were built as generic tools and therefore don't "understand" all of AllenNLP's features. They are also not customizable or extendable.

allennlp-manager aims to leverage all of the convenient pieces of AllenNLP to provide you with a dashboard that let's you

quickly search through all of your experiments based on properties like model type, training / validation set, or arbitrary tags,
visualize the metrics from training runs of an experiment,
compare experiments in a number of ways, such as looking at a git diff of configuration files,
and easily extend it by adding your own interactive pages.

In addition to the dashboard, there is a multi-purpose CLI with commands for serving the dashboard, updating to the latest version, and programmatically submitting training runs.

Road map

For the first release I intend to have all of the features implemented except for, possibly, the slurm-like resource manager and job queueing system, as that may become quite complex. To keep up with the progress check out the Initial Release project board.

Dependencies

AllenNLP and Python 3.6 or 3.7.

Installation

pip install 'git+git://github.com/epwalsh/allennlp-manager.git#egg=mallennlp'

Quick start

Create a new project named my-project:

mallennlp new my-project && cd my-project

Then edit the Project.toml file to your liking and start the server:

mallennlp serve

Configuration

A project is customized through the Project.toml file in the root directory of the project. There is a section [project] for general options such as the log level (which applies to both the CLI and the dashboard) and a [server] section for dashboard-specific options such as the host port to bind to.

For convenience, you can open the configuration file quickly with the command mallennlp edit.

Advanced configuration

Adding custom pages

Dashboard pages are just registered subclasses of mallennlp.dashboard.page.Page, which is an AllenNLP Registrable. Therefore you can easily add more pages to the dashboard by registering your own Page implementations. The registered name of a page corresponds to its URL route. For example, the home page is registered under the name "/" and the system info page is registered under the name "/sys-info". At a bare minimum, a custom Page just needs to implement Page.get_elements(self), which renders the layout of the page. This can return anything that Dash can render, such as basic types as well as any Dash components (such as HTML Components or Core Components). For more information check out the Dash Tutorial.

Here's how you would add a page that just says "Hello, World!" in the body:

# hello_world/__init__.py
from mallennlp.dashboard.page import Page

@Page.register("/hello-world")
class HelloWorld(Page):
    requires_login = True
    navlink_name = "Hello, World!"

    def get_elements(self):
        return ["Hello, World!"]

You can put the hello_world module in the root of your project directory, or just make sure it's in your PYTHONPATH. Then add imports = ['hello_world'] under the [server] section of the Project.toml configuration file. Now you should see a link "Hello, World!" to your page in the dropdown menu.

Interactive custom pages

Page instances have two attributes, an arbitrary SessionState object (self.s) and a Params object (self.p) that holds any typed URL parameters for the page, if they have been defined. By default the SessionState and Params object don't have any attributes. Overriding these with a custom SessionState or Params object looks like this:

from mallennlp.dashboard.page import Page
from mallennlp.services.serde import serde

@Page.register("/hello-world")
class HelloWorld(Page):

    @serde
    class SessionState:
        name: str = "World!"

    @serde
    class Params:
        initial_message: str = "Hello, World!"

    # ... snip ...

Both SessionState and Params need to be serializable, which is ensured by the @serde decorator. The decorator is really just a wrapper around attr.s.

Your page then becomes interactive when you implement a callback method for any input components that were created in Page.get_elements. Page callbacks are defined by decorating a Page method with @Page.callback. Under the hood, callbacks are just Dash callbacks with some magic behind the scenes that makes the function into an instance method of your page.

Combining these concepts, we can easily add to our HelloWorld to make it interactive:

# hello_world/__init__.py
#
# The page will render a different initial message based on the URL parameter
# 'initial_message' and then update the message when the user types into the text input
# and uses the buttons.

from dash.exceptions import PreventUpdate
from dash.dependencies import Input, Output, State
import dash_bootstrap_components as dbc
import dash_html_components as html

from mallennlp.dashboard.page import Page
from mallennlp.services.serde import serde


@Page.register("/hello-world")
class HelloWorld(Page):
    requires_login = True
    navlink_name = "Hello, World!"

    @serde
    class SessionState:
        name: str = "World!"

    @serde
    class Params:
        initial_message: str = "Hello, World!"

    def get_elements(self):
        return [
            dbc.Input(
                placeholder="Enter your name", type="text", id="hello-name-input"
            ),
            html.Br(),
            dbc.Button("Save", id="hello-name-save", color="primary"),
            html.Br(),
            dbc.Button("Say hello", id="hello-name-trigger-output", color="primary"),
            html.Br(),
            html.Div(id="hello-name-output", children=self.p.initial_message),
        ]

    @Page.callback(
        [],
        [Input("hello-name-save", "n_clicks")],
        [State("hello-name-input", "value")],
        mutating=True,  # callback mutates the state.
    )
    def save_name(self, n_clicks, value):
        if not n_clicks or not value:
            raise PreventUpdate
        self.s.name = value  # update SessionState

    @Page.callback(
        [Output("hello-name-output", "children")],
        [Input("hello-name-trigger-output", "n_clicks")],
        mutating=False,  # callback doesn't mutate state.
    )
    def render_hello_output(self, n_clicks):
        if not n_clicks:
            raise PreventUpdate
        return f"Hello, {self.s.name}!"

Command completion

Since the CLI is implemented using Click, setting up completion for Bash or ZSH is easy. For example, you can just add

eval "$(_MALLENNLP_COMPLETE=source mallennlp)"

to your .bashrc. Note however that it is better to use the activation script approach instead, otherwise your shell may take a couple seconds to start.

For potential contributors

I chose to implement this project entirely in Python to make it as easy possible for anyone to contribute, since if you are using AllenNLP you must already be familiar with Python. The dashboard is built with plotly Dash, which is kind of like Python's version of Shiny if you're familiar with R.

The continuous integration for allennlp-manager is a lot like that of AllenNLP. Unit tests are run with pytest, code is type-checked with mypy, linted with flake8, and formatted with black. You can run all of the CI-steps locally with make test.

If this is your first time contributing to a project on GitHub, please see this Gist for an example workflow.

allennlp-manager
allennlp-manager copied to clipboard

Metadata

[WORK IN PROGRESS] allennlp-manager

Table of contents

Motivation

Road map

Dependencies

Installation

Quick start

Configuration

Advanced configuration

Adding custom pages

Interactive custom pages

Command completion

For potential contributors

← Metadata

Owner

Metadata

allennlp-manager allennlp-manager copied to clipboard

Metadata

[WORK IN PROGRESS] allennlp-manager

Table of contents

Motivation

Road map

Dependencies

Installation

Quick start

Configuration

Advanced configuration

Adding custom pages

Interactive custom pages

Command completion

For potential contributors

← Metadata

Owner

Metadata

allennlp-manager
allennlp-manager copied to clipboard