docker-at-oreilly
An overview of experiments O'Reilly Media is doing with Docker
Docker Experiments at O'Reilly Media

Docker Boston Meetup
July 30, 2014
Andrew Odewahn ([email protected])
This presentation is at http://bit.ly/docker-at-oreilly
About O'Reilly Media

- O'Reilly Media produces technical books and events
- Both "Web 2.0" and "Open Source" were terms that came from O'Reilly events
- Founded in Cambridge, MA, but now headquartered in Sebastopol, CA (north of San Francisco)
At heart, O'Reilly is a learning company
"Spreading the knowledge of innovators" -- O'Reilly exists to take the knowledge in an expert's head and package it up so that other people can learn from it.
The way people want to learn is changing radically.

- How do we create media products like these using our core capabilities (editorial, brand, community)?
- How do we transform as media increasingly becomes software?
- Exploring Docker to help us make new kinds of media products
How do we respond to demand for IPython Notebooks?
- IPython Notebooks are becoming the de facto tool in the scientific and big data communities
- They provide an authoring and execution environment for text, math, and arbitrary code (Python, Julia, R, Ruby, and more)
- Strong demand among our authors to support this format
- Plus, it's awesome
How can we use Docker to...
- help authors create them
- produce them (edit, copyedit, illustrate, index, etc.)
- distribute them to make a compelling experience?
Experiment 1: Packaging the examples for Python for Data Analysis as a Docker image

- Successful book in the "Data Science Area" published in 2012
- This is a rapidly changing area
- Create a companion product as an IPython Notebook
DEMO
The key steps are
- Install boot2docker (NB: if you have an older version of Boot2Docker, here's a great article on how to upgrade)
- Set up an account on docker.com
- Expose port 8888 in VirtualBox (do this just once)
VBoxManage controlvm boot2docker-vm natpf1 "ipython-notebook,tcp,127.0.0.1,8888,,8888"
- Start boot2docker and ssh into the box
- Pull odewahn/python-data-analysis. (NB: This is a big image -- 3GB+)
sudo docker pull odewahn/python-data-analysis
- Start the container, and be sure to expose port 8888
sudo docker run -i -t -p 8888:8888 odewahn/python-data-analysis /bin/bash
- Once the container starts and you're at the bash prompt, start the server with this command:
./start.sh
- Go to localhost:8888 on your local browser
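The port plumbing in these steps can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original demo: `natpf_rule` and `port_is_open` are hypothetical helper names. The first rebuilds the rule string passed to VBoxManage in the steps above (name, protocol, host IP, host port, guest IP, guest port). The second checks whether anything is answering on a port, which can help if localhost:8888 does not load.

```python
import socket


def natpf_rule(name, protocol, host_ip, host_port, guest_ip, guest_port):
    """Build a VirtualBox NAT port-forwarding rule string.

    Field order: name, protocol, host IP, host port, guest IP, guest port.
    The guest IP is left empty in the demo command, so "" is allowed here.
    """
    fields = [name, protocol, host_ip, str(host_port), guest_ip, str(guest_port)]
    return ",".join(fields)


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Same rule as in the VBoxManage step of the demo.
    print(natpf_rule("ipython-notebook", "tcp", "127.0.0.1", 8888, "", 8888))
    print("notebook reachable:", port_is_open("127.0.0.1", 8888))
```

If the check reports that the port is closed, the likely suspects are the VirtualBox rule, the `-p 8888:8888` flag on `docker run`, or a server that was never started with `./start.sh`.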
How do we go beyond companion pieces and make actual products?

- Companion products are great, but how do we make actual products themselves?
- We use an internally developed tool called O'Reilly Atlas for 80% of our content.
Atlas has 3 core concepts

A single source of semantically rich content
Version control in Git

- All Atlas content is stored in Git.
- This presentation was created in Atlas and posted to GitHub.
Transformation engines to create formats for consumption
- Print books (80% of titles published through ORM)
- EPUB
- MOBI
- Web Sites
Experiment 2: Just Enough Math

- A combination book, video series, and tutorial
- Delivered as an IPython Notebook created in Atlas
The project was written and produced in Atlas

- Code samples that are tagged as "Executable" will be runnable in the browser
An Atlas to IPython Notebook conversion gem

- The atlas2ipynb gem transforms HTMLBook into IPython Notebook's JSON-based format
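To make the idea of the conversion concrete, here is a rough Python sketch of turning an HTMLBook fragment into notebook JSON. It is not the atlas2ipynb implementation (the gem is Ruby and its internals are not shown here). Two things are assumptions: that executable samples are marked with a `data-executable="true"` attribute, and that the output uses the nbformat 4 layout, which may differ from the format the gem targeted. Paragraphs become markdown cells, executable listings become code cells, and other listings become indented markdown code. Other elements are ignored.

```python
import json
from html.parser import HTMLParser


def make_cell(kind, text):
    """Build one nbformat-4 style cell dict."""
    if kind == "code":
        return {"cell_type": "code", "execution_count": None,
                "metadata": {}, "outputs": [], "source": text}
    if kind == "listing":
        # Non-executable listing: keep it as indented markdown code.
        text = "\n".join("    " + line for line in text.splitlines())
    return {"cell_type": "markdown", "metadata": {}, "source": text}


class HTMLBookToCells(HTMLParser):
    """Collect <p> text and <pre data-type="programlisting"> blocks as cells."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._kind = None   # set while inside a <p> or <pre> block
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if self._kind is not None:
            return  # ignore inline markup nested inside a block
        attrs = dict(attrs)
        if tag == "p":
            self._kind = "markdown"
        elif tag == "pre" and attrs.get("data-type") == "programlisting":
            # Assumption: this attribute marks a sample as executable.
            executable = attrs.get("data-executable") == "true"
            self._kind = "code" if executable else "listing"

    def handle_data(self, data):
        if self._kind is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        closes_p = tag == "p" and self._kind == "markdown"
        closes_pre = tag == "pre" and self._kind in ("code", "listing")
        if closes_p or closes_pre:
            text = "".join(self._buffer).strip()
            kind, self._kind, self._buffer = self._kind, None, []
            if text:
                self.cells.append(make_cell(kind, text))


def htmlbook_to_notebook(html):
    """Return a notebook dict for the given HTMLBook fragment."""
    parser = HTMLBookToCells()
    parser.feed(html)
    parser.close()
    return {"cells": parser.cells, "metadata": {},
            "nbformat": 4, "nbformat_minor": 0}


SAMPLE = """
<section data-type="chapter">
<p>Add two vectors element by element.</p>
<pre data-type="programlisting" data-code-language="python" data-executable="true">print([a + b for a, b in zip([1, 2], [3, 4])])</pre>
<pre data-type="programlisting">$ ipython notebook</pre>
</section>
"""

if __name__ == "__main__":
    print(json.dumps(htmlbook_to_notebook(SAMPLE), indent=1))
```

Saving the printed JSON to a file with an `.ipynb` extension should give a notebook with one text cell, one runnable cell, and one static listing.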
A Dockerfile for the base image with IPython Notebooks and the atlas2ipynb toolchain pre-installed
FROM ubuntu
MAINTAINER Andrew Odewahn "[email protected]"
RUN apt-get update
RUN apt-get install -y ruby1.9.3
RUN apt-get install -y python-software-properties python-dev python-pip
RUN apt-get install -y libfreetype6-dev libpng-dev libncurses5-dev vim git-core build-essential curl unzip wget
# Install Atlas-specific gems
RUN gem install bundler atlas-api atlas2ipynb
# Install ipython notebook requirements
RUN pip install --upgrade pip
ADD requirements.txt /tmp/requirements.txt
RUN pip install numpy==1.7.1
RUN pip install -r /tmp/requirements.txt --allow-unverified matplotlib --allow-all-external
#
# Create the command to actually run the ipython notebook
#
RUN adduser --disabled-password --home=/home/atlas --gecos "" atlas
USER atlas
WORKDIR /home/atlas
RUN echo '#!/bin/sh' > start.sh
RUN echo 'ipython notebook --ip=0.0.0.0 --port=8888 --pylab=inline --no-browser' >> start.sh
RUN chmod +x start.sh
#
# Set us back to the root user
#
USER root
- An atlas-base Docker image
A Dockerfile for Just Enough Math (or any book, for that matter)
FROM odewahn/atlas-base
MAINTAINER Andrew Odewahn "[email protected]"
#
# Install systemwide requirements
#
RUN apt-get install -y libatlas-base-dev
RUN apt-get install -y gfortran
RUN apt-get install -y gcc-multilib
RUN apt-get install -y lynx
RUN apt-get install -y emacs23-nox
RUN apt-get install -y glpk
RUN apt-get install -y python-glpk
#
# Install python packages using pip
#
RUN pip install scipy
RUN pip install neurolab
RUN pip install hyperloglog
RUN pip install countminsketch
RUN pip install pybloom
RUN pip install lshash
#
# Install content using atlas-api to build the project
# Be sure to set ATLAS_KEY as an environment variable!
# export ATLAS_KEY=<your atlas API key>
#
USER atlas
WORKDIR /home/atlas
RUN atlas2ipynb $ATLAS_KEY odewahn/jem-docker
"docker push" is the new publishing
docker build --tag odewahn/jem-tutorial .
docker push odewahn/jem-tutorial
DEMO
- Start boot2docker and ssh into the box
- Pull odewahn/jem-tutorial. (NB: This is a big image -- 3GB+)
sudo docker pull odewahn/jem-tutorial
- Start the container, and be sure to expose port 8888
sudo docker run -i -t -p 8888:8888 odewahn/jem-tutorial /bin/bash
- Once the container starts and you're at the bash prompt, start the server with this command:
./start.sh
- Go to localhost:8888 on your local browser
This experience leaves a lot to be desired
- One of the first projects was "Kids Code," which teaches kids about Python
- "OK kids, let's fire up an Ubuntu Virtual Machine and do some coding!" doesn't work well
- Even for pros, this is a bit intimidating
- VMs and Vagrant are unfamiliar
- Windows does not include an SSH client...
Experiment #3: Towards a more seamless experience
- O'Reilly Pyxie is a place where authors can put Docker images for distribution
- Inspired by Nick Stinemates's any-sass project
- Frontend app starts a container based on an image you choose
- Container is mapped to a URL using Hipache and returned to the user
- User runs the container by going to the URL
- Super-duper pre-alpha proof of concept
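The flow in the list above can be sketched as a small planning function. This is not Pyxie's code: `launch_plan`, the `pyxie.example` domain, and the host IP and port are all hypothetical. The one piece taken from Hipache's documented configuration is the Redis layout, where each hostname has a `frontend:<hostname>` list whose first entry is an identifier and whose later entries are backend URLs. The function only builds the commands; it does not run Docker or talk to Redis.

```python
import secrets


def launch_plan(image, host_ip, host_port, domain="pyxie.example"):
    """Return the commands needed to start one container and route a URL to it.

    Assumptions: the image starts its notebook server with ./start.sh on
    port 8888, and host_port is a free port on the Docker host.
    """
    session = secrets.token_hex(4)          # random per-user subdomain
    hostname = "{}.{}".format(session, domain)

    # Start the container detached, mapping the chosen host port to 8888.
    docker_cmd = ["docker", "run", "-d",
                  "-p", "{}:8888".format(host_port),
                  image, "./start.sh"]

    # Hipache reads routes from Redis lists named frontend:<hostname>.
    key = "frontend:" + hostname
    redis_cmds = [
        ["RPUSH", key, session],                                  # identifier
        ["RPUSH", key, "http://{}:{}".format(host_ip, host_port)],  # backend
    ]
    return {"url": "http://" + hostname,
            "docker": docker_cmd,
            "redis": redis_cmds}


if __name__ == "__main__":
    plan = launch_plan("odewahn/jem-tutorial", "10.0.0.5", 49153)
    print(" ".join(plan["docker"]))
    for cmd in plan["redis"]:
        print(" ".join(cmd))
    print("send the user to", plan["url"])
```

Keeping the plan separate from its execution makes the routing logic easy to inspect before anything is launched.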
DEMO -- Pyxie.io
Lots of caveats
- Scalability is a HUGE issue
- Exploring many solutions for hosting images
- Security issues in running untrusted code
- Persistence and state
- Skills -- finding people who are familiar with these tools is challenging
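As one illustration of the security and state caveats, the sketch below builds a more restrictive `docker run` command. It is an assumption-laden example, not what Pyxie does: `sandboxed_run_args` is a hypothetical helper, the limits are arbitrary, and several of these flags come from later Docker releases than the one used for this talk. Resource limits and dropped capabilities reduce what untrusted code can do, but they are not a complete answer to the problem.

```python
def sandboxed_run_args(image, host_port, memory="512m", cpus="1.0", pids=256):
    """Build a docker run command with basic resource and privilege limits."""
    return [
        "docker", "run", "-d",
        "--rm",                                  # discard container state on exit
        "--memory", memory,                      # cap RAM
        "--cpus", cpus,                          # cap CPU share
        "--pids-limit", str(pids),               # cap process count
        "--cap-drop", "ALL",                     # drop Linux capabilities
        "--security-opt", "no-new-privileges",   # block privilege escalation
        "-p", "{}:8888".format(host_port),
        image, "./start.sh",
    ]


if __name__ == "__main__":
    print(" ".join(sandboxed_run_args("odewahn/jem-tutorial", 49153)))
```

Note that `--rm` throws away everything the user did when the container exits, so this sketch trades persistence for isolation.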
For more Info
- This presentation is at http://bit.ly/docker-at-oreilly
- The source is at https://github.com/odewahn/docker-at-oreilly
- Email me directly at [email protected]
- While you're there, check out my Distributed Development Field Guide
A quick Survey
Questions / Comments
