Dockerfile for Reproducibility (and Binder)

Open westurner opened this issue 8 years ago • 2 comments

A Dockerfile to build and host these notebooks would be helpful.

There are cookiecutter templates for creating git repos w/ .gitignore and a Makefile and ... for these types of projects:

  • http://cookiecutter.readthedocs.io/en/latest/readme.html#reproducible-science
  • http://cookiecutter.readthedocs.io/en/latest/readme.html#data-science

There are also Docker containers which make it easy to launch a complete, consistent software environment:

  • https://github.com/Kaggle/docker-python
    • https://github.com/Kaggle/docker-python/blob/master/Dockerfile
  • https://github.com/jupyter/docker-stacks
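
As a rough illustration of how one of the docker-stacks images could serve as a base, here is a minimal sketch. It assumes a `requirements.txt` and a `Notebooks/` directory sit next to the Dockerfile (both names are hypothetical here), and it does not cover compiling BlockSci itself. `NB_USER`, `NB_UID`, and `NB_GID` are environment variables provided by the docker-stacks base images.

```dockerfile
# Sketch only: extend a docker-stacks image with project-specific packages.
# In practice, pin a specific image tag instead of relying on the default.
FROM jupyter/scipy-notebook

# requirements.txt is an assumed file listing extra Python packages.
COPY --chown=${NB_UID}:${NB_GID} requirements.txt /tmp/requirements.txt
RUN pip install --no-cache-dir -r /tmp/requirements.txt

# Notebooks/ is an assumed directory; copy it into the default user's work folder.
COPY --chown=${NB_UID}:${NB_GID} Notebooks/ /home/${NB_USER}/work/
```

Built and run with port 8888 published, an image like this should start the notebook server that the base image already configures.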

Data Visualization w/ Jupyter Notebooks:

  • qgrid is one way to review DataFrames with a GUI widget in a Jupyter Notebook: https://github.com/quantopian/qgrid

Python & Jupyter resources:

  • https://github.com/quobit/awesome-python-in-education/#data-science
  • https://github.com/quobit/awesome-python-in-education/#jupyter
    • Binder makes reproducibility with Git, Docker, and JupyterHub really easy:
      • Src: https://github.com/jupyterhub/binderhub
      • Docs: https://binderhub.readthedocs.io/en/latest/
    • JupyterHub makes hosting Jupyter Notebook instances (with e.g. GitHub Auth) within Docker containers managed by Kubernetes very easy.
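
For Binder specifically, a repository's Dockerfile is expected to run the notebook as a non-root user whose name and UID are passed in as build arguments. The following is a minimal sketch of that pattern, modeled on the repo2docker documentation. The base image and the defaults `jovyan` and `1000` are assumptions for illustration, and the BlockSci build steps are left out.

```dockerfile
# Sketch only: the user setup Binder expects from a custom Dockerfile.
# Binder requires a pinned base image tag (not "latest").
FROM python:3.9-slim

# The notebook package must be installed in the image.
RUN pip install --no-cache-dir notebook

# Binder supplies NB_USER and NB_UID at build time; the defaults are illustrative.
ARG NB_USER=jovyan
ARG NB_UID=1000
ENV USER=${NB_USER} \
    NB_UID=${NB_UID} \
    HOME=/home/${NB_USER}

RUN adduser --disabled-password \
    --gecos "Default user" \
    --uid ${NB_UID} \
    ${NB_USER}

# Copy the repository contents into the user's home and hand over ownership.
COPY . ${HOME}
RUN chown -R ${NB_UID} ${HOME}

USER ${NB_USER}
WORKDIR ${HOME}
```

Binder launches the notebook server itself, so no `CMD` is set in this sketch.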

westurner, Oct 27 '17 03:10

Either my Docker machine has too little RAM, so this fails to compile, or I've done something wrong. Nevertheless, here's a Dockerfile I quickly scripted earlier today:

FROM ubuntu:bionic
LABEL maintainer="Haaroon Yousaf (h.yousaf [at] ucl.ac.uk)"
RUN apt-get update && apt-get install -y software-properties-common python3-software-properties
RUN add-apt-repository ppa:ubuntu-toolchain-r/test -y && apt-get update
# Build toolchain and library dependencies for BlockSci
RUN apt-get install -y cmake libtool autoconf libboost-filesystem-dev libboost-iostreams-dev \
libboost-serialization-dev libboost-thread-dev libboost-test-dev libssl-dev libjsoncpp-dev \
libcurl4-openssl-dev libjsonrpccpp-dev libsnappy-dev zlib1g-dev libbz2-dev \
liblz4-dev libzstd-dev libjemalloc-dev libsparsehash-dev python3-dev python3-pip git gcc-7 \
clang-5.0 g++-7
RUN pip3 install matplotlib numpy pandas jupyter jupyter-core
WORKDIR /root/
RUN git clone https://github.com/citp/BlockSci.git
WORKDIR /root/BlockSci
RUN mkdir -p /root/BlockSci/release
RUN mkdir -p /root/data
WORKDIR /root/BlockSci/release
RUN CC=gcc-7 CXX=g++-7 cmake -DCMAKE_BUILD_TYPE=Release ..
RUN make && make install
WORKDIR /root/BlockSci/
RUN CC=gcc-7 CXX=g++-7 pip3 install -e blockscipy
WORKDIR /root/BlockSci/Notebooks
EXPOSE 8888
VOLUME ["/root/data"]
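# Listen on all interfaces so the exposed port is reachable from the host; running as root needs --allow-root
CMD ["jupyter", "notebook", "--ip=0.0.0.0", "--port=8888", "--no-browser", "--allow-root"]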

Haaroon, May 29 '18 22:05

Here's my Dockerfile and instructions: GitHub. It contains a Dockerfile that covers all the little problems I came across along the way. It might change down the line. Hope it helps somebody. And if for any reason you build it on Windows: Don't. Forget. To. Assign. Resources. And also watch out for file permissions. Or do it in Linux right away :)

priordice, Jan 22 '19 02:01