Deploy Flexdashboard on Github Pages with Github Actions and Docker
Work in progress, pre-spelling check...
This repo provides a step-by-step guide and a template for deploying and refreshing a flexdashboard dashboard on Github Pages with Docker and Github Actions.
TODO
- Set docker environment ✅
- Set Github Pages workflow
- Build an example dashboard
- Set automation with Github Actions
- Create documentation
Folder structure
.
├── README.md
├── dataviz_prototype.qmd
├── diagrams
│ └── deployment architect.drawio
├── docker
│ ├── Dockerfile
│ ├── build_docker.sh
│ ├── install_packages.R
│ ├── install_python.R
│ ├── install_quarto.sh
│ └── packages.json
├── docker-compose.yml
└── images
├── dash_prototype01.png
├── dash_prototype02.png
├── deployment architect.png
├── flexdashboard_example.png
├── rstudio01.png
├── rstudio02.png
├── rstudio03.png
└── wip.png
Motivation
As its name implies, the flexdashboard package provides a flexible framework for creating dashboards. It is part of the Rmarkdown ecosystem, and it has the following features:
- Simple
- Set the dashboard layout with the use of rows and columns format
- Customize the dashboard theme using CSS or the bslib package
- Use built-in widgets such as value boxes and gauges
- Create interactive (and serverless) dashboards leveraging R data visualization tools (e.g., Plotly, highcharter, dygraphs, leaflet, etc.), tables (gt, reactable, reactablefmtr, kable, etc.), and htmlwidgets tools such as crosstalk.
- Build dynamic dashboards with Shiny
This tutorial will focus on deploying flexdashboard to Github Pages and automating the dashboard data refresh with Github Actions and Docker. Github and Docker offer both enterprise and free tools. Throughout this tutorial, we will leverage the free versions.
When to use Github Actions?
Github Actions is a CI/CD tool enabling scheduling and triggering jobs (or scripts). In the context of R, here are some useful use cases:
- Package testing - Triggering R CMD Check when pushing new code (see this example)
- Data automation - Build data pipelines with Rmarkdown or pull data from APIs
- Refresh data, rerender flexdashboard and redeploy on Github Pages (see coronavirus and covid19italy packages supporting dashboards)
Why Docker?
Docker is a containerization tool, widely used in CI/CD workflows, that enables seamless code deployment from dev to prod. By creating OS-level virtualization, it can package an application and its dependencies in a virtual container. In other words, the code that was developed and tested in the dev environment will run in the exact same environment (e.g., the same OS, compilers, packages, and other dependencies) on prod. Docker can run natively on Linux systems and with Docker Desktop (or equivalent) on macOS and Windows.
Docker + R = ❤️❤️❤️
Docker is a great tool for automating tasks in R, in particular when deploying R code with Github Actions (e.g., R CMD Check, Rmarkdown, Quarto, or Flexdashboard). In this tutorial, we will build a development environment, use it to build the dashboard, and then leverage the same environment to deploy the dashboard with Github Actions. There are two main approaches for developing with Docker in R:
- RStudio server
- VScode
We will cover the two and discuss the pros and cons of each approach.
Workflow
A typical workflow will include the following steps:
- Scope - define the project requirements and derive dependencies
- Prototype - transform the scope into a sketch
- Dockerize - set initial development environment
- Develop - build the dashboard functionality and data visualization
- Deploy - push the dashboard to Github Pages
- Automate - build the dashboard refresh with Github Actions
Typically, you may need to update the Docker image throughout the development process if additional requirements (or dependencies) beyond the initial scope are needed.
Dashboard scope
Create a worldwide COVID19 tracker which will include:
- Distribution of confirmed cases by continent
- Cases distribution by country:
- Confirmed
- Death
Expected dependencies:
- Dashboard - flexdashboard
- Data - coronavirus
- Data visualization - highcharter
- Utility - dplyr, tidyr, lubridate
Dashboard prototype
After setting a clear scope, I find it useful to prototype and put my thoughts on a piece of paper, draw.io, iPad, or any other tool you find useful. The goal is to translate the scope into some sketches to understand the data inputs, required transformations, type of visualization, etc. In addition, a narrow scope with a good prototype will potentially save you some time and cycles when starting to code the dashboard. That being said, you should stay open-minded to changes in the final output, as what may look nice on the sketch may turn out less appealing in the final output.
![](https://github.com/RamiKrispin/deploy-flex-actions/raw/main/images/dash_prototype01.png)
As the focus of this tutorial is on the deployment itself and not on the data visualization, we will keep the dashboard simple and create the following three plots:
- Daily new cases (either by continent or worldwide), using a scatter plot with a trend line
- Daily death cases (either by continent or worldwide), using a scatter plot with a trend line
- Distribution of cases by country, using a treemap plot
We will leverage the highcharter package to create those plots.
![](https://github.com/RamiKrispin/deploy-flex-actions/raw/main/images/dash_prototype02.png)
Once we have defined the scope and have a simple prototype, we better understand the dashboard requirements (e.g., data, packages, etc.), and we can start thinking about the dashboard's general architecture. The diagram below describes the general workflow we will use to create the dashboard:
- Data source - coronavirus package
- Orchestration - Github Actions
- Environment - Docker
- Deployment - Github Pages
![](https://github.com/RamiKrispin/deploy-flex-actions/raw/main/images/deployment%20architect.png)
Set Docker environment
There are multiple approaches for setting a Docker environment with the Dockerfile. My approach is to minimize the Dockerfile by using utility files and automating the process with a bash script. This makes the Dockerfile cleaner and yields a smaller image with fewer layers. Below is the tree of the docker folder in this tutorial:
.
├── Dockerfile
├── build_docker.sh
├── install_packages.R
├── install_python.R
├── install_quarto.sh
└── packages.json
This includes the following six files:
- Dockerfile - the image manifest, which provides a set of instructions for the Docker engine about how to build the image
- build_docker.sh - a bash script that automates building the image and pushing it to Docker Hub
- install_packages.R - an R script that installs the R dependencies of the project as set in the packages.json file
- install_python.R - an R script that installs the Python dependencies of the project as set in the packages.json file
- install_quarto.sh - a bash script that installs Quarto
- packages.json - a JSON file with a list of the project packages and their versions
Before diving into more details, let's review the Dockerfile.
The Dockerfile
The Dockerfile provides a set of instructions for the Docker engine to build the image. You can think about it as the image's recipe. It has its own unique and intuitive syntax following this structure:
COMMAND some instructions
Docker can build images automatically by reading the instructions from a Dockerfile. In this tutorial, we will use the following Dockerfile:
# Pulling Rocker image with RStudio and R version 4.2
FROM rocker/rstudio:4.2
# Setting environment variables
ARG CONDA_ENV=flex_dashboard
ENV CONDA_ENV=$CONDA_ENV
ARG PYTHON_VER=3.8
ENV PYTHON_VER=$PYTHON_VER
ARG QUARTO_VERSION=1.1.149
ENV QUARTO_VERSION=$QUARTO_VERSION
# Disabling the authentication step
ENV USER="rstudio"
CMD ["/usr/lib/rstudio-server/bin/rserver", "--server-daemonize", "0", "--auth-none", "1"]
# Install jq to parse json files
RUN apt-get update && apt-get install -y --no-install-recommends \
    jq \
    libxml2-dev \
    zlib1g \
    g++-11 \
    libz-dev \
    freetype2-demos \
    libpng-dev \
    libtiff-dev \
    libjpeg-dev \
    make \
    fontconfig \
    libfribidi-dev \
    libharfbuzz-dev \
    libfontconfig1-dev \
    && rm -rf /var/lib/apt/lists/*
# installing R packages
RUN mkdir packages
COPY install_packages.R packages/
COPY install_python.R packages/
COPY packages.json packages/
RUN Rscript packages/install_packages.R
# Installing Quarto
COPY install_quarto.sh packages/
RUN bash packages/install_quarto.sh $QUARTO_VERSION
# Install miniconda
RUN sudo apt update && apt-get install -y --no-install-recommends \
    software-properties-common \
    && sudo add-apt-repository -y ppa:deadsnakes/ppa \
    && sudo apt update
RUN wget --quiet https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda.sh \
    && /bin/bash ~/miniconda.sh -b -p /opt/conda \
    && export PATH=/opt/conda/bin:$PATH \
    && conda init bash \
    && conda install conda-build
# Set environment
RUN . /root/.bashrc \
    && conda create -y --name $CONDA_ENV python=$PYTHON_VER
RUN echo "conda activate $CONDA_ENV" >> ~/.bashrc
RUN Rscript packages/install_python.R
EXPOSE 8787
The above Dockerfile has the following components:
- Base image - We will use the rocker/rstudio:4.2 image as the base image for this project. This image contains R version 4.2.0 and the RStudio server and will be used as the development environment.
- Setting arguments and environment variables - We will use the ARG command to set the build arguments and the ENV command to set the environment variables. The main distinctions between ARG and ENV are:
  - The ARG command enables the user to assign variables dynamically during the build time.
  - The ENV command sets environment variables that are available both during the build and at run time (as opposed to ARG variables, which are available only during the build time).
  - One way to set dynamic environment variables is to set them first as arguments and then assign them to environment variables. Not sure if this is the best practice, but it is convenient to have some of the arguments available after the build time, mainly for debugging.
  To run R in VScode, we will use radian, an alternative R console with multiline editing and rich syntax highlighting. radian is Python-based; therefore, we will use Conda to set a Python environment. The CONDA_ENV and PYTHON_VER arguments will be used to set the Conda environment name and its Python version. In addition, we will use the QUARTO_VERSION argument to set the Quarto version.
- Disabling the authentication - By default, the RStudio server requires a user name and password. We will use the ENV command to define the environment variable USER and set it as rstudio, and the CMD command to disable the authentication step.
- Installing dependencies - Generally, rocker images will have most of the Debian packages, C/C++ compilers, and other dependencies. However, you may often need to install additional requirements based on the packages you add to the image. In our case, we will use the RUN command to install jq, a command line tool for parsing JSON files, and the libxml2 Debian package that is required to install the lubridate package.
- Installing the R packages - To install additional R packages, we will make a new directory inside the image called packages and copy the install_packages.R and packages.json files that will be used to install the required R packages.
- Install Quarto - We will use the install_quarto.sh bash script to install Quarto. We will use the QUARTO_VERSION argument to set the version (1.1.149).
- Set Python environment - We will set a Python environment with Conda and then install radian. This step is required if you are planning to develop with VScode (or down the road when flexdashboard will be available for Python, Julia, and Observable with Quarto).
- Expose port - Last but not least, we will use the EXPOSE command to expose port 8787 (default) for the RStudio server (as set on the base docker).
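Because the ENV variables persist at run time, one way to confirm what was baked into an image is to print a variable from a throwaway container. The following is a minimal sketch, not part of the repository: image_env is a hypothetical helper name, and it assumes the image has already been built or pulled.

```shell
# image_env is a hypothetical helper (not part of this repo).
# Usage: image_env IMAGE VARIABLE
image_env() {
  # --rm removes the container when it exits.
  # --entrypoint replaces the image's startup command with printenv,
  # so the RStudio server is not started for this check.
  docker run --rm --entrypoint printenv "$1" "$2"
}
```

For example, `image_env rkrispin/flex_dash_env:dev.0.0.0.9000 QUARTO_VERSION` should print the Quarto version that was passed at build time.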
We will define all required packages and their versions in the packages.json file:
{
  "packages": [
    { "package": "cpp11", "version": "0.4.2" },
    { "package": "flexdashboard", "version": "0.5.2" },
    { "package": "dplyr", "version": "1.0.9" },
    { "package": "tidyr", "version": "1.2.0" },
    { "package": "highcharter", "version": "0.9.4" },
    { "package": "readr", "version": "2.1.2" },
    { "package": "coronavirus", "version": "0.3.32" },
    { "package": "lubridate", "version": "1.8.0" },
    { "package": "languageserver", "version": "0.3.13" }
  ],
  "debug_mode": [
    { "package": "ragg", "version": "1.2.2" }
  ],
  "python_packages": [
    { "package": "radian", "version": "0.6.3" }
  ]
}
Note that the JSON file has three sections:
- packages - defines the R packages to install from CRAN (or CRAN archive)
- debug_mode - used when some packages "refuse" to install; it enables a quick debugging mode to track the error (or identify which Debian packages are missing...)
- python_packages - defines the Python packages to install on the conda environment
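To get a quick overview of what the packages section declares, the file can be queried with jq, which the Dockerfile installs for parsing JSON. This is only an illustrative sketch: list_r_packages is a hypothetical helper, and it does not reflect how install_packages.R itself reads the file (that script is not shown here).

```shell
# list_r_packages is a hypothetical helper (not part of this repo).
# It reads the JSON from stdin and prints one "package@version" line
# for every entry in the "packages" section.
list_r_packages() {
  jq -r '.packages[] | "\(.package)@\(.version)"'
}
```

From the docker folder, it could be used as `list_r_packages < packages.json`. Swapping `.packages` for `.python_packages` in the filter would list the Python section instead.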
To build the Docker image, we will use the build_docker.sh file, which builds the image and pushes it to Docker Hub:
#!/bin/bash
echo "Build the docker"
docker build . --progress=plain \
    --build-arg QUARTO_VERSION=1.1.149 \
    --build-arg CONDA_ENV=flex_dashboard \
    --build-arg PYTHON_VER=3.8 \
    -t rkrispin/flex_dash_env:dev.0.0.0.9000
if [[ $? = 0 ]] ; then
    echo "Pushing docker..."
    docker push rkrispin/flex_dash_env:dev.0.0.0.9000
else
    echo "Docker build failed"
fi
This bash script simply builds the Docker image and tags it as rkrispin/flex_dash_env:dev.0.0.0.9000, and then, if the build was successful, pushes it to Docker Hub. As you can see in the docker build command, we are using the --build-arg argument to define the Dockerfile arguments. To execute this script from the command line:
bash build_docker.sh
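If you want to change a build argument without editing the script, one option is to read each value from an environment variable and fall back to the tutorial's defaults. The sketch below is an assumption about how that could look, not the repository's script; build_args is a hypothetical helper name.

```shell
# build_args is a hypothetical helper (not part of this repo).
# It prints the --build-arg flags, using the defaults from build_docker.sh
# unless a variable of the same name is already set in the environment.
build_args() {
  quarto="${QUARTO_VERSION:-1.1.149}"
  conda_env="${CONDA_ENV:-flex_dashboard}"
  python_ver="${PYTHON_VER:-3.8}"
  printf '%s\n' "--build-arg QUARTO_VERSION=$quarto --build-arg CONDA_ENV=$conda_env --build-arg PYTHON_VER=$python_ver"
}
```

It could then be used as `docker build . --progress=plain $(build_args) -t rkrispin/flex_dash_env:dev.0.0.0.9000`. The substitution is left unquoted on purpose so the shell splits it into separate flags, which assumes the values contain no spaces.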
Launching the development environment
There are multiple methods to spin a Docker image into a running container. Before going to the more robust method using docker-compose, let's review the basic method with the run command:
docker run -d -p 8787:8787 rkrispin/flex_dash_env:dev.0.0.0.9000
The docker run command (or run in short) enables you to launch a container. In the above example, we used the following arguments:
- -d (or detach mode) runs the container in the background
- -p maps between the container and the local machine ports, where the port to the right of the : symbol is the port exposed on the container and the port to the left is the port on the local machine. In the above example, we mapped port 8787 on the container to port 8787 on the local machine
We close the run command with the name of the image we want to launch.
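The direction of the -p mapping matters when port 8787 is already taken on your machine: only the left side changes, and the browser URL follows it. The sketch below illustrates this with a hypothetical helper, rstudio_url, which assumes the simple two-part HOST:CONTAINER form.

```shell
# rstudio_url is a hypothetical helper (not part of this repo).
# Given a HOST:CONTAINER mapping as passed to -p, print the URL to open.
rstudio_url() {
  # Everything left of the first colon is the port on the local machine.
  host_port="${1%%:*}"
  printf 'http://localhost:%s\n' "$host_port"
}
```

For example, after `docker run -d -p 8080:8787 rkrispin/flex_dash_env:dev.0.0.0.9000`, calling `rstudio_url 8080:8787` prints `http://localhost:8080`, while RStudio still listens on 8787 inside the container.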
Note: If you got the following error, check if your Docker desktop is open:
docker: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?.
See 'docker run --help'.
If the image is unavailable locally, Docker will try to pull it from the default hub (make sure you are logged in; it might take a few minutes to download). If the container was successfully launched, the command returns the container ID, for example:
ac26ec61e71bc570a2ed769ba2b0dbef964d536f7d7cc51b61ea3e8542953cb1
You can use the docker ps command to check if the container is running:
docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
ac26ec61e71b rkrispin/flex_dash_env:dev.0.0.0.9000 "/usr/lib/rstudio-se…" 4 minutes ago Up 4 minutes 0.0.0.0:8787->8787/tcp sweet_elion
Now you can go to your browser and use http://localhost:8787 to access the RStudio server:
![](https://github.com/RamiKrispin/deploy-flex-actions/raw/main/images/rstudio01.png)
Is this sufficient to start developing our dashboard? The answer is NO!
We have a functional environment, yet we are still missing a couple of elements to make this container fully functional as a development environment. For example, although we can access the container from the browser, it is still an isolated environment, as we can't save or commit changes to the code. Let's add the -v argument to mount a local volume to the container. This will enable you to work inside the container and read and write data from your local machine. If the container is already running, use the docker kill command (yes, not the best wording for a command...) followed by the container ID (see the docker ps output for the container ID) to stop the running container:
docker kill ac26ec61e71b
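Copying the container ID by hand gets tedious, so here is a small sketch that extracts it from the docker ps output by image name. The helper name container_ids_for_image is hypothetical, and the approach assumes the default docker ps table layout, where the image is the second column.

```shell
# container_ids_for_image is a hypothetical helper (not part of this repo).
# It reads `docker ps` output from stdin and prints the ID of every
# container whose IMAGE column matches the first argument exactly.
container_ids_for_image() {
  # NR > 1 skips the header row; $2 is the IMAGE column, $1 the container ID.
  awk -v image="$1" 'NR > 1 && $2 == image { print $1 }'
}
```

It could be combined with the kill command as `docker kill $(docker ps | container_ids_for_image rkrispin/flex_dash_env:dev.0.0.0.9000)`. Docker also offers a built-in filter, `docker ps -q --filter ancestor=IMAGE`, which avoids parsing the table.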
Let's repeat the previous command and add the -v argument to mount your local folder to the container:
docker run -d -p 8787:8787 -v $TUTORIAL_WORKING_DIR:/home/rstudio/flexdash rkrispin/flex_dash_env:dev.0.0.0.9000
You can see now, after relaunching the container and refreshing the browser, that the flexdash folder (marked with a green rectangle) is available inside the container:
![](https://github.com/RamiKrispin/deploy-flex-actions/raw/main/images/rstudio02.png)
Note that $TUTORIAL_WORKING_DIR is an environment variable that I set with the local folder path on my machine, /home/rstudio/ is the home folder of the rstudio user on the container, and flexdash is the name of the mounted folder inside the container. To run it on your local machine, replace YOUR_LOCAL_PATH in the following example with your local folder path and FOLDER_NAME with the name you want to use for the mounted volume inside the container:
docker run -d -p 8787:8787 -v YOUR_LOCAL_PATH:/home/rstudio/FOLDER_NAME rkrispin/flex_dash_env:dev.0.0.0.9000
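A common stumbling block is passing a folder that does not exist or a path that Docker does not read as a local folder. The sketch below builds the -v value from an existing folder and resolves it to an absolute path, which avoids ambiguity with named volumes. The helper name mount_arg is hypothetical.

```shell
# mount_arg is a hypothetical helper (not part of this repo).
# Usage: mount_arg LOCAL_FOLDER FOLDER_NAME
# Prints "ABSOLUTE_LOCAL_PATH:/home/rstudio/FOLDER_NAME" and
# returns a non-zero status if the local folder does not exist.
mount_arg() {
  # cd into the folder in a subshell and print its absolute path.
  local_path="$(cd "$1" 2>/dev/null && pwd)" || return 1
  printf '%s:/home/rstudio/%s\n' "$local_path" "$2"
}
```

From the project folder, this could be used as `docker run -d -p 8787:8787 -v "$(mount_arg . flexdash)" rkrispin/flex_dash_env:dev.0.0.0.9000`.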
Is this sufficient to start developing our dashboard? Technically, yes: we can now develop and test our code inside the container and save the changes in the local folder (and commit the changes with git). But before we continue, let's mount our local RStudio config folder to the one on the container. This will mirror your local RStudio settings to the RStudio server running inside the container:
docker run -d -p 8787:8787 \
    -v YOUR_LOCAL_PATH:/home/rstudio/FOLDER_NAME \
    -v $RSTUDIO_CONFIG_PATH:/home/rstudio/.config/rstudio \
    rkrispin/flex_dash_env:dev.0.0.0.9000
Now, I have the same settings inside the container (e.g., color theme, code snippets, etc.):
![](https://github.com/RamiKrispin/deploy-flex-actions/raw/main/images/rstudio03.png)
Note: Your local RStudio settings folder should be, by default, under your home folder; for example, the path on my machine is /Users/ramikrispin/.config/rstudio.
As you add more elements to the docker run command, it becomes convoluted to run it each time you want to spin up the container. The docker-compose command provides a more concise method to launch a Docker container: it uses the docker-compose.yml file to set the docker run arguments, the docker-compose up command to launch the container, and the docker-compose down command to turn it off. Following the above example, here is how we customize those options with docker-compose:
docker-compose.yml:
version: "3.9"
services:
  rstudio:
    image: "$FLEX_IMAGE"
    ports:
      - "8787:8787"
    volumes:
      - type: "bind"
        source: "$TUTORIAL_WORKING_DIR"
        target: "/home/rstudio"
      - type: "bind"
        source: "$RSTUDIO_CONFIG_PATH"
        target: "/home/rstudio/.config/rstudio"
Once you understand how docker run works, it is straightforward to understand, set, and modify the above docker-compose.yml file according to your needs. As before, we set the image, ports, and volumes in the corresponding sections of the yaml file. Note that I am using three environment variables to set the Docker image (FLEX_IMAGE), the local folder to mount (TUTORIAL_WORKING_DIR), and the RStudio config folder (RSTUDIO_CONFIG_PATH). Typically, this file is saved in the project/repository root folder. To launch the container, run the following on the command line from the folder that contains the file:
docker-compose up -d
Like before, we added the detach argument -d to keep the terminal free after launching the container. When you are done with the container, you can turn it off by using:
docker-compose down
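Since the docker-compose.yml file relies on three environment variables, launching with one of them unset is an easy mistake to make. The following sketch checks them before launch; check_compose_env is a hypothetical helper, and the values in the usage note are placeholders to replace with your own.

```shell
# check_compose_env is a hypothetical helper (not part of this repo).
# It returns 0 when all three variables used by docker-compose.yml are
# set and non-empty; otherwise it names the missing ones and returns 1.
check_compose_env() {
  missing=0
  for name in FLEX_IMAGE TUTORIAL_WORKING_DIR RSTUDIO_CONFIG_PATH; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A possible usage is to export the variables first (for example, `export FLEX_IMAGE=rkrispin/flex_dash_env:dev.0.0.0.9000`, plus your own paths for the other two) and then run `check_compose_env && docker-compose up -d`. Note that the function only checks that the variables are set in the shell; they still need to be exported for docker-compose to see them.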
Customize the image
If the above packages (in the packages.json file) meet your requirements, then you are good to go and can start developing (with minimal effort in setting your global environment variables). If you have additional or different requirements, you can update the packages.json file according to your environment requirements and re-build the Docker image using the build_docker.sh file. The only caveat is that for some packages, you may need to install additional Debian packages and update the Dockerfile accordingly.
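Editing packages.json by hand works fine; as an alternative, the sketch below appends an entry to the packages section with jq. The helper name add_r_package is hypothetical, and the package name and version in the usage note are placeholders rather than a recommendation.

```shell
# add_r_package is a hypothetical helper (not part of this repo).
# Usage: add_r_package PACKAGE VERSION < packages.json
# Prints the JSON from stdin with one more entry in the "packages" section.
add_r_package() {
  jq --arg pkg "$1" --arg ver "$2" \
    '.packages += [{package: $pkg, version: $ver}]'
}
```

Because the function writes to stdout, a possible usage is `add_r_package plotly 4.10.0 < packages.json > packages.tmp && mv packages.tmp packages.json`, followed by `bash build_docker.sh` to re-build the image.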
Setting VScode environment
TODO...
Prototype the dashboard data visualization
I find it useful, before starting to build the dashboard, to prototype the data visualization in an Rmarkdown (and now Quarto) document. From this step onward, all the dashboard development and testing will be done inside the development container, either with RStudio Server or VScode.