Make Paperbots properly CI/CD-able
How do you deploy the application at the moment? (You mentioned docker-compose in the PR.) I was not able to find any Dockerfiles in the repository.
A setup I currently use for webapps is one container based on Nginx for serving all static content, and a second one running the Java-based webservice.
In addition, I could imagine using git flow for organising the deployments, e.g. every push to master deploys to the "production" environment while develop is kept in sync with a staging environment.
Automating the deployments should not be that hard (using Travis/Docker Hub).
The Dockerfiles aren't in the repository as they contain some passwords (yes, I know...). With one of the recent commits that has almost changed, save for the MySQL credentials, for which I still need to find a solution. Once I've resolved that, I'll add a docker/ folder here with the docker-compose.yml along with the Dockerfile for the Java server.
The way things are currently setup:
- I have a Hetzner server that serves all my websites.
- Each website is a set of containers defined via docker-compose.
- In front of all those containers sits a single Nginx container that a) makes sure SSL certs are created and renewed and b) forwards to the relevant container based on the incoming request's domain (sketched right after this list).
- The Paperbots containers are:
  - An Nginx container that logs requests and errors to files in a persistent volume, and forwards all requests to the Java server (including requests for static files).
  - A MySQL container. The data files are mapped to a persistent volume.
  - The Java server container, which is stateless. The configuration of the server (email credentials, DB credentials) comes from a .json file mapped into the container.
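The front proxy itself isn't part of this repo, but the VIRTUAL_HOST/LETSENCRYPT_* variables in the compose file below suggest it's the usual jwilder/nginx-proxy plus Let's Encrypt companion combo. Under that assumption, a minimal sketch of the proxy setup might look roughly like this (image names are real; everything else is illustrative):

```yaml
# Hypothetical sketch of the shared front proxy (not in this repo).
# Per-site compose files join its network as the external "nginx-proxy".
version: "3"
services:
  nginx-proxy:
    image: jwilder/nginx-proxy
    container_name: nginx-proxy
    restart: always
    ports:
      - "80:80"
      - "443:443"
    volumes:
      # read-only socket: watches containers and routes by VIRTUAL_HOST
      - /var/run/docker.sock:/tmp/docker.sock:ro
      - certs:/etc/nginx/certs
      - vhost:/etc/nginx/vhost.d
      - html:/usr/share/nginx/html
  letsencrypt:
    image: jrcs/letsencrypt-nginx-proxy-companion
    restart: always
    environment:
      NGINX_PROXY_CONTAINER: nginx-proxy
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - certs:/etc/nginx/certs
      - vhost:/etc/nginx/vhost.d
      - html:/usr/share/nginx/html
volumes:
  certs:
  vhost:
  html:
```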
Here's the docker-compose.yml for the entire site, with MySQL credentials redacted.
```yaml
version: "3"
services:
  web:
    image: nginx:1.13.12
    container_name: paperbots_nginx
    restart: always
    volumes:
      - ./nginx.conf:/etc/nginx/conf.d/default.conf
      - ./data/web:/www
      - ./data/logs:/logs
    environment:
      VIRTUAL_HOST: paperbots.io,www.paperbots.io
      LETSENCRYPT_HOST: paperbots.io,www.paperbots.io
      LETSENCRYPT_EMAIL: "[email protected]"
  site:
    build:
      dockerfile: Dockerfile.site
      context: .
    container_name: paperbots_site
    restart: always
    volumes:
      - ./paperbots.json:/etc/paperbots/paperbots.json
    environment:
      PAPERBOTS_CONFIG: /etc/paperbots/paperbots.json
  mysql:
    image: mysql:5.7.22
    container_name: paperbots_mysql
    restart: always
    environment:
      - MYSQL_ROOT_PASSWORD=XXXXXX
      - MYSQL_DATABASE=paperbots
    volumes:
      - ./data/mysql:/var/lib/mysql

networks:
  default:
    external:
      name: nginx-proxy
```
The Nginx config simply forwards to the Java server.
```nginx
server {
    listen 80;
    index index.php index.html;
    server_name www.paperbots.io paperbots.io;

    error_log /logs/error.log;
    access_log /logs/access.log;
    root /www;

    # Let the nginx-proxy give us the real IP,
    # see https://github.com/jwilder/nginx-proxy/issues/130
    real_ip_header X-Forwarded-For;
    real_ip_recursive on;
    set_real_ip_from 0.0.0.0/0;

    # Website requests go to the Java app,
    # which serves all assets
    location / {
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Host $host:$server_port;
        proxy_set_header X-Forwarded-Server $host;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_pass http://site:8001;
    }
}
```
For the Java server, an image is built using the following Dockerfile:
```dockerfile
FROM openjdk:slim
WORKDIR /

# Install Git and Maven, clone the repo, and pre-build the server
# (this also populates the $HOME/.m2 cache inside the image)
RUN apt-get update && apt-get -y --force-yes install git maven && \
    git clone https://github.com/badlogic/paperbots && \
    cd /paperbots/server && mvn clean package -Dmaven.test.skip=true -DskipTests

CMD cd /paperbots/server && ./start.sh
```
This clones the repo, builds a single .jar, then starts the server by invoking the start.sh script.
If you feel like screaming, you can have a look at the start.sh script. It's an endless loop that:
- Pulls in the latest changes from Git
- Compiles the server .jar
- Executes the .jar, passing it the config file and the location of the static files
You might wonder why I build the .jar as part of the Dockerfile.site. Welp, that caches the $HOME/.m2 local Maven repository inside the image. Subsequent builds via the start.sh script only pull in artifacts that were newly added to the server's dependencies, or artifacts for which I changed the version in the pom.
This Frankenstein "runs" and serves all requests, and keeps running as long as I don't need to change any of the Docker-related configuration.
To deploy changes to the frontend code, I have another Frankenstein script called publish.sh. As you can see, it commits and pushes my local changes, then calls the api/reloadstatic endpoint of the server. This triggers a git pull server-side, which pulls in the new frontend files. Nothing needs to be restarted; the server will just serve the new files.
To deploy changes to the backend code, see the reload.sh script. It is analogous to the publish.sh script, but calls a different endpoint. That endpoint will shut the server down, which gives back control to the loop in the start.sh script.
The next loop iteration pulls in the latest changes from Git, recompiles the Java server, and starts it up again.
Yes, this is a bit insane. But I was/am short on time, and as a single person effort this was the quickest thing I could whip up while mostly keeping my sanity.
My ideal would be some sort of blue/green deployment: build and test a Docker image, tell the Hetzner server to fetch the new image and redeploy (gracefully). I know all the pieces of this puzzle except what tool I could use to listen for new images in the repository, pull them in, and redeploy (hence my Frankenstein script approach above).
Since I have a weekly deadline to get new features into Paperbots for the classes I give each Friday, I currently focus more on the feature side than on the "sane CI/CD" side. I'd be grateful for any suggestions regarding the CI/CD story! I won't be able to get to implementing them until after mid-December though :/
Thanks for the detailed description! It sounds a little unconventional, but most importantly: it works!
I guess I will start with building immutable containers for your Hetzner server that contain no secrets and are pushed to Docker Hub. This should be a fairly easy exercise and can then be reused for deployments to any hosting provider.
I added some configuration and a Dockerfile to build a Docker image (mbrugger/paperbots:latest) for each push to the develop branch: https://hub.docker.com/r/mbrugger/paperbots (no secrets stored in the image).
In Travis you need to define two environment variables:
- DOCKER_USERNAME
- DOCKER_PASSWORD

The image will be pushed to $DOCKER_USERNAME/paperbots:latest. A proper versioning strategy is still missing.
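For reference, a .travis.yml along those lines might look roughly like this (a minimal sketch, not necessarily the exact file in the PR; the build command and branch are illustrative):

```yaml
# Sketch: build the image on every commit, push it only on develop.
# DOCKER_USERNAME and DOCKER_PASSWORD live in the Travis repo settings.
language: java
services:
  - docker
script:
  - docker build -t "$DOCKER_USERNAME/paperbots:latest" .
deploy:
  provider: script
  script: >-
    echo "$DOCKER_PASSWORD" |
    docker login -u "$DOCKER_USERNAME" --password-stdin &&
    docker push "$DOCKER_USERNAME/paperbots:latest"
  on:
    branch: develop
```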
For local testing I also added a compose file in the repository.
I prefer environment variables for configuring containers (and they are easier to handle in compose and Kubernetes), therefore I wrapped the server startup in a small script that updates the configuration file from the environment. I would really consider replacing/extending the JSON file approach with environment variables.
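The service section in compose could then end up looking something like this (a sketch; the PAPERBOTS_* variable names are made up for illustration, the wrapper script maps whatever the real names are into the config file):

```yaml
# Sketch: configuring the server container purely through the environment.
site:
  image: mbrugger/paperbots:latest
  environment:
    PAPERBOTS_DB_HOST: mysql                        # hypothetical names
    PAPERBOTS_DB_PASSWORD: ${MYSQL_ROOT_PASSWORD}   # filled in from the host env / .env
    PAPERBOTS_EMAIL_PASSWORD: ${EMAIL_PASSWORD}
```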
What do you think about the approach of deploying the "production" system on a push to the master branch? I guess it could all easily be triggered through Travis or even a webhook in Docker Hub.
Let me know what you think and once I am done I will send you a pull request.
I just removed env config yesterday due to a recommendation by @dilbernd. Easy enough to add it back in.
This all looks great to me! I do have a few questions, since I'm a complete devops beginner:
- How would you store secrets to be passed to the container via the environment?
- How would you pass that partially secret environment to docker-compose when starting up the containers?
- How would triggering a redeploy of the master image on the machine the containers run on work in practice? Is there some service/cron job that scans the registry for a new image, tears down the old containers, and starts the new ones?
Thanks for helping with this!
Given the limitations/scale of the current project (a single shared server), any container orchestration like Kubernetes would be overkill. Therefore let's stay with the tools at hand.
> How would you store secrets to be passed to the container via the environment?
Simplest solution: maintain the docker-compose.yml on the server (maintenance is cumbersome, but at least the secrets never leave the server).
A little more flexible: maintain a separate private Git repository containing the secrets (still dangerous!).
And of course there are dedicated services/tools to handle secrets (e.g. HashiCorp Vault), or look at the means provided by Kubernetes.
> How would you pass that partially secret environment to docker-compose when starting up the containers?
Answer for this project: define the variables in the compose file directly, or include them through env_file.
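For example (a minimal sketch; paperbots.env is a hypothetical name for a file that lives only on the server):

```yaml
# Sketch: pulling secrets from a server-local env file.
site:
  image: mbrugger/paperbots:latest
  env_file:
    - ./paperbots.env   # lines of VAR=value, never committed, chmod 0600
```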
> How would triggering a redeploy of the master image on the machine the containers run on work in practice? Is there some service/cron job that scans the registry for a new image, tears down the old containers, and starts the new ones?
It depends on your security requirements... (no easy answer here either; Kubernetes provides integration points for this problem as well).
The simplest thing for a small-scale project would be to trigger the redeployment from the CI platform. As we only have Travis, I am not sure you would like to share secrets with Travis so it can SSH into your server to pull/restart the application. (I usually use a "push" approach with a private Jenkins instance.)
I have not used a pull approach that polls the Docker registry yet, but it sounds like an option at the moment. Of course such a tool then has full control over Docker and all containers.
@badlogic I added a sample docker-compose file using watchtower to automatically reload when a new image gets pushed, and tested it with a dev deployment.
The automatic update works; no fancy rolling updates yet, of course, but that is something that could be changed in the future.
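For reference, the watchtower side of such a compose file boils down to something like this (a sketch; the poll interval is illustrative):

```yaml
# Sketch: watchtower polls the registry and restarts containers whose
# image has a newer version. Note it needs full access to the Docker socket.
watchtower:
  image: v2tec/watchtower
  restart: always
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock
  command: --interval 30
```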
> I just removed env config yesterday due to a recommendation by @dilbernd.
@badlogic Sorry for the misunderstanding; I am not against env config at all, just against writing secrets directly into the compose file's env sections (so that the compose file can be committed). I thought you wanted to move to the JSON conf file, so I prepped the PR in that direction. ¯\_(ツ)_/¯
As @mbrugger has noted, your whole setup is a bit unorthodox, so it can be hard to tell what is a goal and what is incidental.
I suggested committing the docker-compose.yml, but replacing secrets in env sections with ${secretVariableName}, and then maintaining on each server (rather than committing) a file called .env next to the docker-compose.yml, containing those secrets as variable definitions (same format as Java properties / shell rc: lines of variable=value).
This file is picked up by compose by default; no need to manually manage env_file, esp. if there's really just one service, or there are values that have to be identical between services.
(BTW: ${} substitutions in compose files can have defaults with ${varname:-defaultval}. I like to set these to the production values unless they are secrets, so that prod works by default and local test deployments can still change them around.)
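Concretely, the committed compose file and the per-server .env fit together like this (a sketch; PAPERBOTS_HOST is an illustrative variable name, not one the server actually reads):

```yaml
# docker-compose.yml (committed):
services:
  mysql:
    environment:
      # secret: comes from .env, deliberately no default
      - MYSQL_ROOT_PASSWORD=${MYSQL_ROOT_PASSWORD}
  site:
    environment:
      # non-secret: production default, overridable locally
      - PAPERBOTS_HOST=${PAPERBOTS_HOST:-paperbots.io}

# .env (next to docker-compose.yml on each server, never committed):
#   MYSQL_ROOT_PASSWORD=the-actual-secret
#   PAPERBOTS_HOST=localhost   # only when overriding the default
```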
@mbrugger IMO extreme overkill to kube up for this. Vault is also really cool but also overkill compared to a file with 0600 mode.
> The simplest thing for a small-scale project would be to trigger the redeployment from the CI platform. As we only have Travis, I am not sure you would like to share secrets with Travis so it can SSH into your server to pull/restart the application. (I usually use a "push" approach with a private Jenkins instance.)
Maybe the same as reload now, just have a webhook with a password that doesn’t allow anything else? The looper script could easily distinguish by $? from System.exit() param.
Re: building containers, it's also easy to just configure the full Docker build (all 3 images) directly in pom.xml with fabric8.io:docker-maven-plugin. Writing Dockerfiles in XML is pretty unfun, but on the other hand mvn properties can be used, so at least it generally doesn't have to be touched later.
> @badlogic I added a sample docker-compose file using watchtower to automatically reload when a new image gets pushed, and tested it with a dev deployment.
I see this as empty? Does this build a static service image, or is it still the run/exit/pull/compile loop?
@dilbernd I guess we are pretty much on the same page.
Now there is a sample docker-compose.yml with actual content.
> @mbrugger IMO extreme overkill to kube up for this. Vault is also really cool but also overkill compared to a file with 0600 mode.
Agreed!
> Maybe the same as reload now, just have a webhook with a password that doesn’t allow anything else? The looper script could easily distinguish by $? from System.exit() param.
If you take a look at the no-longer-empty compose file: I added watchtower to trigger updates of the paperbots server container. Works like a charm so far (though I must admit I am using it for the first time).
> Re: building containers, it's also easy to just configure the full Docker build (all 3 images) directly in pom.xml with fabric8.io:docker-maven-plugin. Writing Dockerfiles in XML is pretty unfun, but on the other hand mvn properties can be used, so at least it generally doesn't have to be touched later.
@dilbernd I am not a fan of using Maven plugins for all kinds of non-Java-build-related stuff. It is mostly a Java developer's way of solving tasks, even when there are dedicated tools for doing the job (usually much better).
As usual, it heavily depends on the project/team/requirements, so it is also not a general recommendation.
I have already had almost religious discussions regarding the pros/cons of using Maven plugins, therefore I am also happy to agree to disagree ;)
I'll happily go back to env files instead of JSON.
I'm also not a fan of Maven for deployment.
Seems like my reload webhook is actually not that insane after all (the looping bash script kinda is).
So, at the end of all this (changing to envs again, committing the docker-compose, and using Travis for CI that calls the reload webhook on success), I'm now asking myself: what do I gain by also building images in CI and using another moving part for deployment, when the current webhook already handles it easily?
Not trying to be difficult, and I LOVE all the input. I just want to avoid complexity where it is not needed.
TL;DR: what do I gain from building Docker images and using watchtower over the setup I have now (env file for secrets, docker-compose committed, CI triggers reload webhook)?
Using Docker without building images pretty much abandons the main advantage of Docker: immutable containers. The way you do it:
- The application is built on container startup
- All dependencies have to be available (hopefully no snapshot dependencies that change during the build)
- Increased startup times because of the additional build
- Build dependencies (npm, Maven) are necessary during container start (more moving parts)
- If you ever want to scale the deployment, every container started would run the same build again...
Everybody else using the project will be happy to get a container + compose file to just run the application without the need to know anything about the build.
It still might be OK for your current project to leave it as it is, but in general it is good to follow best practices even without knowing all the details, to avoid repeating mistakes somebody else has already made.
Last but not least: the (fully automated) setup for build + deployment is already working and looks good to me.
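That is, anyone else could then bring the whole thing up from published images alone, with nothing compiled on their machine or at container startup; roughly (a sketch, reusing the hypothetical paperbots.env from above):

```yaml
# Sketch: running Paperbots purely from published images.
version: "3"
services:
  site:
    image: mbrugger/paperbots:latest
    env_file:
      - ./paperbots.env
  mysql:
    image: mysql:5.7.22
    environment:
      - MYSQL_ROOT_PASSWORD=${MYSQL_ROOT_PASSWORD}
      - MYSQL_DATABASE=paperbots
```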
@mbrugger I generally don't like non-Java stuff in mvn either. But IME you end up needing either bespoke tooling or some CI/CD integration to avoid having it chafe, and this seems too small for the former.
I’d recommend heavily against things like watchtower: Docker has said that anything that can read/write the Docker socket is functionally root on the host machine, and I have never seen a retraction of that for newer versions.
@badlogic It’s not mvn deployment in the sense of deploying "to the host", just building images & pushing them to the registry. Still not pretty, for sure.
Re @mbrugger's newer comment: +1 on all points. Generally, building one-and-done containers is just way more robust and faster at the point of use; but if the current way works for you, keep it for now. Big ops up front is rarely a win.
> Everybody else using the project will be happy to get a container + compose file to just run the application without the need to know anything about the build.
BTW, that (not prod host deployment) was my goal: I thought of building the images from mvn, plus adding a parameter to the paperbots standalone jar that writes a docker-compose.yml (mvn-property-filtered during the build for versions &c) and a matching .env file for all containers to the CWD.
My motivation was super-easy onboarding for new developers, so that students who are finished working in paperbots can "go meta" and get their own paperbots env up easily.
I.e. without dealing with the whole software development bullshit up front. They would still need to install Docker for X/Git/JDK/mvn, but might need no working knowledge of the whole support tooling immediately.
> BTW, that (not prod host deployment) was my goal ...
Agreed, I was aiming towards a staging/production setup, not a local dev environment.
> @mbrugger I generally don't like non-Java stuff in mvn either. But IME you end up needing either bespoke tooling or some CI/CD integration to avoid having it chafe, and this seems too small for the former.
You are correct in this regard, but Travis is already there and builds the images for free: https://travis-ci.org/mbrugger/paperbots/jobs/450454037
> I’d recommend heavily against things like watchtower: Docker has said that anything that can read/write the Docker socket is functionally root on the host machine, and I have never seen a retraction of that for newer versions.
My first thought was pretty much the same, but is there a good way to pull images/restart containers without access to the Docker socket? I would consider watchtower more a part of the infrastructure than of the application, and then it seems like a more reasonable approach. I have read about the same architecture (polling the registry for updates) in Kubernetes case studies before, but had never considered it for such a small-scale project.
@mbrugger @dilbernd Cheers, that all makes sense to me. Though right now my redeploy time is whatever my ping to the server is, plus a git pull for static frontend file changes, and that plus 4 seconds of a Maven build for backend changes. The Maven artifacts and Node modules are actually part of the Docker image to make this so "fast". Building the image on Travis and pulling that in surely takes quite a bit longer. I optimized for personal use at the moment, where super quick iteration times are really important for me.
Atm, I use Docker as a way to isolate different sites on the same machine from each other.
Just wanted to explain some of the insanity.
I'll give your setup a try @mbrugger. It might just linger in the repository for a while, until I'm no longer as dependent on quick iteration times.
@dilbernd I actually got a student that went meta. All he had to do was install Node/NPM, then run npm run dev-without-java in client/. I only have him dig into the frontend code. I will create docs on how to do local development today.
Again, thanks for all your input!