inventaire-deploy

how far are we from just using Docker in prod

Open maxlath opened this issue 4 years ago • 5 comments

First, let's remind ourselves why Docker is attractive for our use-cases:

It makes running a given version of a piece of software much easier, in a way that is largely independent of the host OS. For instance, in our case, relying on apt-get to install CouchDB meant being stuck for a while on v1.6.1, a version with publicly known security vulnerabilities. Now it means being stuck on CouchDB v3.1.0, with no package available for the recently released Ubuntu 20.04 LTS. Docker makes it possible to install any version of CouchDB on any OS where Docker can be installed. That's an undeniable advantage over just relying on apt-get or equivalent.

It could offer a standardized deploy recipe format. Unfortunately, this potential has proved difficult to realize so far, as one has to either:

  • use pre-built images, that then need to be pulled: where are those images stored? how to handle the config? isn't that an unnecessarily heavy process?
  • build images locally: requires some scripting outside of Docker, kind of defeating the initial goal

The problem: Docker and network security

ufw is a beautifully simple abstraction layer on top of iptables: once you have set ufw default deny incoming, you can trust that any port that isn't whitelisted isn't accessible from the outside world. It works great as long as all the services just bind to a port and don't otherwise care about the network. Unfortunately, Docker manipulates iptables directly, making all the previous assumptions void. It is possible to disable iptables manipulation, but it comes with a dose of undocumented fear, uncertainty and doubt: "this option is not appropriate for most users [...]. Setting iptables to false will more than likely break container networking for the Docker engine." [Edit: after testing locally, setting /etc/docker/daemon.json to {"iptables": false}, network requests were just not getting out anymore, so that definitely has an undesired effect]
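To make the problem concrete: a port published without a host IP (for example `5984:5984`) is bound on all interfaces, and traffic to it goes through the NAT and forwarding rules Docker installs rather than through the rules ufw manages, whereas `127.0.0.1:5984:5984` stays on loopback. Below is a minimal Python sketch for auditing short-syntax port mappings. The function names are hypothetical, and it assumes IPv4 mappings only (no bracketed IPv6 hosts).

```python
def published_host_ip(mapping: str) -> str:
    """Return the host IP a short-syntax port mapping binds to.

    Handles "CONTAINER", "HOST:CONTAINER" and "IP:HOST:CONTAINER"
    (IPv4 only). An optional "/tcp" or "/udp" suffix is ignored.
    """
    spec = mapping.split("/", 1)[0]
    parts = spec.split(":")
    if len(parts) == 3:
        return parts[0]
    # No explicit IP: Docker publishes on all interfaces by default.
    return "0.0.0.0"


def is_loopback_only(mapping: str) -> bool:
    """True if the published port is only bound on the loopback interface."""
    return published_host_ip(mapping).startswith("127.")


if __name__ == "__main__":
    for mapping in ["5984:5984", "127.0.0.1:5984:5984", "3006:3006/tcp"]:
        status = "loopback only" if is_loopback_only(mapping) else "all interfaces"
        print(f"{mapping}: {status}")
```

Any mapping reported as "all interfaces" is a candidate for being reachable from outside despite ufw default deny incoming, so it is worth verifying from another machine.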

Possible solutions

  • learn to use iptables without ufw, to have a reliable setup where we can have a high level of trust that Docker isn't
  • use Docker configuration to make it comply with rules set by ufw: requires identifying where the documentation FUD is coming from
  • do not use Docker in production

What we would lose with Docker

  • ufw simplicity, as described above
  • systemd control and logging, which can be mostly replaced by docker logs, but it's not as good as journalctl and we would need to reimplement logs backup for this new setup

maxlath, May 27 '20 13:05

Advantages of containerizing:

  • copying/moving production data to a testing or pre-production environment
  • easily backing up images (docker save); note that this archives images, not the data held in volumes
  • possibility of CI (maybe through scripts on the supervisor, maybe through some dokku-like solution which builds images locally)
  • possibility of merging/syncing/networking main and alt server

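Drawbacks:

  • more complicated deployment if running databases outside containers ([discussed here](https://github.com/inventaire/docker-inventaire/issues/7)). Running databases outside containers [seems advisable](https://www.quora.com/Is-it-not-advisable-to-use-database-in-Docker-container) for large infra, but inventaire is clearly not dealing with 10k requests/hour
  • configuration
    • of iptables: [a few details here [fr]](https://riptutorial.com/fr/docker/topic/9201/iptables-avec-docker). Maybe I'm missing some points, but what more is there to configuring iptables for a web server than allowing everyone to access ports 80 and 443 and dropping everything else?
    • of the supervisor: creating a docker user, for example. Maybe having [a namespace](https://success.mirantis.com/article/introduction-to-user-namespaces-in-docker-engine) to provide better protection against privilege escalation.
    • docker{,-compose}: volumes or not, env variables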

About the logs: docker logs can also be accessed with journalctl -u docker, so maybe that would be enough to avoid changing the logs backup setup
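A caveat worth checking here: `journalctl -u docker` reads the journal of the Docker daemon's systemd unit, and container output generally only ends up in the journal when Docker's `journald` logging driver is enabled. Below is a minimal Python sketch of adding that setting to the contents of an existing `/etc/docker/daemon.json` without dropping other keys. The helper name `with_journald_logging` is hypothetical and the sample input is made up for illustration.

```python
import json


def with_journald_logging(daemon_config_text: str) -> str:
    """Return daemon.json content with the journald log driver enabled.

    Existing keys are preserved; empty input is treated as an empty config.
    """
    config = json.loads(daemon_config_text) if daemon_config_text.strip() else {}
    config["log-driver"] = "journald"
    return json.dumps(config, indent=2, sort_keys=True)


if __name__ == "__main__":
    # Illustrative input only, not the actual production config.
    print(with_journald_logging('{"live-restore": true}'))
```

With that driver active, the Docker documentation describes filtering per container with `journalctl CONTAINER_NAME=<name>`. The daemon needs a restart, and only newly created containers pick up the new default.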

jum-s, Sep 06 '20 15:09

some reading in the links https://dan.hersam.com/2020/11/06/local-docker-port-exposed/

jum-s, Nov 07 '20 11:11

some more readings/propaganda https://docs.docker.com/engine/security/

"Docker allows you to share a directory between the Docker host and a guest container; and it allows you to do so without limiting the access rights of the container"

jum-s, Jan 28 '21 12:01

https://sandbox.inventaire.io is now running on a rootless docker engine, which is briefly documented here

jum-s, Feb 16 '21 19:02

> Drawbacks :
>
> * more complicated deployment if running databases outside containers ([discussed here](https://github.com/inventaire/docker-inventaire/issues/7)). It [seems advisable](https://www.quora.com/Is-it-not-advisable-to-use-database-in-Docker-container) for large infra, but inventaire is clearly not dealing with 10k requests/hour

This shouldn't be a problem nowadays? Besides, most modern deployments are either local/small deployments (so having a small DB isn't problematic) or large, clustered deployments with a dedicated database machine?

> * configuration
>
>   * of iptables : [a few details here [fr]](https://riptutorial.com/fr/docker/topic/9201/iptables-avec-docker). Maybe I'm missing some points, but what's more in configuring iptables for a webserver than allow everyone to access 80 and 443 and dropping everything else?

Shouldn't security be handled outside, i.e. on the Docker host machine or, better yet, on the external LB/firewall? The Docker host running the deployed container should only deal with keeping the container up and running, and the image itself should only be concerned with serving requests, nothing more?

>   * of the supervisor : creating a docker user for example. Maybe having [a namespace](https://success.mirantis.com/article/introduction-to-user-namespaces-in-docker-engine) to provide better protection against privilege escalation.
>   * docker{,-compose}: volumes or not, env variables

This is relatively trivial: you can either keep the data inside a Docker volume or expose it to the host machine (so it's easier to migrate elsewhere and more resilient/persistent)

> About the logs, docker logs can also be accessed by journalctl -u docker, maybe that could be enough to not change logs backup setup

Most of the time in production logging should be minimal either way? :-)

woj-tek, Jul 31 '23 05:07