inventaire-deploy
How far are we from just using Docker in prod?
First, let's remind ourselves why Docker is attractive for our use cases:
It makes reusing a given version of a software much easier, in a way that is largely independent from the host OS. For instance, in our case, relying on `apt-get` to install CouchDB meant for a while being stuck on v1.6.1, a version with publicly known security vulnerabilities. Now it means being stuck on CouchDB v3.1.0, and not having a package for the recently released Ubuntu 20.04 LTS. Docker allows installing any version of CouchDB on any OS where Docker can be installed. That's an undeniable advantage over just relying on `apt-get` or equivalent.
It could offer a standardized deploy recipe format. Unfortunately, this potential has proved difficult to realize so far, as one has to either:
- use pre-built images, which then need to be pulled: where are those images stored? how to handle the config? isn't that an unnecessarily heavy process?
- build images locally: requires some scripting outside of Docker, kind of defeating the initial goal (see the sketch below)
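Concretely, the two options look something like this (the registry and image name are hypothetical, no published Inventaire image is implied here):

```sh
# Option 1: pull a pre-built image from some registry
docker pull some-registry.example.org/inventaire:latest

# Option 2: build the image locally from the repository
# (assuming a Dockerfile at the repository root), which
# already requires scripting around Docker itself
git clone https://github.com/inventaire/inventaire
docker build -t inventaire ./inventaire
```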
The problem: Docker and network security
`ufw` is a beautifully simple abstraction layer on top of `iptables`: once you have set `ufw default deny incoming`, you can trust that any port that isn't whitelisted isn't accessible from the outside world. That works wonderfully as long as all the services just bind to a port and don't care further about the network. Unfortunately, Docker messes directly with `iptables`, making all the previous assumptions void. It is possible to disable the `iptables` manipulation, but it comes with a dose of undocumented fear, uncertainty and doubt: "this option is not appropriate for most users [...]. Setting iptables to false will more than likely break container networking for the Docker engine." [Edit: after testing locally with `/etc/docker/daemon.json` set to `{"iptables": false}`, network requests were just not getting out anymore, so that definitely has an undesired effect]
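To make the problem concrete, a minimal sketch (the port number and hostname are illustrative): even with `ufw` denying all incoming traffic, publishing a container port makes it reachable from the outside, because published-port traffic is forwarded through Docker's own `iptables` chains rather than hitting ufw's INPUT rules.

```sh
# Host firewall: deny everything incoming except SSH
ufw default deny incoming
ufw allow 22/tcp
ufw enable

# Publish CouchDB's port from a container
docker run -d -p 5984:5984 couchdb:3.1.0

# From another machine, the port is reachable anyway:
# Docker's rules are evaluated before ufw ever sees the packet
curl http://the-server.example.org:5984/
```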
Possible solutions
- learn to use `iptables` without `ufw`, to have a reliable setup where we can have a high level of trust that Docker isn't bypassing the firewall (see the `DOCKER-USER` sketch after this list)
- use Docker configuration to make it comply with rules set by `ufw`: requires identifying where the documentation FUD is coming from
- do not use Docker in production
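One documented lever for the first two options (a sketch, untested on our infrastructure; the interface name and ports are assumptions): Docker evaluates the `DOCKER-USER` chain before its own forwarding rules, so restrictions placed there are not bypassed by `-p` port mappings.

```sh
# Each -I inserts at the top of DOCKER-USER, so read bottom-up:
# final order is ESTABLISHED/RELATED, 80, 443, DROP, RETURN

# Drop everything else coming in on the public interface (eth0 here)
iptables -I DOCKER-USER -i eth0 -j DROP

# Allow the web ports
iptables -I DOCKER-USER -i eth0 -p tcp --dport 443 -j ACCEPT
iptables -I DOCKER-USER -i eth0 -p tcp --dport 80 -j ACCEPT

# Allow replies to connections initiated from inside the containers
iptables -I DOCKER-USER -i eth0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
```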
What we would lose with Docker
- `ufw` simplicity, as described above
- systemd control and logging, which can be mostly replaced by `docker logs`, but it's not as good as `journalctl`, and we would need to reimplement logs backup for this new setup (though see the journald sketch below)
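One way to keep `journalctl` and the existing logs backup in the loop (a sketch, assuming the stock journald logging driver fits our needs):

```sh
# /etc/docker/daemon.json: route container logs to journald
# instead of the default json-file driver
echo '{"log-driver": "journald"}' > /etc/docker/daemon.json
systemctl restart docker

# Container logs are then queryable through journald fields
# ('inventaire' is a hypothetical container name)
journalctl CONTAINER_NAME=inventaire
```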
Advantages of containerizing:
- copying/moving production data to a testing or pre-production environment
- easily backing up containers (`docker save`, sketched below)
- possibility of CI (maybe through scripts on the supervisor, maybe through some dokku-like solution which builds images locally)
- possibility of merging/syncing/networking main and alt server
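A sketch of what such a backup could look like (paths and volume names are hypothetical; note that strictly speaking `docker save` exports images, while data living in volumes needs its own backup step):

```sh
# Export the image to a tarball (restorable elsewhere with docker load)
docker save -o /backups/couchdb-3.1.0.tar couchdb:3.1.0

# Volumes are not included in the image: archive them separately,
# here by mounting the volume into a throwaway container
docker run --rm -v couchdb-data:/data -v /backups:/backups \
  alpine tar czf /backups/couchdb-data.tar.gz -C /data .
```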
Drawbacks:
- more complicated deployment if running databases outside containers ([discussed here](https://github.com/inventaire/docker-inventaire/issues/7)). It [seems advisable](https://www.quora.com/Is-it-not-advisable-to-use-database-in-Docker-container) for large infra, but inventaire is clearly not dealing with 10k requests/hour
- configuration
  - of iptables: [a few details here [fr]](https://riptutorial.com/fr/docker/topic/9201/iptables-avec-docker). Maybe I'm missing some points, but what more is there to configuring iptables for a webserver than allowing everyone to access 80 and 443 and dropping everything else?
  - of the supervisor: creating a docker user for example. Maybe having [a namespace](https://success.mirantis.com/article/introduction-to-user-namespaces-in-docker-engine) to provide better protection against privilege escalation.
  - of docker{,-compose}: volumes or not, env variables (see the compose sketch below)
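For the docker-compose part, a minimal sketch of what such a file could look like (the service layout, image tag, ports and variables are assumptions, not our actual config):

```sh
# Hypothetical docker-compose.yml, written out as a heredoc
cat > docker-compose.yml <<'EOF'
version: "3"
services:
  couchdb:
    image: couchdb:3.1.0
    environment:
      # passed through from the host environment (or a .env file)
      - COUCHDB_USER
      - COUCHDB_PASSWORD
    volumes:
      - couchdb-data:/opt/couchdb/data
    ports:
      # bind to localhost only, so Docker's iptables rules
      # don't expose the database to the outside world
      - "127.0.0.1:5984:5984"
volumes:
  couchdb-data:
EOF

docker-compose up -d
```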
About the logs, docker logs can also be accessed by `journalctl -u docker`, maybe that could be enough to not change the logs backup setup
Some reading on the topic: https://dan.hersam.com/2020/11/06/local-docker-port-exposed/
Some more reading/propaganda: https://docs.docker.com/engine/security/
> Docker allows you to share a directory between the Docker host and a guest container; and it allows you to do so without limiting the access rights of the container
https://sandbox.inventaire.io is now running on a rootless docker engine, which is briefly documented here
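For reference, a rootless-engine setup sketch following the upstream install script (defaults shown here, not necessarily what sandbox.inventaire.io uses):

```sh
# Install the rootless Docker engine for the current non-root user
curl -fsSL https://get.docker.com/rootless | sh

# Point the client at the rootless daemon's socket
export PATH=$HOME/bin:$PATH
export DOCKER_HOST=unix:///run/user/$(id -u)/docker.sock

# The daemon runs as a systemd user service
systemctl --user enable --now docker
docker info   # should report rootless mode
```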
Drawbacks:

> * more complicated deployment if running databases outside containers ([discussed here](https://github.com/inventaire/docker-inventaire/issues/7)). It [seems advisable](https://www.quora.com/Is-it-not-advisable-to-use-database-in-Docker-container) for large infra, but inventaire is clearly not dealing with 10k requests/hour

This shouldn't be a problem nowadays? Besides, most modern deployments are either local/small deployments (so having a small DB isn't problematic) or large, clustered deployments with a dedicated database machine?

> * configuration of iptables: [a few details here [fr]](https://riptutorial.com/fr/docker/topic/9201/iptables-avec-docker). Maybe I'm missing some points, but what more is there to configuring iptables for a webserver than allowing everyone to access 80 and 443 and dropping everything else?

Shouldn't security be handled outside, i.e. on the host machine running Docker or, better yet, on the external LB/firewall? The machine with the deployed container should only deal with keeping the container up and running, and the image itself should be concerned with only serving requests, nothing more?

> * configuration of the supervisor: creating a docker user for example. Maybe having [a namespace](https://success.mirantis.com/article/introduction-to-user-namespaces-in-docker-engine) to provide better protection against privilege escalation.
> * docker{,-compose}: volumes or not, env variables

This is relatively trivial: you can either keep data inside a Docker volume or expose it to the host machine (so it's easier to migrate elsewhere and it's more resilient/persistent), as sketched below.
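The two options, as a sketch (volume name and host path are hypothetical):

```sh
# Named volume: managed by Docker under /var/lib/docker/volumes
docker run -d -v couchdb-data:/opt/couchdb/data couchdb:3.1.0

# Bind mount: data lives at a host path of our choosing, easy to
# inspect, back up, or migrate with plain filesystem tools
docker run -d -v /srv/couchdb:/opt/couchdb/data couchdb:3.1.0
```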
> About the logs, docker logs can also be accessed by `journalctl -u docker`, maybe that could be enough to not change the logs backup setup

Most of the time, in production, logging should be minimal either way? :-)