Proposed Docker Compose setup for workers (with support for relaying traffic through a static IP w/ Wireguard)

Open pirate opened this issue 5 years ago • 11 comments

I've been working on a solution that lets my home-based Zimfarm host (with a dynamic IP) connect through a VPS with a static IP. It uses Wireguard to tunnel a container's traffic through a Wireguard VPN host on the remote server.

This is the docker setup on the "client" (zimfarm worker): docker-compose.yml

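version: '3'

services:
  # VPN client container; the tunnel lives in its network stack
  wireguard:
    image: linuxserver/wireguard
    cap_add:
      - NET_ADMIN
      - SYS_MODULE
    volumes:
      - /lib/modules:/lib/modules
      - ./wg0.conf:/config/wg0.conf:ro

  zimfarm:
    image: ghcr.io/openzim/zimfarm
    # share the wireguard service's network stack so traffic uses the tunnel
    network_mode: 'service:wireguard'
    volumes:
      # the docker socket lets zimfarm spawn other containers on the host
      - /var/run/docker.sock:/var/run/docker.sock
      - ./data/zimfarm:/data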

wg0.conf:

[Interface]
# Name = myWorkerName.wg.openzim.org
Address = 10.17.17.2/32
PrivateKey = YCW76edD4W7nZrPbWZxPZhcs32CsBLIi1sEhsV/sgk8=
DNS = 1.1.1.1,8.8.8.8


[Peer]
# Name = relay.wg.openzim.org
Endpoint = relay.wg.openzim.org:51820
PublicKey = zJNKewtL3gcHdG62V3GaBkErFtapJWsAx+2um0c0B1s=
AllowedIPs = 10.17.17.1/24,0.0.0.0/0
PersistentKeepalive = 21

(The VPN server side is a very simple, bog-standard Wireguard server config, so I'll omit it here; this issue only concerns the zimfarm client/worker setup.)

This is the easiest way to run a container's internet traffic through Wireguard. There are other, more difficult ways that involve modifying iptables on the host, which I'd rather not do on my machine. That approach is also difficult because the containers are spawned dynamically, so DHCP + wg-dynamic or some other solution must be used to give each container an IP. See https://github.com/pirate/wireguard-docs/blob/master/README.md#Containerization

The issue is that zimfarm doesn't work as a single container. Instead, it takes control of docker on the host machine to spawn multiple other containers, which makes it more difficult to route all of those containers through Wireguard.

Possible solutions:

  • make zimfarm able to run as a single container (probably infeasible based on its architecture)
  • make zimfarm able to run using docker-in-docker (maybe difficult, not sure if a good idea)
  • add an option to zimfarm so that it can spawn all of its containers in a docker-compose project with network_mode: 'service:wireguard' on each, such that all their traffic runs through the wireguard container's networking stack
  • use iptables and Wireguard on the zimfarm worker host, outside of docker, to force docker traffic into a Wireguard tunnel (last resort)
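
To make the third option more concrete, here is a rough sketch of how a spawner could attach each task container to the wireguard container's network stack. On the docker CLI, the counterpart of compose's network_mode: 'service:wireguard' is --network container:<name>. The function name build_run_command, the image name, and the container names are all hypothetical; this is not how zimfarm actually spawns its containers, only an illustration of the idea.

```python
import shlex


def build_run_command(image, wireguard_container, task_args=(), name=None):
    """Build a `docker run` argv whose container shares the network
    namespace of an already running wireguard container.

    All names passed in are placeholders chosen by the caller (assumption).
    """
    cmd = [
        "docker", "run", "--detach", "--rm",
        # reuse the VPN container's network stack instead of a bridge network
        "--network", f"container:{wireguard_container}",
    ]
    if name:
        cmd += ["--name", name]
    cmd.append(image)
    cmd.extend(task_args)
    return cmd


if __name__ == "__main__":
    # hypothetical task image and container names, for illustration only
    argv = build_run_command(
        "example/scraper:latest", "wireguard", ["--help"], name="task-1"
    )
    print(shlex.join(argv))
```

The wireguard container would have to be running before any task container starts, and containers that join another container's network stack cannot publish ports of their own.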

pirate • May 29 '20 23:05

This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.

stale[bot] • Jul 29 '20 07:07

How much transfer per month is typical for a zimfarm worker? Would it exceed 1TB/mo?

I ask because I'm going to set up a VPN bounce machine to try and get my zimfarm rack server up again, and am wondering how much to budget for bandwidth.

pirate • Dec 10 '20 21:12

@pirate Good news, but the question is hard, if not impossible, to answer. It depends... but if it ever goes over 1TB, it should not be by much.

kelson42 • Dec 11 '20 07:12

Looked at the AWS-hosted worker we had and the last bill says:

| Item | Value |
| --- | --- |
| data transfer in per month | 1,019.211 GB |
| first 1 GB of data transferred out per month | 0.949 GB |
| regional data transfer - in/out/between EC2 AZs or using elastic IPs or ELB | 0.116 GB |
| first 10 TB / month data transfer out beyond the global free tier | 269.521 GB |

rgaudin • Dec 11 '20 07:12

This dates back to August.

rgaudin • Dec 11 '20 07:12

Ok, I was thinking of using DigitalOcean, which includes 1TB for free, with $0.02 USD/GB after that. I'm hoping usage isn't too much higher, because then it goes from a $5/mo project to a $20+/mo project. I will keep looking at alternative hosting providers; maybe I can find one with free bandwidth up to 2TB.
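
For budgeting, a rough sketch of the overage arithmetic, using the pricing quoted above (1TB included, $0.02 USD/GB after that) and the AWS figures from the bill as input. It assumes 1TB = 1000 GB and that inbound and outbound traffic both count against the allowance; providers differ on this, and a relay may see the worker's traffic more than once, so treat the result as an estimate only.

```python
def overage_cost(transfer_gb, included_gb=1000.0, price_per_gb=0.02):
    """Return the monthly overage charge in USD for a given transfer volume.

    included_gb and price_per_gb are the figures quoted in this thread,
    not verified provider pricing (assumption).
    """
    billable_gb = max(0.0, transfer_gb - included_gb)
    return billable_gb * price_per_gb


if __name__ == "__main__":
    # figures from the AWS bill above: inbound plus the three outbound items
    inbound_gb = 1019.211
    outbound_gb = 0.949 + 0.116 + 269.521
    total_gb = inbound_gb + outbound_gb
    print(f"total transfer: {total_gb:.3f} GB")
    print(f"estimated overage: ${overage_cost(total_gb):.2f}")
```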

pirate • Dec 11 '20 11:12

This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.

stale[bot] • Feb 21 '21 23:02

Do you know if zimfarm-worker-manager is able to manage spawning other containers (using /var/run/docker.sock) headlessly over months with minimal setup? I.e., can I run it in compose like this:

  zimfarm:
    image: openzim/zimfarm-worker-manager
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock

I'm about to publish a new repo, https://github.com/pirate/good-karma-kit, meant to run on servers with spare CPU/RAM/bandwidth, and I think it could help get you a decent number of people running this.

Ideally I'd like to make it as simple/one-click as possible, but even considering the >1TB bandwidth, CPU, docker.sock access, and fixed IP requirements I bet we can get you a few good zimfarm worker contributors.

pirate • Apr 12 '21 19:04

This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.

stale[bot] • Jun 16 '21 22:06

@pirate We have implemented support for dynamic IPs in #659. Would that allow you to give it another try?

kelson42 • Mar 23 '22 07:03

This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.

stale[bot] • Jun 12 '22 20:06