
Random networking failures in bridged network

Open cheesycod opened this issue 10 months ago • 17 comments

Description

The antiraid_internal network, as shown by docker network inspect:

[
    {
        "Name": "antiraid_internal",
        "Id": "3237097f63b8ff00d093454b9a01c17745baf6ddafb60e0423bb68c3c9710b02",
        "Created": "2025-02-24T13:17:37.76494543-05:00",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv4": true,
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "172.18.0.0/16",
                    "Gateway": "172.18.0.1"
                }
            ]
        },
        "Internal": true,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "11c78f924df8efc19549e41c0a7eddcf353b31d26c305a6549841c8553db1d86": {
                "Name": "jobserver",
                "EndpointID": "3084a13d5fdc3b9d97a975d4c562f7fb7b43621792db9506aeef8df6bad2c025",
                "MacAddress": "12:f2:2d:97:c1:f7",
                "IPv4Address": "172.18.0.7/16",
                "IPv6Address": ""
            },
            "2b1a17c5693828c21c83e0f3e4473e95c3de7df8b49b506e6eb1e055a244eebb": {
                "Name": "sandwich",
                "EndpointID": "932810504b316a9f6276bec7fd3e44148a3441b7e6592e3efccf1f2dd966813f",
                "MacAddress": "86:d8:b9:0a:70:5f",
                "IPv4Address": "172.18.0.3/16",
                "IPv6Address": ""
            },
            "39bbc756d6f8fd5bc21458713e6f40037487e92e67f8922225728347fda1f4b9": {
                "Name": "nirn_proxy",
                "EndpointID": "cb0a5043ce6c40ad3d4f07115b14911e668c4c1d5c95b9125a5893dd24bdf1cb",
                "MacAddress": "be:72:84:69:aa:b9",
                "IPv4Address": "172.18.0.5/16",
                "IPv6Address": ""
            },
            "3af5bcfc0dfcfd11acd4d3cf1f5c29b3b6124afda602dbc3675af9743fa48e39": {
                "Name": "template-worker",
                "EndpointID": "fe2f6ec05ba5aebff77ef6789abab154b841e11ec3402b281cc378d91402c40a",
                "MacAddress": "0e:69:8b:fb:c0:fe",
                "IPv4Address": "172.18.0.6/16",
                "IPv6Address": ""
            },
            "a7eab9b2041f510745a5526a74d1fe76ffef5478d23026116515826c77d11591": {
                "Name": "api_exposer",
                "EndpointID": "5ad2a95665fe291a1bd33c15c9642b7e66cc731c86aab5bd1d07fa647986b484",
                "MacAddress": "4a:79:1a:f5:8f:89",
                "IPv4Address": "172.18.0.10/16",
                "IPv6Address": ""
            },
            "ac412590d63e42c17907fea26a8bcce2853b4693878814ffb2419e00cc46f1a8": {
                "Name": "postgres",
                "EndpointID": "b28e66d36bf681a260d5816ac00a7265c9d7b9db9d6c41ebfb19c9cff5d01920",
                "MacAddress": "1a:c4:eb:be:0c:2d",
                "IPv4Address": "172.18.0.2/16",
                "IPv6Address": ""
            },
            "ba1eebf648c0700e5ff34877ec8c0ba004f777db4c24b537b9c8ccb7d2d004f1": {
                "Name": "bot",
                "EndpointID": "e269c3d2e748c481fda0a27b175e130504a02cf2ad52599f3a5761a5f812b732",
                "MacAddress": "02:56:dd:05:96:29",
                "IPv4Address": "172.18.0.8/16",
                "IPv6Address": ""
            },
            "c8df3812331015a199f46209b40ddf2d57b7150d2cf600e2c90031a00a9e4ecb": {
                "Name": "seaweed",
                "EndpointID": "6228db70902b0d0bb59204ee4a58f868ad695b25412e68108d302925e9b0f9bb",
                "MacAddress": "36:c6:b9:22:ac:97",
                "IPv4Address": "172.18.0.4/16",
                "IPv6Address": ""
            },
            "ddea36d3a1a96a9e35441db1b475f11913aec3f82b014c173873370d46c1a638": {
                "Name": "api",
                "EndpointID": "61647cad073cb5bb903e8882cc1d97862624b33acd79ebb419f4031ca3c7189e",
                "MacAddress": "9a:4c:ae:8c:7f:b3",
                "IPv4Address": "172.18.0.9/16",
                "IPv6Address": ""
            }
        },
        "Options": {},
        "Labels": {
            "com.docker.compose.config-hash": "fae475f830a3a5a91e5d11513ea16825a3f95dfce5c3992c00c24dd71cd4be50",
            "com.docker.compose.network": "antiraid_internal",
            "com.docker.compose.project": "staging",
            "com.docker.compose.version": "2.33.0"
        }
    }
]
frostpaw@REDACTED:~/staging/services$ docker exec api curl http://172.18.0.4:8333
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (7) Failed to connect to 172.18.0.4 port 80 after 0 ms: Couldn't connect to server
exit status 7
frostpaw@REDACTED:~/staging/services$ docker exec api curl http://172.18.0.4:8333
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (7) Failed to connect to 172.18.0.4 port 80 after 0 ms: Couldn't connect to server
exit status 7
frostpaw@REDACTED:~/staging/services$ 
frostpaw@REDACTED:~/staging/services$ docker exec api curl http://seaweed:8333
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (7) Failed to connect to seaweed port 8333 after 0 ms: Couldn't connect to server
exit status 7

The last two curl commands randomly either fail or succeed.

The docker compose file below was used:

services:
  # Nirn Proxy
  nirn_proxy:
    container_name: nirn_proxy
    networks:
      - antiraid_infra
      - antiraid_internal # Exposed to internal services
    depends_on:
      sandwich:
        condition: service_healthy
    build:
      context: ./infra/nirn-proxy
      dockerfile: ./Dockerfile
    volumes:
      - ./infra/nirn-proxy/secrets.docker.json:/secrets.json:ro
    command: cache-endpoints=false port=3221 ratelimit-over-408 endpoint-rewrite=/api/gateway/bot@http://sandwich:29334/antiraid,/api/v*/gateway/bot@http://sandwich:29334/antiraid token-map-file=secrets.json
    healthcheck:
      test: [ "CMD", "curl", "-f", "http://localhost:3221/nirn/healthz" ]
      interval: 3s
      timeout: 10s
      retries: 3
    ports:
      - "3222:3221" # And exposed to host for testing/auditing
    labels:
      type: "discord-rest"

  # Sandwich
  sandwich:
    container_name: sandwich
    networks:
      - antiraid_infra
      - antiraid_internal # Exposed to internal services
    build:
      context: ./infra/Sandwich-Daemon
      dockerfile: ./Dockerfile
    volumes:
      - ./infra/Sandwich-Daemon/sandwich.docker.yaml:/sandwich.yaml:ro
    command: ./app/sandwich -configurationPath=/sandwich.yaml -prometheusAddress :3931 -httpEnabled --httpHost 0.0.0.0:29334 -level debug
    environment:
      EXTERNAL_GATEWAY_ADDRESS: ws://sandwich:3600
    healthcheck:
      test: [ "CMD", "curl", "-f", "http://localhost:29334/antiraid/api/v9/gateway/bot" ]
      interval: 10s
      timeout: 10s
      retries: 1000000000
    ports:
      - "29334:29334" # Exposed to host for testing/auditing
      - "3931:3931" # Prometheus metrics
      - "3600:3600" # Websocket connection for the bot
    labels:
      type: "discord-gateway"

  # Postgres database
  postgres:
    container_name: postgres
    networks:
      - antiraid_internal # Exposed to internal services
    build:
      context: ./data/docker/postgres
      dockerfile: ./Dockerfile
    # Expose /seed.iblcli-seed
    volumes:
      - ./data/seed.iblcli-seed:/seed.iblcli-seed:ro
      - ./data/state/postgres-other:/var/lib/postgresql
      - ./data/state/postgres:/var/lib/postgresql/data
    healthcheck:
      test: [ "CMD", "pg_isready", "-U", "antiraid" ]
      interval: 3s
      timeout: 10s
      retries: 3
    labels:
      type: "database"

  # Seaweed needs a postgres database to function
  seaweed_postgres:
    container_name: seaweed_postgres
    networks:
      - antiraid_seaweed # Exposed to SeaweedFS services
    build:
      context: ./data/docker/seaweed-postgres
      dockerfile: ./Dockerfile
    volumes:
      - ./data/state/seaweed_postgres-other:/var/lib/postgresql
      - ./data/state/seaweed_postgres:/var/lib/postgresql/data
    healthcheck:
      test: [ "CMD", "pg_isready", "-U", "seaweed" ]
      interval: 3s
      timeout: 10s
      retries: 3

  # Seaweed FS itself 
  seaweed:
    container_name: seaweed
    build:
      context: ./data/docker/seaweed
      dockerfile: ./Dockerfile
    networks:
      - antiraid_internal # Exposed to internal services
      - antiraid_seaweed # Exposed to SeaweedFS services
    depends_on:
      seaweed_postgres:
        condition: service_healthy
    command: server -filer -s3 -volume.max=100 -master.port=9333 -volume.port=9334 -master.volumeSizeLimitMB=4096 -filer.encryptVolumeData
    volumes:
      - ./data/state/seaweed:/data
      - ./data/state/seaweed-config:/etc/seaweedfs
    healthcheck:
      test: [ "CMD", "curl", "-f", "http://localhost:9333/cluster/status" ]
      interval: 10s
      timeout: 10s
      retries: 30

  # Bot process itself
  bot:
    container_name: bot
    networks:
      - antiraid_internal # Exposed to internal services
    depends_on:
      nirn_proxy:
        condition: service_healthy
      sandwich:
        condition: service_healthy
      postgres:
        condition: service_healthy
      template-worker:
        condition: service_healthy
    build:
      context: ./services/bot
      dockerfile: ./Dockerfile
    volumes:
      - ./config.docker.yaml:/app/config.yaml:ro
    healthcheck:
      test: [ "CMD", "curl", "-f", "http://127.0.0.1:20000/state" ]
      interval: 3s
      timeout: 10s
      retries: 10
    labels:
      type: "service"

  # Template worker process
  # The template worker is required for the bot to function as it handles Luau
  # scripting and permissions etc.
  template-worker:
    container_name: template-worker
    networks:
      - antiraid_internal # Exposed to internal services
    depends_on:
      nirn_proxy:
        condition: service_healthy
      sandwich:
        condition: service_healthy
      postgres:
        condition: service_healthy
    build:
      context: ./services/template-worker
      dockerfile: ./Dockerfile
    volumes:
      - ./config.docker.yaml:/app/config.yaml:ro
    environment:
      RUST_LOG: "template-worker=info"
    healthcheck:
      test: [ "CMD", "curl", "-X", "POST", "-f", "http://localhost:60000/healthcheck" ]
      interval: 3s
      timeout: 10s
      retries: 100
    labels:
      type: "service"

  # Redis
  api_redis:
    container_name: api_redis
    networks:
      - antiraid_api # Only communicates with API
    image: redis:7.2.7 # Use 7.2.7 to avoid licensing change with 7.4+
    expose:
      - 6379
    command: redis-server --save 900 1 --save 300 10 --save 60 10000 --loglevel notice --bind 0.0.0.0
    volumes:
      - ./data/state/redis:/data
    healthcheck:
      test: [ "CMD", "redis-cli", "ping" ]
      interval: 3s
      timeout: 10s
      retries: 3
    labels:
      type: "cache"

  # Jobserver process
  jobserver:
    container_name: jobserver
    networks:
      - antiraid_internal # Exposed to internal services
      - antiraid_jobserver # Needs external net access too
    depends_on:
      nirn_proxy:
        condition: service_healthy
      sandwich:
        condition: service_healthy
      postgres:
        condition: service_healthy
    build:
      context: ./services/jobserver
      dockerfile: ./Dockerfile
    volumes:
      - ./config.docker.yaml:/app/config.yaml:ro
    healthcheck:
      test: [ "CMD", "curl", "-f", "http://localhost:30000" ]
      interval: 3s
      timeout: 10s
      retries: 10
    labels:
      type: "service"

  # API process
  # Like all other processes, the API process does not have external net access
  # outside of the Nirn proxy; however, the api_exposer Nginx proxy is used to
  # bridge the gap between the API and the outside world.
  api:
    container_name: api
    networks:
      - antiraid_internal # Doesn't need external net access
      - antiraid_api
    depends_on:
      api_redis:
        condition: service_started
      nirn_proxy:
        condition: service_started
      postgres:
        condition: service_healthy
      bot:
        condition: service_healthy
    build:
      context: ./services/api
      dockerfile: ./Dockerfile
    volumes:
      - ./config.docker.yaml:/app/config.yaml:ro
    labels:
      type: "service"
      special-networking-used: "api" # API has special networking needs, so we document that as a nice label

  # Nginx process to expose the API outwards while ensuring API itself doesn't get any
  # external net access. Bridges external and internal layers
  api_exposer:
    container_name: api_exposer
    networks:
      - antiraid_api
      - antiraid_internal # Exposed to internal services
      - antiraid_infra # Needs access to the outside world
    depends_on:
      api:
        condition: service_started
      seaweed:
        condition: service_healthy
    image: nginx:1.27.4-alpine
    volumes:
      - ./data/docker/nginx.conf:/etc/nginx/conf.d/default.conf:ro
    ports:
      - "5600:5600" # Expose API to host
      - "5601:5601" # Expose SeaweedFS to host
    healthcheck:
      test: [ "CMD", "curl", "-f", "http://localhost:5600/docs/splashtail" ]
      interval: 3s
      timeout: 10s
      retries: 3

networks:
  antiraid_infra:
    name: antiraid_infra
    driver: bridge
    internal: false # Infra needs external access
  antiraid_internal:
    name: antiraid_internal
    driver: bridge
    internal: true # No external net access
  antiraid_api:
    # Used for internal API services (Redis/Nginx etc.)
    name: antiraid_api
    driver: bridge
    internal: true # No external net access
  antiraid_seaweed:
    # Used for internal SeaweedFS services
    name: antiraid_seaweed
    driver: bridge
    internal: true # No external net access
  antiraid_jobserver:
    # Used for internal Jobserver services
    name: antiraid_jobserver
    driver: bridge
    internal: false # Needs external net access

Reproduce

  1. docker compose up
  2. Try the above curl commands
  3. The commands randomly succeed or fail from one docker compose up run to the next (see the loop sketch below)
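
For reference, the intermittent failure can be exercised in a loop (a small illustrative sketch, not part of the original report; it assumes the compose stack above is running and reuses its service names):

# Repeat the container-to-container request and report curl's exit status
# (exit 7 = could not connect, matching the failures shown above)
for i in $(seq 1 20); do
  docker exec api curl -sS --max-time 2 -o /dev/null http://seaweed:8333 \
    && echo "attempt $i: connected" \
    || echo "attempt $i: curl exit $?"
  sleep 1
done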

Expected behavior

The above curl commands should either always succeed or, if the network config above is wrong (which I doubt), always fail - not randomly flip between the two.

docker version

Client: Docker Engine - Community
 Version:           28.0.0
 API version:       1.48
 Go version:        go1.23.6
 Git commit:        f9ced58
 Built:             Wed Feb 19 22:11:04 2025
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          28.0.0
  API version:      1.48 (minimum version 1.24)
  Go version:       go1.23.6
  Git commit:       af898ab
  Built:            Wed Feb 19 22:11:04 2025
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.7.25
  GitCommit:        bcc810d6b9066471b0b6fa75f557a15a1cbf31bb
 runc:
  Version:          1.2.4
  GitCommit:        v1.2.4-0-g6c52b3f
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

docker info

Client: Docker Engine - Community
 Version:    28.0.0
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.21.0
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.33.0
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 11
  Running: 11
  Paused: 0
  Stopped: 0
 Images: 84
 Server Version: 28.0.0
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: bcc810d6b9066471b0b6fa75f557a15a1cbf31bb
 runc version: v1.2.4-0-g6c52b3f
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
 Kernel Version: 5.15.167.4-microsoft-standard-WSL2
 Operating System: Ubuntu 24.04.1 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 20
 Total Memory: 7.592GiB
 Name: REDACTED
 ID: 5893e8ad-7ea5-4940-a268-0cd7dff1f4f0
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  ::1/128
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: No blkio throttle.read_bps_device support
WARNING: No blkio throttle.write_bps_device support
WARNING: No blkio throttle.read_iops_device support
WARNING: No blkio throttle.write_iops_device support

Additional Info

This is running in a WSL environment:

frostpaw@REDACTED:~/staging/services$ uname -a
Linux REDACTED 5.15.167.4-microsoft-standard-WSL2 #1 SMP Tue Nov 5 00:21:55 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

All instances of REDACTED are my computer's hostname, omitted for privacy reasons.

cheesycod avatar Feb 24 '25 20:02 cheesycod

Hi @cheesycod - this is very likely to be fixed by https://github.com/moby/moby/pull/49518, there's an issue in 28.0.0 with rules getting out of order when there are extra rules in the iptables filter-FORWARD chain.

If you want to send the output of iptables -nvL from when it's broken, I can double-check.

The fixes should be available in a 28.0.1 release in the next couple of days.

robmry avatar Feb 25 '25 09:02 robmry
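
For anyone checking rule order on their own host, the filter-FORWARD chain can be listed with explicit rule numbers (standard iptables invocations, shown here only for convenience):

# List the FORWARD chain with rule numbers so out-of-order Docker rules
# (or extra third-party rules ahead of them) are easy to spot
iptables -nvL FORWARD --line-numbers

# Rule-spec form, convenient for diffing a working state against a broken one
iptables -S FORWARD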

Hi @cheesycod - this is very likely to be fixed by #49518, there's an issue in 28.0.0 with rules getting out of order when there are extra rules in the iptables filter-FORWARD chain.

If you want to send the output of iptables -nvL from when it's broken, I can double-check.

The fixes should be available in a 28.0.1 release in the next couple of days.

I have already downgraded to 27.5.1 but if it happens again in 28.0.1 or 27.5.1, I’ll update you!

[this is my school/uni account]

rhit-sashiks avatar Feb 25 '25 17:02 rhit-sashiks

Hi @cheesycod - this is very likely to be fixed by #49518, there's an issue in 28.0.0 with rules getting out of order when there are extra rules in the iptables filter-FORWARD chain.

If you want to send the output of iptables -nvL from when it's broken, I can double-check.

The fixes should be available in a 28.0.1 release in the next couple of days.

This persists in 27.5.1 as well

iptables -nvL output:

Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain FORWARD (policy DROP 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         
 5801 2899K DOCKER-USER  0    --  *      *       0.0.0.0/0            0.0.0.0/0           
 5801 2899K DOCKER-ISOLATION-STAGE-1  0    --  *      *       0.0.0.0/0            0.0.0.0/0           
    0     0 ACCEPT     0    --  *      docker0  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
    0     0 DOCKER     0    --  *      docker0  0.0.0.0/0            0.0.0.0/0           
    0     0 ACCEPT     0    --  docker0 !docker0  0.0.0.0/0            0.0.0.0/0           
    0     0 ACCEPT     0    --  docker0 docker0  0.0.0.0/0            0.0.0.0/0           
   41 48479 ACCEPT     0    --  *      br-e97df219409b  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
    0     0 DOCKER     0    --  *      br-e97df219409b  0.0.0.0/0            0.0.0.0/0           
   27  3732 ACCEPT     0    --  br-e97df219409b !br-e97df219409b  0.0.0.0/0            0.0.0.0/0           
    0     0 ACCEPT     0    --  br-e97df219409b br-e97df219409b  0.0.0.0/0            0.0.0.0/0           
   70  7919 ACCEPT     0    --  br-852e6920e44f br-852e6920e44f  0.0.0.0/0            0.0.0.0/0           
  710  485K ACCEPT     0    --  *      br-7828a5046782  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
    1    60 DOCKER     0    --  *      br-7828a5046782  0.0.0.0/0            0.0.0.0/0           
  632  171K ACCEPT     0    --  br-7828a5046782 !br-7828a5046782  0.0.0.0/0            0.0.0.0/0           
    1    60 ACCEPT     0    --  br-7828a5046782 br-7828a5046782  0.0.0.0/0            0.0.0.0/0           
  558  226K ACCEPT     0    --  br-3d5e05d68993 br-3d5e05d68993  0.0.0.0/0            0.0.0.0/0           
 3762 1955K ACCEPT     0    --  br-3237097f63b8 br-3237097f63b8  0.0.0.0/0            0.0.0.0/0           
    0     0 ACCEPT     0    --  br-f90df49ea7de br-f90df49ea7de  0.0.0.0/0            0.0.0.0/0           

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain DOCKER (3 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 ACCEPT     6    --  !br-7828a5046782 br-7828a5046782  0.0.0.0/0            172.21.0.2           tcp dpt:3600
    0     0 ACCEPT     6    --  !br-7828a5046782 br-7828a5046782  0.0.0.0/0            172.21.0.2           tcp dpt:3931
    0     0 ACCEPT     6    --  !br-7828a5046782 br-7828a5046782  0.0.0.0/0            172.21.0.2           tcp dpt:29334
    0     0 ACCEPT     6    --  !br-7828a5046782 br-7828a5046782  0.0.0.0/0            172.21.0.3           tcp dpt:3221
    0     0 ACCEPT     6    --  !br-7828a5046782 br-7828a5046782  0.0.0.0/0            172.21.0.4           tcp dpt:5600
    0     0 ACCEPT     6    --  !br-7828a5046782 br-7828a5046782  0.0.0.0/0            172.21.0.4           tcp dpt:5601

Chain DOCKER-ISOLATION-STAGE-1 (1 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 DOCKER-ISOLATION-STAGE-2  0    --  docker0 !docker0  0.0.0.0/0            0.0.0.0/0           
   27  3732 DOCKER-ISOLATION-STAGE-2  0    --  br-e97df219409b !br-e97df219409b  0.0.0.0/0            0.0.0.0/0           
    0     0 DROP       0    --  *      br-852e6920e44f !172.22.0.0/16        0.0.0.0/0           
    0     0 DROP       0    --  br-852e6920e44f *       0.0.0.0/0           !172.22.0.0/16       
  632  171K DOCKER-ISOLATION-STAGE-2  0    --  br-7828a5046782 !br-7828a5046782  0.0.0.0/0            0.0.0.0/0           
    0     0 DROP       0    --  *      br-3d5e05d68993 !172.23.0.0/16        0.0.0.0/0           
    0     0 DROP       0    --  br-3d5e05d68993 *       0.0.0.0/0           !172.23.0.0/16       
    0     0 DROP       0    --  *      br-3237097f63b8 !172.18.0.0/16        0.0.0.0/0           
    0     0 DROP       0    --  br-3237097f63b8 *       0.0.0.0/0           !172.18.0.0/16       
    0     0 DROP       0    --  *      br-f90df49ea7de !172.19.0.0/16        0.0.0.0/0           
    0     0 DROP       0    --  br-f90df49ea7de *       0.0.0.0/0           !172.19.0.0/16       
 5801 2899K RETURN     0    --  *      *       0.0.0.0/0            0.0.0.0/0           

Chain DOCKER-ISOLATION-STAGE-2 (3 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 DROP       0    --  *      docker0  0.0.0.0/0            0.0.0.0/0           
    0     0 DROP       0    --  *      br-e97df219409b  0.0.0.0/0            0.0.0.0/0           
    0     0 DROP       0    --  *      br-7828a5046782  0.0.0.0/0            0.0.0.0/0           
  659  175K RETURN     0    --  *      *       0.0.0.0/0            0.0.0.0/0           

Chain DOCKER-USER (1 references)
 pkts bytes target     prot opt in     out     source               destination         
 5801 2899K RETURN     0    --  *      *       0.0.0.0/0            0.0.0.0/0           

cheesycod avatar Feb 25 '25 23:02 cheesycod

This persists in 27.5.1 as well

Thank you @cheesycod - I don't see a problem with those rules. They look like 27.x rules, after a flush or reboot (if it's after a downgrade)?

No packets seem to be getting dropped - is the dump from a run where it was failing?

Had it been working on 27.x, and is it now broken following the downgrade - or has it never worked properly on 27.x either?

Could you send the nat table too? (iptables -nvL -t nat).

robmry avatar Feb 26 '25 08:02 robmry

Moby 28.0.1 is available now ... although I can't spot an issue in the iptables dump above, it might be worth a try.

If it still doesn't work - it'd be useful to see iptables -nvL and iptables -nvL -t nat, after making some failed requests (to make sure the packet counters in the iptables dumps have something to show, if iptables is the issue).

robmry avatar Feb 26 '25 14:02 robmry
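
In case it helps, the requested dumps can be captured immediately after a failed request, so the packet counters have something to show (a sketch; the output paths are arbitrary and the failing curl is the one from the description above):

# Make a request that fails, e.g. the one from the description
docker exec api curl -sS --max-time 2 http://seaweed:8333 || true

# Snapshot both tables right away
iptables -nvL > /tmp/iptables-filter.txt
iptables -nvL -t nat > /tmp/iptables-nat.txt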

Same issue with 28.0.1 (Docker version 28.0.1, build 068a01e).

It seems to be randomly caused by a DinD container service (also version 28.0.1), which gets started (and removed) from a GitLab CI/CD pipeline by a gitlab-runner on the same machine.

:~# iptables -nvL
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain FORWARD (policy DROP 7216 packets, 463K bytes)
 pkts bytes target     prot opt in     out     source               destination
  114 18733 DOCKER-USER  0    --  *      *       0.0.0.0/0            0.0.0.0/0
  121 20073 DOCKER-FORWARD  0    --  *      *       0.0.0.0/0            0.0.0.0/0

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain DOCKER (16 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 DROP       0    --  !docker0 docker0  0.0.0.0/0            0.0.0.0/0

Chain DOCKER-BRIDGE (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 DOCKER     0    --  *      br-65686538b6c3  0.0.0.0/0            0.0.0.0/0
    0     0 DOCKER     0    --  *      br-755b77a90cb0  0.0.0.0/0            0.0.0.0/0
    0     0 DOCKER     0    --  *      br-bb3a8cd93a85  0.0.0.0/0            0.0.0.0/0
    0     0 DOCKER     0    --  *      br-1b7b5c0abfb7  0.0.0.0/0            0.0.0.0/0
    0     0 DOCKER     0    --  *      br-4412a5d2db9b  0.0.0.0/0            0.0.0.0/0
    0     0 DOCKER     0    --  *      br-711e87b1bfcb  0.0.0.0/0            0.0.0.0/0
    0     0 DOCKER     0    --  *      br-97ec4c2cbbd3  0.0.0.0/0            0.0.0.0/0
    0     0 DOCKER     0    --  *      br-99916fd13930  0.0.0.0/0            0.0.0.0/0
16405  981K DOCKER     0    --  *      br-b44f4a949bbc  0.0.0.0/0            0.0.0.0/0
    0     0 DOCKER     0    --  *      br-d8a6fd7457a7  0.0.0.0/0            0.0.0.0/0
    0     0 DOCKER     0    --  *      br-ea9158ec914f  0.0.0.0/0            0.0.0.0/0
    0     0 DOCKER     0    --  *      br-0095e160b80c  0.0.0.0/0            0.0.0.0/0
    0     0 DOCKER     0    --  *      br-cab8f4420925  0.0.0.0/0            0.0.0.0/0
    0     0 DOCKER     0    --  *      br-fdf1de3f3140  0.0.0.0/0            0.0.0.0/0
    0     0 DOCKER     0    --  *      br-30419ed6c2ac  0.0.0.0/0            0.0.0.0/0
    0     0 DOCKER     0    --  *      docker0  0.0.0.0/0            0.0.0.0/0

Chain DOCKER-CT (1 references)
 pkts bytes target     prot opt in     out     source               destination
 9589 1601K ACCEPT     0    --  *      br-65686538b6c3  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
    0     0 ACCEPT     0    --  *      br-755b77a90cb0  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
15257   36M ACCEPT     0    --  *      br-bb3a8cd93a85  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
  126  383K ACCEPT     0    --  *      br-1b7b5c0abfb7  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
    0     0 ACCEPT     0    --  *      br-4412a5d2db9b  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
    0     0 ACCEPT     0    --  *      br-711e87b1bfcb  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
    0     0 ACCEPT     0    --  *      br-97ec4c2cbbd3  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
    0     0 ACCEPT     0    --  *      br-99916fd13930  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
 660K   61M ACCEPT     0    --  *      br-b44f4a949bbc  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
    0     0 ACCEPT     0    --  *      br-d8a6fd7457a7  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
13143   10M ACCEPT     0    --  *      br-ea9158ec914f  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
    0     0 ACCEPT     0    --  *      br-0095e160b80c  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
    0     0 ACCEPT     0    --  *      br-cab8f4420925  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
    0     0 ACCEPT     0    --  *      br-fdf1de3f3140  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
    0     0 ACCEPT     0    --  *      br-30419ed6c2ac  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
    0     0 ACCEPT     0    --  *      docker0  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED

Chain DOCKER-FORWARD (1 references)
 pkts bytes target     prot opt in     out     source               destination
  121 20073 DOCKER-CT  0    --  *      *       0.0.0.0/0            0.0.0.0/0
   90 13751 DOCKER-ISOLATION-STAGE-1  0    --  *      *       0.0.0.0/0            0.0.0.0/0
   90 13751 DOCKER-BRIDGE  0    --  *      *       0.0.0.0/0            0.0.0.0/0
    0     0 ACCEPT     0    --  docker0 *       0.0.0.0/0            0.0.0.0/0

Chain DOCKER-ISOLATION-STAGE-1 (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 DOCKER-ISOLATION-STAGE-2  0    --  docker0 !docker0  0.0.0.0/0            0.0.0.0/0

Chain DOCKER-ISOLATION-STAGE-2 (1 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 DROP       0    --  *      docker0  0.0.0.0/0            0.0.0.0/0

Chain DOCKER-USER (1 references)
 pkts bytes target     prot opt in     out     source               destination
 1045K 3755M RETURN     0    --  *      *       0.0.0.0/0            0.0.0.0/0

Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
  205 12316 DOCKER     0    --  *      *       0.0.0.0/0            0.0.0.0/0            ADDRTYPE match dst-type LOCAL

Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
    0     0 DOCKER     0    --  *      *       0.0.0.0/0           !127.0.0.0/8          ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
    0     0 MASQUERADE  0    --  *      !docker0  172.17.0.0/16        0.0.0.0/0
    0     0 MASQUERADE  0    --  *      !br-30419ed6c2ac  192.168.112.0/20     0.0.0.0/0
    0     0 MASQUERADE  0    --  *      !br-fdf1de3f3140  172.22.0.0/16        0.0.0.0/0
    0     0 MASQUERADE  0    --  *      !br-cab8f4420925  172.25.0.0/16        0.0.0.0/0
    0     0 MASQUERADE  0    --  *      !br-0095e160b80c  192.168.160.0/20     0.0.0.0/0
 1062 63720 MASQUERADE  0    --  *      !br-ea9158ec914f  172.24.0.0/16        0.0.0.0/0
    0     0 MASQUERADE  0    --  *      !br-d8a6fd7457a7  192.168.128.0/20     0.0.0.0/0
  229 13740 MASQUERADE  0    --  *      !br-b44f4a949bbc  172.18.0.0/16        0.0.0.0/0
    0     0 MASQUERADE  0    --  *      !br-99916fd13930  192.168.176.0/20     0.0.0.0/0
    0     0 MASQUERADE  0    --  *      !br-97ec4c2cbbd3  192.168.144.0/20     0.0.0.0/0
    0     0 MASQUERADE  0    --  *      !br-711e87b1bfcb  172.29.0.0/16        0.0.0.0/0
    0     0 MASQUERADE  0    --  *      !br-4412a5d2db9b  172.23.0.0/16        0.0.0.0/0
    2   120 MASQUERADE  0    --  *      !br-1b7b5c0abfb7  172.19.0.0/16        0.0.0.0/0
  379 22764 MASQUERADE  0    --  *      !br-bb3a8cd93a85  172.26.0.0/16        0.0.0.0/0
    0     0 MASQUERADE  0    --  *      !br-755b77a90cb0  192.168.96.0/20      0.0.0.0/0
  342 20520 MASQUERADE  0    --  *      !br-65686538b6c3  172.20.0.0/16        0.0.0.0/0
    0     0 MASQUERADE  0    --  *      *       172.21.0.0/16        0.0.0.0/0

Chain DOCKER (2 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 RETURN     0    --  docker0 *       0.0.0.0/0            0.0.0.0/0

TheEvilCoder42 avatar Mar 05 '25 14:03 TheEvilCoder42

Seems to be related to a DinD container service (also version 28.0.1), which gets started (and removed) from a GitLab CI/CD pipeline by a gitlab-runner on the same machine.

Hi @TheEvilCoder42 ... so, you have two instances of dockerd running on the same host, or have I misunderstood?

(They'll certainly interfere with each other, I don't think it would have worked reliably with any release?)

robmry avatar Mar 05 '25 14:03 robmry

No, there is only one instance of dockerd running.

The machine also runs the gitlab-runner service, which pulls CI/CD pipeline jobs from GitLab and runs them in containers. A DinD service in the pipeline (another container using the docker:dind image) seems to sporadically trigger a loss of network connectivity for all containers.

TheEvilCoder42 avatar Mar 05 '25 15:03 TheEvilCoder42

Got it - thank you. Is the DinD container running with --network host?

In the iptables dump, there are rules for a lot of bridge networks in DOCKER-BRIDGE and DOCKER-CT, but not in the DOCKER-ISOLATION chains. On startup the daemon flushes most of the chains - but not those two new ones (which is a bug, https://github.com/moby/moby/pull/49582).

So, it looks like that might have happened to the outer docker's rules.

robmry avatar Mar 05 '25 15:03 robmry
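
One rough way to spot leftover rules from another daemon is to compare the bridge interfaces the host's dockerd currently manages with the ones referenced in the DOCKER-BRIDGE and DOCKER-CT chains mentioned above (a sketch; it assumes the default br-<network-id> interface naming and will not account for networks using a custom bridge name):

# Bridge interfaces the host daemon should be managing
docker network ls --filter driver=bridge -q | sed 's/^/br-/' | sort > /tmp/known-bridges

# Bridge interfaces referenced in the 28.0 DOCKER-BRIDGE/DOCKER-CT chains
{ iptables -S DOCKER-BRIDGE; iptables -S DOCKER-CT; } | grep -o 'br-[0-9a-f]*' | sort -u > /tmp/chain-bridges

# Anything printed here is referenced in the chains but unknown to the daemon
comm -13 /tmp/known-bridges /tmp/chain-bridges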

Yes, exactly - the gitlab-runner is configured to use --network host.

Forgot to mention: there are no additional iptables rules and no iptables persistence, and a docker service restart fixes the issue (at least until a build sporadically triggers it again).

Also I'm not entirely sure if this could be related to this issue: https://github.com/docker-library/docker/issues/463 The machine runs Debian 12 Bookworm (Linux REDACTED 6.1.0-31-cloud-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.128-1 (2025-02-07) x86_64 GNU/Linux)

TheEvilCoder42 avatar Mar 05 '25 15:03 TheEvilCoder42

Yes, exactly - the gitlab-runner is configured to use --network host.

Forgot to mention: there are no additional iptables rules and no iptables persistence, and a docker service restart fixes the issue (at least until a build sporadically triggers it again).

Have been travelling so I couldn't test with 28.0.1 (though I did manage to reproduce it even on 27); in my case the issue occurs purely with bridge networks and no host networking

cheesycod avatar Mar 05 '25 15:03 cheesycod

Yes, exactly - the gitlab-runner is configured to use --network host.

Ok, I don't think that can work. From the networking perspective, it's the same as running two docker daemons on the host, they will interfere with each other.

Restarting the daemon on the host fixes it by re-creating all the rules needed by Docker on the host machine.

I'm wondering if it ended up a bit less broken before 28.0 because some rules we've now moved out of the FORWARD chain wouldn't have been flushed when the DinD daemon started. But, even then, I think network isolation, port mappings, and various things would have stopped working on the host.

Does the runner need to use the host's network?

Also I'm not entirely sure if this could be related to this issue: docker-library/docker#463 The machine runs Debian 12 Bookworm (Linux REDACTED 6.1.0-31-cloud-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.128-1 (2025-02-07) x86_64 GNU/Linux)

I think that's unrelated, to do with mixing nftables and legacy xtables.

robmry avatar Mar 05 '25 15:03 robmry

Have been travelling so I couldn't test with 28.0.1 (though I did manage to reproduce it even on 27); in my case the issue occurs purely with bridge networks and no host networking

Thanks @cheesycod - it sounds like these are separate issues. When you have time, it'd be good to collect iptables dumps for the working and not-working states to see what's changed - assuming it's iptables related.

robmry avatar Mar 05 '25 15:03 robmry

Ok, I don't think that can work. From the networking perspective, it's the same as running two docker daemons on the host, they will interfere with each other.

Restarting the daemon on the host fixes it by re-creating all the rules needed by Docker on the host machine.

I'm wondering if it ended up a bit less broken before 28.0 because some rules we've now moved out of the FORWARD chain wouldn't have been flushed when the DinD daemon started. But, even then, I think network isolation, port mappings, and various things would have stopped working on the host.

Does the runner need to use the host's network?

Well, that sounds reasonable - it's now configured to use --network bridge, since this runner doesn't strictly need any VPN or DNS foolery and also runs other containers besides being a runner. It seems it was a bit less broken before 28.0, since it worked without any issue so far (despite some warnings coming from the DinD service).

I think that's unrelated, to do with mixing nftables and legacy xtables.

Alright, thanks for your input!

Thank you very much, let's hope this fixed my problem :).

@cheesycod Sorry to have hijacked your issue :).

TheEvilCoder42 avatar Mar 05 '25 16:03 TheEvilCoder42
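
For readers with a similar gitlab-runner + DinD setup, the change described above lives in the runner's config.toml (a sketch only; keys other than network_mode are illustrative context, and the image tag is an assumption):

# /etc/gitlab-runner/config.toml (relevant excerpt)
[[runners]]
  executor = "docker"
  [runners.docker]
    image        = "docker:28.0.1"
    privileged   = true       # still required for docker:dind services
    network_mode = "bridge"   # instead of "host", so the inner dockerd's
                              # iptables changes stay in its own namespace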

Thank you very much, let's hope this fixed my problem :).

Thanks @TheEvilCoder42 ... if it's not fixed, please do raise a new issue.

robmry avatar Mar 05 '25 16:03 robmry

Have been travelling so I couldn't test with 28.0.1 (though I did manage to reproduce it even on 27); in my case the issue occurs purely with bridge networks and no host networking

Thanks @cheesycod - it sounds like these are separate issues. When you have time, it'd be good to collect iptables dumps for the working and not-working states to see what's changed - assuming it's iptables related.

Alright, an update on this issue: things magically became more reliable all of a sudden with no change on my part. I'll report back once I manage to reproduce this bug again (it'll probably resurface tomorrow after a few more service restarts for updates - I love random race conditions and conflicting stuff).

cheesycod avatar Mar 05 '25 22:03 cheesycod

Thanks @cheesycod.

robmry avatar Mar 05 '25 22:03 robmry