
Lost inbound network connectivity after podman-auto-update

kanru opened this issue 11 months ago

It might be a podman issue, but the error is related to the nftables rule set, so I'm opening it here.

Two days ago there was, I think, a new Caddy release, so podman-auto-update pulled the new image and tried to restart the container. This triggered the error:

Jan 09 00:07:09 load-balancer[26324]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 09 00:07:09 load-balancer[26324]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 09 00:07:09 load-balancer[26324]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 09 00:07:09 load-balancer[26324]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 09 00:07:09 systemd[1]: run-netns-netns\x2dcddd32e3\x2d3970\x2d4cb6\x2d1012\x2d3f8cb9ffc61a.mount: Deactivated successfully.
Jan 09 00:07:09 systemd[1]: var-lib-containers-storage-overlay\x2dcontainers-dcebc518ea669a309e70da0793585f60232b7f6b69e2b611a76532068394dbbc-userdata-shm.mount: Deactivated successfully.
Jan 09 00:07:09 systemd[1]: var-lib-containers-storage-overlay-8e134e8e755db12cf96ce3266cadaf37bb5e4a25ff40f22464d16802905958fa-merged.mount: Deactivated successfully.
Jan 09 00:07:09 audit: BPF prog-id=87 op=UNLOAD
Jan 09 00:07:09 podman[26293]: 2025-01-09 00:07:09.257042473 +0000 UTC m=+10.194886602 container remove dcebc518ea669a309e70da0793585f60232b7f6b69e2b611a76532068394dbbc (image=docker.io/library/caddy:latest, name=systemd-load-balancer, io.containers.autoupdate=registry, org.opencontainers.image.title=Caddy, org.ope>
Jan 09 00:07:09 load-balancer[26293]: Error: cleaning up container dcebc518ea669a309e70da0793585f60232b7f6b69e2b611a76532068394dbbc: removing container dcebc518ea669a309e70da0793585f60232b7f6b69e2b611a76532068394dbbc network: netavark: netavark encountered multiple errors:
Jan 09 00:07:09 load-balancer[26293]:         - nftables error: nft did not return successfully while applying ruleset
Jan 09 00:07:09 load-balancer[26293]:         - nftables error: nft did not return successfully while applying ruleset

Then, after the container restarted, inbound traffic to all containers was dropped.

Operating System: Fedora CoreOS 41.20241215.3.0

Podman info

host:
  arch: amd64
  buildahVersion: 1.38.0
  cgroupControllers:
  - cpuset
  - cpu
  - io
  - memory
  - hugetlb
  - pids
  - rdma
  - misc
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.12-3.fc41.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.12, commit: '
  cpuUtilization:
    idlePercent: 99.27
    systemPercent: 0.22
    userPercent: 0.51
  cpus: 6
  databaseBackend: boltdb
  distribution:
    distribution: fedora
    variant: coreos
    version: "41"
  eventLogger: journald
  freeLocks: 2039
  hostname: qwty8.kanru.info
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 6.11.11-300.fc41.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 21543694336
  memTotal: 25191452672
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.13.1-1.fc41.x86_64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.13.1
    package: netavark-1.13.1-1.fc41.x86_64
    path: /usr/libexec/podman/netavark
    version: netavark 1.13.1
  ociRuntime:
    name: crun
    package: crun-1.19-1.fc41.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.19
      commit: db31c42ac46e20b5527f5339dcbf6f023fcd539c
      rundir: /run/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20241211.g09478d5-1.fc41.x86_64
    version: |
      pasta 0^20241211.g09478d5-1.fc41.x86_64
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: true
    path: /run/podman/podman.sock
  rootlessNetworkCmd: pasta
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.3.1-1.fc41.x86_64
    version: |-
      slirp4netns version 1.3.1
      commit: e5e368c4f5db6ae75c2fce786e31eef9da6bf236
      libslirp: 4.8.0
      SLIRP_CONFIG_VERSION_MAX: 5
      libseccomp: 2.5.5
  swapFree: 4294963200
  swapTotal: 4294963200
  uptime: 11h 40m 36.00s (Approximately 0.46 days)
  variant: ""
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
store:
  configFile: /usr/share/containers/storage.conf
  containerStore:
    number: 5
    paused: 0
    running: 5
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.imagestore: /usr/lib/containers/storage
    overlay.mountopt: nodev,metacopy=on
  graphRoot: /var/lib/containers/storage
  graphRootAllocated: 99387158528
  graphRootUsed: 50946629632
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Supports shifting: "true"
    Supports volatile: "true"
    Using metacopy: "true"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 42
  runRoot: /run/containers/storage
  transientStore: false
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 5.3.1
  Built: 1732147200
  BuiltTime: Thu Nov 21 00:00:00 2024
  GitCommit: ""
  GoVersion: go1.23.3
  Os: linux
  OsArch: linux/amd64
  Version: 5.3.1

Full related logs

Jan 10 00:03:13 systemd[1]: Starting podman-auto-update.service - Podman auto-update service...
Jan 10 00:03:13 podman[17275]: 2025-01-10 00:03:13.700872552 +0000 UTC m=+0.044785826 system auto-update
Jan 10 00:03:18 podman[17275]: Trying to pull docker.io/library/caddy:latest...
Jan 10 00:03:20 podman[17275]: Getting image source signatures
Jan 10 00:03:20 podman[17275]: Copying blob sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1
Jan 10 00:03:20 podman[17275]: Copying blob sha256:9fb01e15d1864a38be41f49f39a622b31434210052f931f9e9197363f1da47b2
Jan 10 00:03:20 podman[17275]: Copying blob sha256:638a7610da93c8e5c2485069dfc98b710644ee74a110786155db6ae7f3aab06a
Jan 10 00:03:20 podman[17275]: Copying blob sha256:66a3d608f3fa52124f8463e9467f170c784abd549e8216aa45c6960b00b4b79b
Jan 10 00:03:20 podman[17275]: Copying blob sha256:785d93e6656c9f15f1630bc96799a8bf04b324e1156ff8441c617fc0d9ad8017
Jan 10 00:03:21 podman[17275]: Copying config sha256:1b7d0a82297a0ab4d6f0a855f790d0208a975163c02e307c171de9674d23a4a0
Jan 10 00:03:21 podman[17275]: Writing manifest to image destination
Jan 10 00:03:21 systemd[1]: Stopping load-balancer.service - Caddy as Load Balancer...
Jan 10 00:03:21 podman[17275]: 2025-01-10 00:03:21.851402688 +0000 UTC m=+8.195315982 image pull 1b7d0a82297a0ab4d6f0a855f790d0208a975163c02e307c171de9674d23a4a0 docker.io/library/caddy
Jan 10 00:03:21 systemd-load-balancer[1951]: {"level":"info","ts":1736467401.927142,"msg":"shutting down apps, then terminating","signal":"SIGTERM"}
Jan 10 00:03:21 systemd-load-balancer[1951]: {"level":"warn","ts":1736467401.9271975,"msg":"exiting; byeee!! 👋","signal":"SIGTERM"}
Jan 10 00:03:21 systemd-load-balancer[1951]: {"level":"info","ts":1736467401.9273078,"logger":"http","msg":"servers shutting down with eternal grace period"}
Jan 10 00:03:31 load-balancer[17352]: time="2025-01-10T00:03:31Z" level=warning msg="StopSignal SIGTERM failed to stop container systemd-load-balancer in 10 seconds, resorting to SIGKILL"
Jan 10 00:03:31 podman[17352]: 2025-01-10 00:03:31.94344011 +0000 UTC m=+10.061971295 container died ddc37145ef0449e05a7e81ec35ac5746207deaea5ac4f1d02c52448a27b6a8ba (image=docker.io/library/caddy:latest, name=systemd-load-balancer, org.opencontainers.image.documentation=https://caddyserver.com/docs, org.opencontainers.image.licenses=Apache-2.0, org.opencontainers.image.url=https://caddyserver.com, org.opencontainers.image.description=a powerful, enterprise-ready, open source web server with automatic HTTPS written in Go, io.containers.autoupdate=registry, org.opencontainers.image.source=https://github.com/caddyserver/caddy-docker, org.opencontainers.image.title=Caddy, org.opencontainers.image.vendor=Light Code Labs, org.opencontainers.image.version=v2.8.4, PODMAN_SYSTEMD_UNIT=load-balancer.service)
Jan 10 00:03:31 kernel: podman2: port 2(veth3) entered disabled state
Jan 10 00:03:31 audit: ANOM_PROMISCUOUS dev=veth3 prom=0 old_prom=256 auid=4294967295 uid=0 gid=0 ses=4294967295
Jan 10 00:03:31 kernel: veth3 (unregistering): left allmulticast mode
Jan 10 00:03:31 kernel: veth3 (unregistering): left promiscuous mode
Jan 10 00:03:31 kernel: podman2: port 2(veth3) entered disabled state
Jan 10 00:03:32 load-balancer[17387]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17387]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17387]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17387]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17387]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17387]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17387]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17387]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17387]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17387]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17387]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17387]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17387]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17387]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17387]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17387]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17387]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17387]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17387]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17387]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17387]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17387]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17387]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17387]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17387]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17387]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17387]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17387]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17387]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17387]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 audit: ANOM_PROMISCUOUS dev=veth2 prom=0 old_prom=256 auid=4294967295 uid=0 gid=0 ses=4294967295
Jan 10 00:03:32 kernel: podman1: port 2(veth2) entered disabled state
Jan 10 00:03:32 kernel: veth2 (unregistering): left allmulticast mode
Jan 10 00:03:32 kernel: veth2 (unregistering): left promiscuous mode
Jan 10 00:03:32 kernel: podman1: port 2(veth2) entered disabled state
Jan 10 00:03:32 load-balancer[17389]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17389]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17389]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17389]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17389]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17389]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17389]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17389]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17389]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17389]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17389]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17389]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17389]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17389]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17389]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17389]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17389]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17389]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17389]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17389]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17389]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17389]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17389]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17389]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17389]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17389]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17389]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17389]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17389]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 load-balancer[17389]: internal:0:0-0: Error: Could not process rule: No such file or directory
Jan 10 00:03:32 systemd[1]: run-netns-netns\x2d95e08078\x2db49d\x2dbe29\x2d1006\x2dca4d40b596d6.mount: Deactivated successfully.
Jan 10 00:03:32 systemd[1]: var-lib-containers-storage-overlay\x2dcontainers-ddc37145ef0449e05a7e81ec35ac5746207deaea5ac4f1d02c52448a27b6a8ba-userdata-shm.mount: Deactivated successfully.
Jan 10 00:03:32 systemd[1]: var-lib-containers-storage-overlay-4858c3a872c773aa37b441f9e2a6cc92e4d8a8d376176e71ca76d1ec2c203aec-merged.mount: Deactivated successfully.
Jan 10 00:03:32 audit: BPF prog-id=86 op=UNLOAD
Jan 10 00:03:32 podman[17352]: 2025-01-10 00:03:32.099220585 +0000 UTC m=+10.217751759 container remove ddc37145ef0449e05a7e81ec35ac5746207deaea5ac4f1d02c52448a27b6a8ba (image=docker.io/library/caddy:latest, name=systemd-load-balancer, org.opencontainers.image.vendor=Light Code Labs, org.opencontainers.image.version=v2.8.4, org.opencontainers.image.source=https://github.com/caddyserver/caddy-docker, org.opencontainers.image.url=https://caddyserver.com, org.opencontainers.image.licenses=Apache-2.0, org.opencontainers.image.title=Caddy, PODMAN_SYSTEMD_UNIT=load-balancer.service, org.opencontainers.image.documentation=https://caddyserver.com/docs, io.containers.autoupdate=registry, org.opencontainers.image.description=a powerful, enterprise-ready, open source web server with automatic HTTPS written in Go)
Jan 10 00:03:32 load-balancer[17352]: Error: cleaning up container ddc37145ef0449e05a7e81ec35ac5746207deaea5ac4f1d02c52448a27b6a8ba: removing container ddc37145ef0449e05a7e81ec35ac5746207deaea5ac4f1d02c52448a27b6a8ba network: netavark: netavark encountered multiple errors:
Jan 10 00:03:32 load-balancer[17352]:         - nftables error: nft did not return successfully while applying ruleset
Jan 10 00:03:32 load-balancer[17352]:         - nftables error: nft did not return successfully while applying ruleset
Jan 10 00:03:32 systemd[1]: load-balancer.service: Control process exited, code=exited, status=125/n/a
Jan 10 00:03:32 systemd[1]: load-balancer.service: Main process exited, code=exited, status=137/n/a
Jan 10 00:03:32 systemd[1]: load-balancer.service: Failed with result 'exit-code'.
Jan 10 00:03:32 systemd[1]: Stopped load-balancer.service - Caddy as Load Balancer.
Jan 10 00:03:32 audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=load-balancer comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
Jan 10 00:03:32 systemd[1]: load-balancer.service: Consumed 4min 5.981s CPU time, 79.3M memory peak.
Jan 10 00:03:32 systemd[1]: Starting load-balancer.service - Caddy as Load Balancer...
Jan 10 00:03:32 podman[17405]: 2025-01-10 00:03:32.259788569 +0000 UTC m=+0.049279552 container create 3ec1647e7973167eb2d3760ac193a59664d7374e66ab867d49675b62c4c4975f (image=docker.io/library/caddy:latest, name=systemd-load-balancer, PODMAN_SYSTEMD_UNIT=load-balancer.service, org.opencontainers.image.vendor=Light Code Labs, org.opencontainers.image.licenses=Apache-2.0, io.containers.autoupdate=registry, org.opencontainers.image.url=https://caddyserver.com, org.opencontainers.image.description=a powerful, enterprise-ready, open source web server with automatic HTTPS written in Go, org.opencontainers.image.documentation=https://caddyserver.com/docs, org.opencontainers.image.version=v2.9.1, org.opencontainers.image.source=https://github.com/caddyserver/caddy-docker, org.opencontainers.image.title=Caddy)
Jan 10 00:03:32 audit: ANOM_PROMISCUOUS dev=veth2 prom=256 old_prom=0 auid=4294967295 uid=0 gid=0 ses=4294967295
Jan 10 00:03:32 kernel: podman2: port 2(veth2) entered blocking state
Jan 10 00:03:32 kernel: podman2: port 2(veth2) entered disabled state
Jan 10 00:03:32 kernel: veth2: entered allmulticast mode
Jan 10 00:03:32 kernel: veth2: entered promiscuous mode
Jan 10 00:03:32 kernel: podman2: port 2(veth2) entered blocking state
Jan 10 00:03:32 kernel: podman2: port 2(veth2) entered forwarding state
Jan 10 00:03:32 NetworkManager[1253]: <info>  [1736467412.2743] manager: (veth2): new Veth device (/org/freedesktop/NetworkManager/Devices/17)
Jan 10 00:03:32 NetworkManager[1253]: <info>  [1736467412.2759] device (veth2): carrier: link connected
Jan 10 00:03:32 audit[17432]: NETFILTER_CFG table=netavark:15 family=1 entries=10 op=nft_register_chain pid=17432 subj=system_u:system_r:iptables_t:s0 comm="nft"
Jan 10 00:03:32 audit[17438]: NETFILTER_CFG table=netavark:16 family=1 entries=50 op=nft_register_rule pid=17438 subj=system_u:system_r:iptables_t:s0 comm="nft"
Jan 10 00:03:32 audit: ANOM_PROMISCUOUS dev=veth3 prom=256 old_prom=0 auid=4294967295 uid=0 gid=0 ses=4294967295
Jan 10 00:03:32 kernel: podman1: port 2(veth3) entered blocking state
Jan 10 00:03:32 kernel: podman1: port 2(veth3) entered disabled state
Jan 10 00:03:32 kernel: veth3: entered allmulticast mode
Jan 10 00:03:32 kernel: veth3: entered promiscuous mode
Jan 10 00:03:32 NetworkManager[1253]: <info>  [1736467412.3317] manager: (veth3): new Veth device (/org/freedesktop/NetworkManager/Devices/18)
Jan 10 00:03:32 kernel: podman1: port 2(veth3) entered blocking state
Jan 10 00:03:32 kernel: podman1: port 2(veth3) entered forwarding state
Jan 10 00:03:32 NetworkManager[1253]: <info>  [1736467412.3334] device (veth3): carrier: link connected
Jan 10 00:03:32 podman[17405]: 2025-01-10 00:03:32.245000825 +0000 UTC m=+0.034491818 image pull 1b7d0a82297a0ab4d6f0a855f790d0208a975163c02e307c171de9674d23a4a0 docker.io/library/caddy
Jan 10 00:03:32 audit[17443]: NETFILTER_CFG table=netavark:17 family=1 entries=10 op=nft_register_chain pid=17443 subj=system_u:system_r:iptables_t:s0 comm="nft"
Jan 10 00:03:32 audit[17445]: NETFILTER_CFG table=netavark:18 family=1 entries=50 op=nft_register_rule pid=17445 subj=system_u:system_r:iptables_t:s0 comm="nft"
Jan 10 00:03:32 audit: BPF prog-id=112 op=LOAD
Jan 10 00:03:32 podman[17405]: 2025-01-10 00:03:32.440096314 +0000 UTC m=+0.229587296 container init 3ec1647e7973167eb2d3760ac193a59664d7374e66ab867d49675b62c4c4975f (image=docker.io/library/caddy:latest, name=systemd-load-balancer, io.containers.autoupdate=registry, org.opencontainers.image.vendor=Light Code Labs, org.opencontainers.image.licenses=Apache-2.0, org.opencontainers.image.documentation=https://caddyserver.com/docs, org.opencontainers.image.title=Caddy, org.opencontainers.image.url=https://caddyserver.com, org.opencontainers.image.description=a powerful, enterprise-ready, open source web server with automatic HTTPS written in Go, org.opencontainers.image.source=https://github.com/caddyserver/caddy-docker, org.opencontainers.image.version=v2.9.1, PODMAN_SYSTEMD_UNIT=load-balancer.service)
Jan 10 00:03:32 podman[17405]: 2025-01-10 00:03:32.443308985 +0000 UTC m=+0.232799978 container start 3ec1647e7973167eb2d3760ac193a59664d7374e66ab867d49675b62c4c4975f (image=docker.io/library/caddy:latest, name=systemd-load-balancer, org.opencontainers.image.source=https://github.com/caddyserver/caddy-docker, org.opencontainers.image.version=v2.9.1, org.opencontainers.image.licenses=Apache-2.0, org.opencontainers.image.url=https://caddyserver.com, org.opencontainers.image.documentation=https://caddyserver.com/docs, PODMAN_SYSTEMD_UNIT=load-balancer.service, org.opencontainers.image.title=Caddy, org.opencontainers.image.description=a powerful, enterprise-ready, open source web server with automatic HTTPS written in Go, io.containers.autoupdate=registry, org.opencontainers.image.vendor=Light Code Labs)
Jan 10 00:03:32 systemd[1]: Started load-balancer.service - Caddy as Load Balancer.
Jan 10 00:03:32 load-balancer[17405]: 3ec1647e7973167eb2d3760ac193a59664d7374e66ab867d49675b62c4c4975f
Jan 10 00:03:32 audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=load-balancer comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jan 10 00:03:32 systemd-load-balancer[17447]: {"level":"info","ts":1736467412.5040653,"msg":"using config from file","file":"/etc/caddy/Caddyfile"}
Jan 10 00:03:32 systemd-load-balancer[17447]: {"level":"info","ts":1736467412.5069592,"msg":"adapted config to JSON","adapter":"caddyfile"}
Jan 10 00:03:32 systemd-load-balancer[17447]: {"level":"warn","ts":1736467412.5069861,"msg":"Caddyfile input is not formatted; run 'caddy fmt --overwrite' to fix inconsistencies","adapter":"caddyfile","file":"/etc/caddy/Caddyfile","line":2}
Jan 10 00:03:32 systemd-load-balancer[17447]: {"level":"info","ts":1736467412.5096412,"logger":"admin","msg":"admin endpoint started","address":"localhost:2019","enforce_origin":false,"origins":["//localhost:2019","//[::1]:2019","//127.0.0.1:2019"]}
Jan 10 00:03:32 systemd-load-balancer[17447]: {"level":"info","ts":1736467412.5099578,"logger":"http.auto_https","msg":"server is listening only on the HTTPS port but has no TLS connection policies; adding one to enable TLS","server_name":"srv0","https_port":443}
Jan 10 00:03:32 systemd-load-balancer[17447]: {"level":"info","ts":1736467412.5099804,"logger":"http.auto_https","msg":"enabling automatic HTTP->HTTPS redirects","server_name":"srv0"}
Jan 10 00:03:32 systemd-load-balancer[17447]: {"level":"info","ts":1736467412.5099936,"logger":"http.auto_https","msg":"enabling automatic HTTP->HTTPS redirects","server_name":"srv1"}
Jan 10 00:03:32 systemd-load-balancer[17447]: {"level":"info","ts":1736467412.5100334,"logger":"tls.cache.maintenance","msg":"started background certificate maintenance","cache":"0xc00054c680"}
Jan 10 00:03:32 systemd-load-balancer[17447]: {"level":"info","ts":1736467412.5126288,"logger":"http","msg":"enabling HTTP/3 listener","addr":":443"}
Jan 10 00:03:32 systemd-load-balancer[17447]: {"level":"info","ts":1736467412.512765,"msg":"failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 7168 kiB, got: 416 kiB). See https://github.com/quic-go/quic-go/wiki/UDP-Buffer-Sizes for details."}
Jan 10 00:03:32 systemd-load-balancer[17447]: {"level":"info","ts":1736467412.5129712,"logger":"http.log","msg":"server running","name":"srv0","protocols":["h1","h2","h3"]}
Jan 10 00:03:32 systemd-load-balancer[17447]: {"level":"info","ts":1736467412.5130284,"logger":"http","msg":"enabling HTTP/3 listener","addr":":8448"}
Jan 10 00:03:32 systemd-load-balancer[17447]: {"level":"info","ts":1736467412.5136302,"logger":"http.log","msg":"server running","name":"srv1","protocols":["h1","h2","h3"]}
Jan 10 00:03:32 systemd-load-balancer[17447]: {"level":"warn","ts":1736467412.5137088,"logger":"http","msg":"HTTP/2 skipped because it requires TLS","network":"tcp","addr":":80"}
Jan 10 00:03:32 systemd-load-balancer[17447]: {"level":"warn","ts":1736467412.5137196,"logger":"http","msg":"HTTP/3 skipped because it requires TLS","network":"tcp","addr":":80"}
Jan 10 00:03:32 systemd-load-balancer[17447]: {"level":"info","ts":1736467412.5137312,"logger":"http.log","msg":"server running","name":"remaining_auto_https_redirects","protocols":["h1","h2","h3"]}
Jan 10 00:03:32 systemd-load-balancer[17447]: {"level":"info","ts":1736467412.5163968,"logger":"tls","msg":"storage cleaning happened too recently; skipping for now","storage":"FileStorage:/data/caddy","instance":"313687f4-8dea-4511-ae5c-d0a6833b8783","try_again":1736553812.516393,"try_again_in":86399.999999669}
Jan 10 00:03:32 systemd-load-balancer[17447]: {"level":"info","ts":1736467412.516543,"logger":"tls","msg":"finished cleaning storage units"}
Jan 10 00:03:32 systemd-load-balancer[17447]: {"level":"info","ts":1736467412.5187922,"msg":"autosaved config (load with --resume flag)","file":"/config/caddy/autosave.json"}
Jan 10 00:03:32 systemd-load-balancer[17447]: {"level":"info","ts":1736467412.5188122,"msg":"serving initial configuration"}
...
Jan 10 00:03:33 podman[17463]: 0bb2a2a4ce822f03640855a09b25143dd046939b4fc54ca0cbda523eb2da9870
Jan 10 00:03:33 podman[17463]: 2025-01-10 00:03:33.895057766 +0000 UTC m=+0.047121982 image remove 0bb2a2a4ce822f03640855a09b25143dd046939b4fc54ca0cbda523eb2da9870 
Jan 10 00:03:33 systemd[1]: podman-auto-update.service: Deactivated successfully.
Jan 10 00:03:33 systemd[1]: Finished podman-auto-update.service - Podman auto-update service.
Jan 10 00:03:33 audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=podman-auto-update comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jan 10 00:03:33 audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=podman-auto-update comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jan 10 00:03:33 systemd[1]: podman-auto-update.service: Consumed 1.696s CPU time, 96.1M memory peak.

kanru avatar Jan 10 '25 12:01 kanru

The auto-update is not really relevant; it just restarts the container.

Please provide a proper reproducer: how is the container created? What network options, ports, etc.?

Before you try to reproduce, reboot (or flush all firewall rules) so that you start from a clean state.

Luap99 avatar Jan 10 '25 12:01 Luap99

The container was created using quadlet. It joins two networks. I can reproduce it reliably on this server even after reboot. I'm not sure if I can reproduce it on another machine. Is there any state info I can fetch for you?

[Unit]
Description=Caddy as Load Balancer
After=network-online.target
Wants=network-online.target

[Container]
Image=docker.io/library/caddy
AutoUpdate=registry
Network=core.network
Network=matrix.network
Volume=/etc/caddy/Caddyfile:/etc/caddy/Caddyfile
Volume=caddy_data:/data
PublishPort=80:80
PublishPort=443:443/tcp
PublishPort=443:443/udp
PublishPort=8448:8448/tcp
PublishPort=8448:8448/udp
PublishPort=[::]:80:80
PublishPort=[::]:443:443/tcp
PublishPort=[::]:443:443/udp
PublishPort=[::]:8448:8448/tcp
PublishPort=[::]:8448:8448/udp

[Service]
Restart=always
TimeoutStartSec=900

[Install]
WantedBy=multi-user.target

kanru avatar Jan 10 '25 12:01 kanru

I assume your networks have both IPv4 and IPv6?

PublishPort=[::]:80:80
PublishPort=[::]:443:443/tcp
PublishPort=[::]:443:443/udp
PublishPort=[::]:8448:8448/tcp
PublishPort=[::]:8448:8448/udp

To work around it you can remove these lines; they are really not needed. By default it will forward both IPv4 and IPv6.
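
With those lines removed, the port section of the quadlet above is enough for dual-stack forwarding (a minimal sketch of the suggested workaround, using the same ports as your unit):

PublishPort=80:80
PublishPort=443:443/tcp
PublishPort=443:443/udp
PublishPort=8448:8448/tcp
PublishPort=8448:8448/udp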

Here is a short reproducer:

sudo podman network create --disable-dns --ipv6  n1
sudo podman network create --disable-dns --ipv6  n2
sudo podman run --network n1 --network n2 -p 8080:8080/tcp -p [::]:8080:8080/tcp  quay.io/libpod/testimage:20241011 ip a

Then run sudo nft list table inet netavark to show the leaked rules.
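
A quick way to see the duplicates is to count the DNAT rules for the published port after each run (a rough check, assuming the reproducer above with port 8080):

sudo nft list table inet netavark | grep -c 'dport 8080 dnat'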

Luap99 avatar Jan 10 '25 13:01 Luap99

Yes, the two networks have IPv6 enabled. I guess I was trying to make sure I could connect via IPv6.

The workaround did the trick. Now the container restarts correctly and inbound network still works.

kanru avatar Jan 10 '25 13:01 kanru

@mheon FYI, another case of duplicated nftables rules causing trouble.

Luap99 avatar Jan 10 '25 17:01 Luap99

I have had the same issue a few times while updating systems from Fedora CoreOS 41.20250117.3.0 to 41.20250130.3.0, but it only appears once after the update and then never again (a reboot fixes it, at least so far), which is very strange because we do not persist most of the system state. For us, not only was inbound access no longer possible ("no route to host"), but the DNS lookup for that container's name also returned two IPs: a stale one from the old container and a new working one. It seems a container that had been stopped and removed was not cleaned up properly, because its ID was still present in /run/containers/networks/aardvark-dns/<our-network>. Removing it made connectivity between containers work again. The container's internal IP worked, also from the host, so perhaps the forwarding rules that listen for incoming traffic on the host were still forwarding to the old container's IP, which would explain the "no route to host" error message. Sorry, I can't really put more logs here because I no longer have access to them.
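
For anyone seeing the same DNS symptom, the stale entry can be checked like this (a rough sketch from memory; <our-network> stands for the name of the affected network):

cat /run/containers/networks/aardvark-dns/<our-network>
# the ID of the already-removed container still showed up in here;
# deleting that line restored container-to-container connectivity for us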

profiluefter avatar Mar 11 '25 18:03 profiluefter

I am also experiencing this issue, caused by duplicated DNAT rules, running 1.14.0-2 on Debian. I too was publishing ports for both IPv4 and IPv6 (-p 9117:9117 -p [::]:9117:9117). Removing the separate IPv6 publish appears to resolve the issue for me as well.

Mar 31 00:06:09 podman podman[57933]: internal:0:0-0: Error: Could not process rule: No such file or directory
Mar 31 00:06:09 podman podman[57933]: internal:0:0-0: Error: Could not process rule: No such file or directory
Mar 31 00:06:09 podman podman[57933]: internal:0:0-0: Error: Could not process rule: No such file or directory
Mar 31 00:06:09 podman podman[57933]: internal:0:0-0: Error: Could not process rule: No such file or directory
Mar 31 00:06:09 podman podman[57933]: internal:0:0-0: Error: Could not process rule: No such file or directory
Mar 31 00:06:09 podman podman[57933]: internal:0:0-0: Error: Could not process rule: No such file or directory
Mar 31 00:06:10 podman podman[57795]: Error: cleaning up container 2c3ee6a5dd9e8deec983d4d9b5c3a4418d225f2c391ec94aabec83f2bd7f501e: removing container 2c3ee6a5dd9e8deec983d4d9b5c3a4418d225f2c391ec94aabec83f2bd7f501e network: netavark: nftables error: nft did not return successfully while applying ruleset
Mar 31 00:06:10 podman systemd[1]: container-jackett.service: Control process exited, code=exited, status=125/n/a

Here's the state after one container restart with the IPv6 publish still present and another restart with the IPv6 publish removed:

# nft list table inet netavark | grep 9117
        tcp dport 9117 jump nv_2f259bab_10_88_0_0_nm16_dnat
        tcp dport 9117 jump nv_2f259bab_fd1a-e3a5-837b-b993--_nm64_dnat
        tcp dport 9117 dnat ip to 10.88.0.11:9117
        tcp dport 9117 dnat ip to 10.88.0.15:9117
        ip saddr 10.88.0.0/16 tcp dport 9117 jump NETAVARK-HOSTPORT-SETMARK
        ip saddr 127.0.0.1 tcp dport 9117 jump NETAVARK-HOSTPORT-SETMARK
        tcp dport 9117 dnat ip to 10.88.0.18:9117
        tcp dport 9117 dnat ip6 to [fd1a:e3a5:837b:b993::b]:9117
        tcp dport 9117 dnat ip6 to [fd1a:e3a5:837b:b993::b]:9117
        tcp dport 9117 dnat ip6 to [fd1a:e3a5:837b:b993::f]:9117
        tcp dport 9117 dnat ip6 to [fd1a:e3a5:837b:b993::f]:9117
        ip6 saddr fd1a:e3a5:837b:b993::/64 tcp dport 9117 jump NETAVARK-HOSTPORT-SETMARK
        tcp dport 9117 dnat ip6 to [fd1a:e3a5:837b:b993::12]:9117
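
Exact duplicates in that output can be surfaced with sort/uniq (a generic check, nothing specific to this host):

# nft list table inet netavark | grep 9117 | sort | uniq -d
        tcp dport 9117 dnat ip6 to [fd1a:e3a5:837b:b993::b]:9117
        tcp dport 9117 dnat ip6 to [fd1a:e3a5:837b:b993::f]:9117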

llamaonaskateboard avatar Mar 31 '25 10:03 llamaonaskateboard

Hit the duplicated dnats too here when restarting the network (podman + netavark + nftables on openSUSE OS Micro).

Bischoff avatar Apr 02 '25 07:04 Bischoff

This issue is trivially reproducible and affects systems with or without firewalld. It still affects netavark 1.15.

/usr/local/bin/test-netavark-service:

#!/bin/bash

set -e
set -x

podman run --rm -it --name test-netavark --network ctr-network -p 9999:9999 registry.fedoraproject.org/fedora-toolbox:42 sleep +Inf

exit 0

/etc/systemd/system/test-netavark.service:

[Unit]
Description=netavark test

[Service]
ExecStartPre=-podman stop test-netavark
ExecStartPre=-podman rm test-netavark
ExecStart=/usr/local/bin/test-netavark-service
ExecStop=podman kill test-netavark
ExecStopPost=-podman rm test-netavark

At the CLI:

systemctl daemon-reload

systemctl start test-netavark.service
nft list table inet netavark | grep 9999
[you should see one IPv4 dnat entry for port 9999]

systemctl restart test-netavark.service
nft list table inet netavark | grep 9999

systemctl restart test-netavark.service
nft list table inet netavark | grep 9999
...

Do that enough times and you will see duplicate DNAT entries from previous IP bindings of the container. Anecdotally, podman network reload --all can clear out some of the duplicate entries.
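
The loop I use to check for the leftovers looks roughly like this (nothing beyond the commands already shown above, just condensed):

for i in 1 2 3; do systemctl restart test-netavark.service; done
nft list table inet netavark | grep -c 'dport 9999 dnat'
podman network reload --all
nft list table inet netavark | grep -c 'dport 9999 dnat'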

carlroth avatar Jun 27 '25 16:06 carlroth

@carlroth That is simply not a "supported" systemd unit. By default systemd will just kill our cleanup process and likely track the wrong main PID. In this case network cleanup cannot work reliably.

We recommend using quadlets (https://docs.podman.io/en/latest/markdown/podman-systemd.unit.5.html), which generate a well-tested service.

Luap99 avatar Jun 27 '25 16:06 Luap99

This does seem to work better:

/etc/containers/systemd/test-netavark-quadlet.container:

[Unit]
Description=Test netavark via a Quadlet

[Container]
Image=registry.fedoraproject.org/fedora-toolbox:42
Exec=sleep +Inf
Network=test-netavark-quadlet.network
PublishPort=9999:9999

/etc/containers/systemd/test-netavark-quadlet.network:

[Network]
IPv6=true
NetworkName=test-netavark-quadlet

How are quadlets supposed to work with computed container options (ones that are only available after generators run)?

carlroth avatar Jun 27 '25 17:06 carlroth

Generators are run every time on systemctl daemon-reload, so in that regard it would be the same as if you created the services directly. systemd template units are also supported, but templates are somewhat limited.

In any case this is not related to the issue here. If you have specific quadlet questions, ask them on the podman repo: https://github.com/containers/podman/discussions

Luap99 avatar Jun 30 '25 09:06 Luap99