
Podman and Docker IPv6 Compatibility Differs

Open shawnweeks opened this issue 4 years ago • 49 comments

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

Steps to reproduce the issue:

  1. Install CentOS 8 with IPv6 Enabled

  2. Install the latest Podman

  3. Attempt to start Keycloak instance

podman run --rm -p 8080:8080 -e KEYCLOAK_USER=admin -e KEYCLOAK_PASSWORD=admin jboss/keycloak

Describe the results you received:

15:04:20,205 ERROR [org.jboss.msc.service.fail] (MSC service thread 1-4) MSC000001: Failed to start service org.wildfly.network.interface.private: org.jboss.msc.service.StartException in service org.wildfly.network.interface.private: WFLYSRV0082: failed to resolve interface private
	at org.jboss.as.server.services.net.NetworkInterfaceService.start(NetworkInterfaceService.java:96)
	at org.jboss.msc.service.ServiceControllerImpl$StartTask.startService(ServiceControllerImpl.java:1738)
	at org.jboss.msc.service.ServiceControllerImpl$StartTask.execute(ServiceControllerImpl.java:1700)
	at org.jboss.msc.service.ServiceControllerImpl$ControllerTask.run(ServiceControllerImpl.java:1558)
	at org.jboss.threads.ContextClassLoaderSavingRunnable.run(ContextClassLoaderSavingRunnable.java:35)
	at org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1982)
	at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1486)
	at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1363)
	at java.lang.Thread.run(Thread.java:748)

15:04:20,234 ERROR [org.jboss.as.controller.management-operation] (Controller Boot Thread) WFLYCTL0013: Operation ("add") failed - address: ([("interface" => "private")]) - failure description: {"WFLYCTL0080: Failed services" => {"org.wildfly.network.interface.private" => "WFLYSRV0082: failed to resolve interface private"}}

Describe the results you expected:

Additional information you deem important (e.g. issue happens only occasionally):

Output of podman version:

[centos@cloudctl1 ~]$ podman --version
podman version 1.8.0
[centos@cloudctl1 ~]$ podman info --debug
debug:
  compiler: gc
  git commit: ""
  go version: go1.12.12
  podman version: 1.8.0
host:
  BuildahVersion: 1.13.1
  CgroupVersion: v1
  Conmon:
    package: conmon-2.0.6-1.module_el8.1.0+272+3e64ee36.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.6, commit: 7a4f0dd7b20a3d4bf9ef3e5cbfac05606b08eac0'
  Distribution:
    distribution: '"centos"'
    version: "8"
  IDMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  MemFree: 64605507584
  MemTotal: 67204878336
  OCIRuntime:
    name: runc
    package: runc-1.0.0-64.rc9.module_el8.1.0+272+3e64ee36.x86_64
    path: /usr/bin/runc
    version: 'runc version spec: 1.0.1-dev'
  SwapFree: 33806086144
  SwapTotal: 33806086144
  arch: amd64
  cpus: 24
  eventlogger: journald
  hostname: cloudctl1.dev.example.com
  kernel: 4.18.0-147.5.1.el8_1.x86_64
  os: linux
  rootless: true
  slirp4netns:
    Executable: /usr/bin/slirp4netns
    Package: slirp4netns-0.4.2-2.git21fdece.module_el8.1.0+272+3e64ee36.x86_64
    Version: |-
      slirp4netns version 0.4.2+dev
      commit: 21fdece2737dc24ffa3f01a341b8a6854f8b13b4
  uptime: 20m 14.25s
registries:
  search:
  - registry.access.redhat.com
  - registry.fedoraproject.org
  - registry.centos.org
  - docker.io
store:
  ConfigFile: /home/centos/.config/containers/storage.conf
  ContainerStore:
    number: 1
  GraphDriverName: overlay
  GraphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-0.7.2-1.module_el8.1.0+272+3e64ee36.x86_64
      Version: |-
        fuse-overlayfs: version 0.7.2
        FUSE library version 3.2.1
        using FUSE kernel interface version 7.26
  GraphRoot: /home/centos/.local/share/containers/storage
  GraphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  ImageStore:
    number: 1
  RunRoot: /run/user/1000
  VolumePath: /home/centos/.local/share/containers/storage/volumes

Package info (e.g. output of rpm -q podman or apt list podman):

podman-1.8.0-3.1.el8.x86_64

Additional environment details (AWS, VirtualBox, physical, etc.):

This works fine on Ubuntu 18.04 with docker-io version 18.09.7 (build 2d0083d) and IPv6 enabled.

shawnweeks avatar Mar 05 '20 15:03 shawnweeks

https://unix.stackexchange.com/questions/566812/keycloak-failing-to-start-with-failed-to-resolve-interface-private/

shawnweeks avatar Mar 05 '20 15:03 shawnweeks

Can you provide more details on the failure here? I don't think any of our developers are particularly familiar with Tomcat, so more details on what's going wrong would make this much easier.

Is Tomcat trying to bind to an IPv6 address inside the container?

mheon avatar Mar 05 '20 15:03 mheon

It's WildFly, not Tomcat, but I'm not exactly sure what the failure is. If I disable IPv6 on CentOS 8 in GRUB, it goes back to working. I'm trying to narrow it down now.

shawnweeks avatar Mar 05 '20 15:03 shawnweeks

I realized I wasn't testing against the same versions. The issue seems to be that Podman presents an IPv6 interface inside the container that might not actually allow binding, while Docker does not present one at all.

This is Docker-based:

[root@b21d2f5618f4 /]# cat /proc/net/if_inet6
[root@b21d2f5618f4 /]#

This is Podman-based:

[root@05498918caa8 /]# cat /proc/net/if_inet6
00000000000000000000000000000001 01 80 10 80       lo
fd00000000000000200cd3fffebd9434 02 40 00 00     tap0
fe80000000000000200cd3fffebd9434 02 40 20 80     tap0
[root@05498918caa8 /]#
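For readers less familiar with this file: the columns in /proc/net/if_inet6 are the address (hex), interface index, prefix length, scope, and flags, followed by the device name. Roughly annotated, the Podman output above decodes to:

::1                         lo    loopback
fd00::200c:d3ff:febd:9434   tap0  unique-local (ULA) address on the slirp4netns tap device (this is a rootless container)
fe80::200c:d3ff:febd:9434   tap0  link-local address, added automatically by the kernel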

shawnweeks avatar Mar 05 '20 15:03 shawnweeks

@shawnweeks @mheon Did we come to a conclusion that what Podman is doing is wrong? Or just different than Docker?

rhatdan avatar Apr 06 '20 18:04 rhatdan

I'm not sure. Docker disables IPv6 by default, while in Podman it's enabled and doesn't appear to work with things like the WildFly app server. I'm not sure IPv6 is actually broken; it might just be something about how WildFly tries to use it. WildFly runs fine with IPv6 if you run it directly on CentOS or RHEL.

shawnweeks avatar Apr 06 '20 19:04 shawnweeks

Did you open this as an issue with WildFly?

rhatdan avatar Apr 06 '20 20:04 rhatdan

I've posted a question in their forum, but I suspect they're going to ask why it's their issue, since it works fine on Docker and on bare metal. Docker has taken the approach of disabling IPv6 inside of containers, but bare metal with IPv6 enabled works fine.

shawnweeks avatar Apr 06 '20 22:04 shawnweeks

@mccv1r0 WDYT?

rhatdan avatar Apr 07 '20 12:04 rhatdan

@shawnweeks Could you give me a link to the WildFly issue, so I can watch it and participate?

rhatdan avatar Apr 07 '20 12:04 rhatdan

It looks like the WildFly image needs work to support IPv6.

Docker doesn't enable IPv6 unless you set ipv6: true in /etc/docker/daemon.json. @shawnweeks, are you looking for something similar for Podman?
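For reference, the Docker side of that looks roughly like this, followed by a daemon restart (the fixed-cidr-v6 subnet here is just a placeholder ULA; Docker also needs a v6 subnet to hand addresses out from):

{
  "ipv6": true,
  "fixed-cidr-v6": "fd00:dead:beef::/64"
}

Without this, Docker containers get no IPv6 addresses at all, which matches the empty /proc/net/if_inet6 shown earlier in the thread.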

mccv1r0 avatar Apr 07 '20 13:04 mccv1r0

FWIW, I've seen issues with other workloads failing to resolve hostnames after I enabled IPv6 on the host and my network, because the container ended up with only an IPv6 nameserver (and IPv6 network access appears to be "not there yet" with Podman). I'm not sure this is the case here, but it's something to check: look at the contents of /etc/resolv.conf inside the container. The workaround I came up with is passing --dns 1.1.1.1 to podman to override the host's (IPv6-only) nameserver.
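Roughly, the check and the workaround look like this (alpine is used here only as a convenient test image; 1.1.1.1 is simply the public IPv4 resolver I reached for):

# see which nameserver(s) the container actually got
podman run --rm alpine cat /etc/resolv.conf

# force an IPv4 resolver instead
podman run --rm --dns 1.1.1.1 alpine nslookup example.com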

aleks-mariusz avatar May 07 '20 13:05 aleks-mariusz

@mheon PTAL

rhatdan avatar Jun 09 '20 19:06 rhatdan

I think I am facing the same IPv6 issue; on Docker I was able to overcome it by keeping ipv6: false in its config. With Podman, however, I am unable to build Alpine containers because apk fetch gets stuck talking to Alpine's own servers (due to broken IPv6 inside the containers).

Using alternative DNS servers did not work for me, as remote servers can also return IPv6 addresses (and I may need local DNS servers for some cases too).

It may be worth mentioning https://stackoverflow.com/questions/30750271/disable-ip-v6-in-docker-container/41497555#41497555, which is a workaround I tested for Docker and that worked for "run" (but not "build").

That is a very annoying issue, because IPv6 works fine on the host, but its presence is causing build failures inside containers.

Still, I was able to find an ugly hack for Alpine: --add-host dl-cdn.alpinelinux.org:1.2.3.4 to force an IPv4 address for it.
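Spelling that hack out as a sketch (the address is whatever the A record happens to resolve to, not the literal 1.2.3.4 above):

# look up an IPv4 address for the CDN explicitly...
ADDR=$(dig +short A dl-cdn.alpinelinux.org | head -n1)

# ...and pin it for the build so apk never touches the broken IPv6 path
podman build --add-host dl-cdn.alpinelinux.org:$ADDR .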

ssbarnea avatar Jun 16 '20 09:06 ssbarnea

I was having the same issue ... adding -e BIND=127.0.0.1 worked for me.
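In other words, the command from the original report would become something like:

podman run --rm -p 8080:8080 -e BIND=127.0.0.1 -e KEYCLOAK_USER=admin -e KEYCLOAK_PASSWORD=admin jboss/keycloak

This makes the image's start script bind WildFly to the IPv4 loopback address instead of resolving an interface itself; per the comment above, that was enough here.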

salifou avatar Jul 10 '20 18:07 salifou

I think it's important to note that the default configs (using the Ubuntu package here) are a bit contradictory.

There's no IPv6 config in a "stock" /etc/cni/net.d/87-podman-bridge.conflist, yet containers are created with an IPv6 link-local address for me.

So, to avoid issues like this: if Podman ships with IPv6 disabled, it shouldn't be creating those IPv6 addresses; but if IPv6 is supposed to be enabled by default, then it needs to come with a config which explicitly enables IPv6 addressing and internet connectivity, so that disabling IPv6 becomes a matter of removing the IPv6 lines from the config. The same goes for enabling/disabling IPv4.
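For context, "explicitly enabling IPv6" in that conflist would boil down to giving the host-local IPAM a v6 range next to the v4 one, something like this sketch (fd00:cafe::/64 is a made-up ULA, with all of the NAT caveats raised in the next comment):

"ipam": {
    "type": "host-local",
    "routes": [{ "dst": "0.0.0.0/0" }, { "dst": "::/0" }],
    "ranges": [
        [{ "subnet": "10.88.0.0/16" }],
        [{ "subnet": "fd00:cafe::/64" }]
    ]
}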

yangm97 avatar Aug 12 '20 01:08 yangm97

We've gone through this before, and came to the conclusion that there is no way of enabling IPv6 by default in the config. We'd need users to provide a routable IPv6 subnet for us to use internally, since NAT for v6 is strongly frowned upon - so we can't ship a default config that works everywhere.

At the same time, we have no easy way of disabling link-local addresses. CNI doesn't really expose knobs to tweak things at that level - we can control what CNI creates, but link-local addresses are added by the kernel automatically and we don't really have a way of disabling that.

mheon avatar Aug 12 '20 02:08 mheon

There's also a point to be made that many people got used to the eggshell security that v4 NAT provides their containers, so globally addressable v6 containers would open up some passwordless Redis instances to the internet... yeah, I feel it.

But on the other hand, as much as I hate to say it, once the v4 internet comes to an end there is no other possible default configuration that doesn't include NAT for v6, since we can't just assume a developer machine is going to receive a routable subnet, and so on.

Correct me if I'm wrong, but it seems like we're all kind of postponing the inevitable.

yangm97 avatar Aug 12 '20 12:08 yangm97

Out of curiosity, how is Docker disabling the IPv6 interface inside of the containers? I thought it used similar Linux features. My base issue was that Podman presented an IPv6 interface inside of the container that WildFly couldn't bind to, and it sounds like Podman doesn't even enable IPv6 by default, so the interface shouldn't even show up.

shawnweeks avatar Aug 13 '20 12:08 shawnweeks

It's definitely a kernel feature that's available to them (because they wrote their own networking library) but not so much to us (because we're using an existing one, CNI). We could, of course, attempt to add this to CNI and contribute that change upstream, but we haven't had much luck in that area before.
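Whatever Docker does internally, one per-container workaround that doesn't need any CNI changes is to switch the interface off with a namespaced sysctl (a sketch; it assumes the runtime allows net.ipv6.* sysctls, which it normally does when the container has its own network namespace):

podman run --rm --sysctl net.ipv6.conf.all.disable_ipv6=1 -p 8080:8080 \
    -e KEYCLOAK_USER=admin -e KEYCLOAK_PASSWORD=admin jboss/keycloak

With that set, /proc/net/if_inet6 inside the container should be empty, matching the Docker behaviour shown earlier.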

mheon avatar Aug 13 '20 13:08 mheon

What about the case where there is a valid IPv6 prefix available? This would, of course, put the responsibility for installing and configuring $YOUR_FAV_FIREWALL on the user.

Sounds like this is being precluded by policy ATM.

dithotxgh avatar Aug 22 '20 20:08 dithotxgh

@mccv1r0 PTAL

rhatdan avatar Sep 10 '20 21:09 rhatdan

@mheon @baude another networking issue.

rhatdan avatar Dec 24 '20 12:12 rhatdan

@baude @mheon What should we do with this one?

rhatdan avatar Jan 27 '21 13:01 rhatdan

Podman and IPv6 is part of a larger discussion we need to have about our approach to networking going forward.

mheon avatar Jan 27 '21 15:01 mheon
