
Minikube to host communication not working on Fedora 37

mnk opened this issue 2 years ago

What Happened?

There seems to be a difference in minikube iptables rules when comparing a fully updated Fedora 36 and Fedora 37 system. On Fedora 36:

$ sudo iptables -t nat -S|grep -e '--to-destination 127.0.0.11'
-A DOCKER_OUTPUT -d 192.168.49.1/32 -p tcp -m tcp --dport 53 -j DNAT --to-destination 127.0.0.11:39397
-A DOCKER_OUTPUT -d 192.168.49.1/32 -p udp -m udp --dport 53 -j DNAT --to-destination 127.0.0.11:34196

On Fedora 37:

$ sudo iptables -t nat -S|grep -e '--to-destination 127.0.0.11'
-A DOCKER_OUTPUT -d 192.168.49.1/32 -p tcp -j DNAT --to-destination 127.0.0.11:46739
-A DOCKER_OUTPUT -d 192.168.49.1/32 -p udp -j DNAT --to-destination 127.0.0.11:37392

The missing --dport 53 condition on the destination NAT breaks all non-DNS communication between host and minikube. What might be causing this difference?
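
For anyone trying to tell the two states apart quickly, the difference comes down to whether the saved DNAT rule text carries a port match. A minimal sketch (the sample rules are copied from the outputs above; the `classify` helper is just for illustration):

```shell
# Classify a saved DNAT rule by whether it restricts traffic to port 53.
# Sample lines are taken from the Fedora 36 (restricted) and
# Fedora 37 (unrestricted) outputs above.
f36='-A DOCKER_OUTPUT -d 192.168.49.1/32 -p tcp -m tcp --dport 53 -j DNAT --to-destination 127.0.0.11:39397'
f37='-A DOCKER_OUTPUT -d 192.168.49.1/32 -p tcp -j DNAT --to-destination 127.0.0.11:46739'

classify() {
  case "$1" in
    *'--dport 53'*) echo 'restricted to DNS' ;;
    *)              echo 'catches all traffic' ;;
  esac
}

classify "$f36"   # restricted to DNS
classify "$f37"   # catches all traffic
```

When the rule "catches all traffic", every TCP/UDP packet from the container to the gateway IP gets DNATed to the embedded DNS resolver, which is why non-DNS host communication breaks.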

Attach the log file

log.txt

Operating System

Redhat/Fedora

Driver

Docker

mnk avatar Jan 02 '23 15:01 mnk

Full output of minikube ssh sudo iptables-save: iptables-save-f37.txt iptables-save-f36.txt

mnk avatar Jan 04 '23 22:01 mnk

hmm, do we know if it was docker that dropped the port?

I've seen a recent report in kubernetes about some weird iptables rules being mutated on CentOS

aojea avatar Jan 04 '23 23:01 aojea

@mnk thanks for reporting this issue

there are a couple of improvements i'm currently working on in draft pr #15463

could you please try with https://storage.googleapis.com/minikube-builds/15463/minikube-linux-amd64 and let us know if it works for you

if not, could you pull that pr and run:

make integration -e TEST_ARGS="-minikube-start-args='--driver=docker --container-runtime=docker --alsologtostderr -v=7' -test.run TestNetworkPlugins --cleanup=true"

then share the whole output you get

i've tried to replicate your setup (ie, fresh fedora 37 install [in kvm] + docker) and all the tests above passed for me, so i'm curious to know if it would work for you

prezha avatar Jan 05 '23 20:01 prezha

@prezha , I tried the minikube build you linked to, but I still get the same result - no --dport 53 condition on the DNAT rule. Do I need to specify a base image or just do a minikube-linux-amd64 start?

Do you get the --dport 53 condition when testing with your branch?

My use-case can be tested by starting sshd on the host and then doing minikube ssh ssh $(id -un)@192.168.49.1. This works fine on Fedora 36 but not on Fedora 37.

mnk avatar Jan 06 '23 02:01 mnk

@aojea as mentioned on the PR I also see no dport currently, on gLinux (~debian):

$ docker run -d --entrypoint=sleep --network=kind --privileged --name=aaa kindest/node:v1.25.3 infinity
07cc1460e1bf62a936e33775efdda0fbce577634eb06b07dcdd267bd855f9248

$ docker exec --privileged aaa iptables-save
# Generated by iptables-save v1.8.7 on Wed Jan  4 22:07:38 2023
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:DOCKER_OUTPUT - [0:0]
:DOCKER_POSTROUTING - [0:0]
-A OUTPUT -d 127.0.0.11/32 -j DOCKER_OUTPUT
-A POSTROUTING -d 127.0.0.11/32 -j DOCKER_POSTROUTING
-A DOCKER_OUTPUT -d 127.0.0.11/32 -p tcp -j DNAT --to-destination 127.0.0.11:44029
-A DOCKER_OUTPUT -d 127.0.0.11/32 -p udp -j DNAT --to-destination 127.0.0.11:33690
-A DOCKER_POSTROUTING -s 127.0.0.11/32 -p tcp -j SNAT --to-source :53
-A DOCKER_POSTROUTING -s 127.0.0.11/32 -p udp -j SNAT --to-source :53
COMMIT

(NOTE: this skips the entrypoint logic etc, the point is to debug purely what docker is doing on its own with the embedded DNS rules)

BenTheElder avatar Jan 06 '23 20:01 BenTheElder

Docker might have actually dropped the dport themselves at some point, given that they probably don't expect other traffic on 127.0.0.11 typically, but in that case I'd argue this is a bug on their end and we should fix it there.

I did a bit of digging and haven't turned up anything though, and in https://github.com/kubernetes/minikube/pull/15578#discussion_r1061944416 it appears that docker, containerd are identical but iptables versions are different.

BenTheElder avatar Jan 06 '23 20:01 BenTheElder

@mnk i was wrong - haven't read your initial problem statement carefully, so was jumping to a conclusion; sorry about that!

i've looked at it a bit and i think that the problem is in using iptables-nft (which i think is default for fedora37) instead of iptables-legacy (i also remember reading in one of the kubernetes issues recently that nft is not [yet] supported, but i don't have a reference at hand atm)

i suggest you try with iptables-legacy instead - here's what i did:

$ sudo dnf install iptables-legacy

$ sudo update-alternatives --config iptables => select '/usr/sbin/iptables-legacy'

$ iptables --version
iptables v1.8.8 (legacy)

$ sudo reboot

$ minikube start
...

$ minikube ssh -- sudo iptables-save
# Generated by iptables-save v1.8.4 on Fri Jan  6 22:37:48 2023
*nat
...
-A DOCKER_OUTPUT -d 192.168.49.1/32 -p tcp -m tcp --dport 53 -j DNAT --to-destination 127.0.0.11:41653
-A DOCKER_OUTPUT -d 192.168.49.1/32 -p udp -m udp --dport 53 -j DNAT --to-destination 127.0.0.11:53598
...

$ minikube ssh ssh $(id -un)@192.168.49.1
The authenticity of host '192.168.49.1 (192.168.49.1)' can't be established.
ECDSA key fingerprint is SHA256:Y8jQ23KJ8Oy+H2e9eXDpttirqcg42g7HVtg4ZFjVgHM.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '192.168.49.1' (ECDSA) to the list of known hosts.
[email protected]'s password: 
Web console: https://localhost:9090/ or https://192.168.122.198:9090/

Last login: Fri Jan  6 22:35:54 2023 from 192.168.122.1

prezha avatar Jan 06 '23 22:01 prezha

btw, i've also compared the docker versions that are installed on a fresh ubuntu 20.04.5 (where iptables-legacy is default and this is working) and fedora 37 (where iptables-nft is default and apparently is not working), and they use identical versions/commits:

ubuntu 20.04.5:

prezha@minikube-test:~$ docker version
Client: Docker Engine - Community
 Version:           20.10.22
 API version:       1.41
 Go version:        go1.18.9
 Git commit:        3a2c30b
 Built:             Thu Dec 15 22:28:08 2022
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.22
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.18.9
  Git commit:       42c8b31
  Built:            Thu Dec 15 22:25:58 2022
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.14
  GitCommit:        9ba4b250366a5ddde94bb7c9d1def331423aa323
 runc:
  Version:          1.1.4
  GitCommit:        v1.1.4-0-g5fd4c4d
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

fedora37:

[prezha@localhost ~]$ docker version
Client: Docker Engine - Community
 Version:           20.10.22
 API version:       1.41
 Go version:        go1.18.9
 Git commit:        3a2c30b
 Built:             Thu Dec 15 22:28:45 2022
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.22
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.18.9
  Git commit:       42c8b31
  Built:            Thu Dec 15 22:26:25 2022
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.14
  GitCommit:        9ba4b250366a5ddde94bb7c9d1def331423aa323
 runc:
  Version:          1.1.4
  GitCommit:        v1.1.4-0-g5fd4c4d
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

prezha avatar Jan 06 '23 22:01 prezha

looks like there were some efforts to support automatic iptables "mode" detection (https://github.com/kubernetes-sigs/iptables-wrappers/pull/3), but it seems that it's not working correctly in this case and this comment from Tim Hockin is a bit old but perhaps still relevant

prezha avatar Jan 06 '23 23:01 prezha

looks like minikube user(s) reported an identical problem earlier: https://github.com/kubernetes/minikube/issues/14631#issuecomment-1344987605 where the original problem also refers to iptables-nft

so, if this switch to iptables-legacy is a working solution for @mnk, perhaps we could add detection in minikube so that if we see iptables-nft, we warn the user that "things might not work as expected"
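
Such a warning could key off the mode string that `iptables --version` prints. A hedged sketch (the version strings are sample outputs; the warning text is illustrative, not actual minikube code):

```shell
# Warn when the host iptables binary is in nft mode, based on the mode
# suffix in `iptables --version` output, e.g. "iptables v1.8.8 (nf_tables)"
# or "iptables v1.8.8 (legacy)". Hypothetical check, not minikube code.
warn_if_nft() {
  case "$1" in
    *nf_tables*) echo 'warning: iptables-nft detected on host; networking may not work as expected' ;;
    *)           echo 'ok' ;;
  esac
}

warn_if_nft 'iptables v1.8.8 (nf_tables)'
warn_if_nft 'iptables v1.8.8 (legacy)'   # ok
```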

prezha avatar Jan 06 '23 23:01 prezha

looks like there were some efforts to support automatic iptables "mode" detection (https://github.com/kubernetes-sigs/iptables-wrappers/pull/3), but it seems that it's not working correctly in this case and this comment from Tim Hockin is a bit old but perhaps still relevant

That's not quite related. That change in Kubernetes is related to automatic mode detection in the kube-proxy image, in a "normal" environment by looking for rules setup by kubelet on the host. In our case we instead detect based on docker injecting rules for the embedded DNS resolver using the host iptables legacy or nf_tables.

That tweet predates the detection logic entirely, which is itself a workaround for the problem of there not being a stable interface and distros switching between the two binaries / backends.

The upstream KIND entrypoint has the same detection logic as kube-proxy (prior to the trick looking at kubelet generated rules specifically).
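
That detection style essentially counts which backend already holds rules and follows the majority. A simplified, hypothetical sketch of the heuristic (real entrypoints also special-case kubelet-generated rules; the counts here stand in for `iptables-legacy-save | grep -c '^-'` and the nft equivalent):

```shell
# Simplified backend-pick heuristic: whichever backend already has more
# rules installed wins; ties fall back to nft. Not the actual kube-proxy
# or KIND entrypoint code, just the shape of the idea.
pick_backend() {
  legacy_count=$1
  nft_count=$2
  if [ "$legacy_count" -gt "$nft_count" ]; then
    echo legacy
  else
    echo nft
  fi
}

pick_backend 12 0   # legacy
pick_backend 0 8    # nft
```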

My host really is using nf_tables 1.8.8 and I don't see --dport using either in the "node". So the issue is not mismatched nf_tables vs legacy, the problem is 1.8.7 vs 1.8.8 nf_tables.

And now that I've typed that, https://github.com/kubernetes/kubernetes/issues/112477 is paging back into memory 🙃

$ minikube ssh -- sudo iptables-save
# Generated by iptables-save v1.8.4 on Fri Jan  6 22:37:48 2023

1.8.4 is really old, so that's a different problem for minikube specifically.

In both cases, 1.8.8 on the host is currently a problem. kube-proxy in Kubernetes 1.26 is also on 1.8.7 so updating to 1.8.8 in kind/minikube is probably not sufficient yet.

Downgrading to 1.8.7 on the host is one workaround. Switching to legacy mode is another. Both work around the mismatched versions rather than legacy vs nf_tables detection.
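
Put together, a host is affected roughly when both conditions hold: version 1.8.8 and nf_tables mode. An illustrative check parsing a sample `iptables --version` line (the parsing is deliberately naive, not a definitive compatibility test):

```shell
# Flag the known-bad combination from this thread: iptables 1.8.8
# running in nf_tables mode. Parsing is illustrative; a real check
# should handle more version formats.
affected() {
  ver=${1#iptables v}; ver=${ver%% *}
  case "$1" in *nf_tables*) mode=nf_tables ;; *) mode=legacy ;; esac
  if [ "$ver" = '1.8.8' ] && [ "$mode" = 'nf_tables' ]; then
    echo yes
  else
    echo no
  fi
}

affected 'iptables v1.8.8 (nf_tables)'   # yes
affected 'iptables v1.8.7 (nf_tables)'   # no
affected 'iptables v1.8.8 (legacy)'      # no
```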

BenTheElder avatar Jan 07 '23 00:01 BenTheElder

This is also just one incompatibility between 1.8.8 and 1.8.7, we're going to have more problems when we do upgrade kubernetes/kind/... to > 1.8.7 (see above issue with --mark)

BenTheElder avatar Jan 07 '23 00:01 BenTheElder

thanks for sharing the details @BenTheElder ! that's an interesting conversation between Dan (kubernetes) and Phil (netfilter) so, iptables-nft v1.8.8 introduced a breaking change, and there are no plans to "fix" that, and the workaround atm is to:

  • stick with v1.8.7 (that's also used in kube-proxy) - both nft and legacy mode should work, or
  • use iptables (even v1.8.8) in legacy mode (that some linux distros still keep as default)

looks like the full transition from legacy to nft is going to be fun and not so quick

prezha avatar Jan 07 '23 14:01 prezha

Yes, as @BenTheElder mentions, this indeed seems to be another case of incompatibility between 1.8.8 and 1.8.7. Without involving minikube, the problem can be seen by:

$ docker network create --driver bridge test-net
$ docker run -it --privileged --network test-net fedora:37 bash
$ dnf install iptables-nft nftables
$ iptables-nft-save 
# Generated by iptables-nft-save v1.8.8 (nf_tables) on Sat Jan  7 17:44:56 2023
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:DOCKER_OUTPUT - [0:0]
:DOCKER_POSTROUTING - [0:0]
-A OUTPUT -d 127.0.0.11/32 -j DOCKER_OUTPUT
-A POSTROUTING -d 127.0.0.11/32 -j DOCKER_POSTROUTING
-A DOCKER_OUTPUT -d 127.0.0.11/32 -p tcp -m tcp --dport 53 -j DNAT --to-destination 127.0.0.11:41759
-A DOCKER_OUTPUT -d 127.0.0.11/32 -p udp -m udp --dport 53 -j DNAT --to-destination 127.0.0.11:43231
-A DOCKER_POSTROUTING -s 127.0.0.11/32 -p tcp -m tcp --sport 41759 -j SNAT --to-source :53
-A DOCKER_POSTROUTING -s 127.0.0.11/32 -p udp -m udp --sport 43231 -j SNAT --to-source :53
COMMIT
$ exit
$ docker run -it --privileged --network test-net fedora:36 bash
$ dnf install iptables-nft nftables
$ iptables-nft-save 
# Generated by iptables-nft-save v1.8.7 on Sat Jan  7 18:31:05 2023
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:DOCKER_OUTPUT - [0:0]
:DOCKER_POSTROUTING - [0:0]
-A OUTPUT -d 127.0.0.11/32 -j DOCKER_OUTPUT
-A POSTROUTING -d 127.0.0.11/32 -j DOCKER_POSTROUTING
-A DOCKER_OUTPUT -d 127.0.0.11/32 -p tcp -j DNAT --to-destination 127.0.0.11:38683
-A DOCKER_OUTPUT -d 127.0.0.11/32 -p udp -j DNAT --to-destination 127.0.0.11:57275
-A DOCKER_POSTROUTING -s 127.0.0.11/32 -p tcp -j SNAT --to-source :53
-A DOCKER_POSTROUTING -s 127.0.0.11/32 -p udp -j SNAT --to-source :53
COMMIT

Both 1.8.8 and 1.8.7 do, however, produce similar output from nft list ruleset:

table ip nat {
	chain DOCKER_OUTPUT {
		ip daddr 127.0.0.11 tcp dport 53 counter packets 0 bytes 0 dnat to 127.0.0.11:38683
		ip daddr 127.0.0.11 udp dport 53 counter packets 35 bytes 2730 dnat to 127.0.0.11:57275
	}
...

I guess that means that the docker rules are fine until minikube starts patching them with iptables-save | iptables-restore?

Would it be possible for minikube/kind to just remove the docker rules and then create the needed rules from scratch?

mnk avatar Jan 08 '23 05:01 mnk

This works fine (on my VM) 😄 https://github.com/kubernetes-sigs/kind/pull/3059

aojea avatar Jan 08 '23 12:01 aojea

I guess that means that the docker rules are fine until minikube starts patching them with iptables-save | iptables-restore? Would it be possible for minikube/kind to just remove the docker rules and then create the needed rules from scratch?

So KIND doesn't use save+restore itself, but kube-proxy in Kubernetes works this way for good reasons (reconciling a large set of rules) and there's a third version of iptables in the kube-proxy image. I suspect the same for minikube.

On your host with minikube it's 1.8.8 nf_tables, 1.8.4 ?, and then 1.8.7 ? (kube-proxy in 1.25). In KIND it will be 1.8.8 nf_tables (host), 1.8.7 nf_tables (node), 1.8.7 nf_tables (kube-proxy). There's actually a fourth for CNI ... but that's generally kube-proxy matching more or less.

Discussing with @aojea and https://github.com/kubernetes-sigs/kind/pull/3059 how we might work around this.

One thought is multi-version selecting the binaries and attempting to detect what the host is using.

Another is the workaround in https://github.com/kubernetes-sigs/kind/pull/3059 combined with making sure at least kube-proxy + kindnetd + kind node match. Which is a trickier proposition for additional CNIs minikube may support.

BenTheElder avatar Jan 09 '23 19:01 BenTheElder

So KIND doesn't use save+restore itself, but kube-proxy in Kubernetes works this way for good reasons (reconciling a large set of rules) and there's a third version of iptables in the kube-proxy image. I suspect the same for minikube.

Closing the loop: that's completely backwards :-)

kind does iptables-save | sed | iptables-restore to modify the docker dns rules, and that part is included in the kicbase image.
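
A toy illustration of that save | sed | restore rewrite, retargeting docker's embedded-DNS rule from 127.0.0.11 to the bridge gateway (the sed expression and addresses are samples from this thread, not kind's actual entrypoint program):

```shell
# Rewrite a saved docker embedded-DNS rule so the match targets the
# bridge gateway instead of 127.0.0.11, the way kind's entrypoint
# patches rules between iptables-save and iptables-restore.
# Illustrative only: addresses and the sed pattern are examples.
rule='-A DOCKER_OUTPUT -d 127.0.0.11/32 -p tcp -j DNAT --to-destination 127.0.0.11:44029'
printf '%s\n' "$rule" | sed 's/-d 127\.0\.0\.11\/32/-d 192.168.49.1\/32/'
```

The pipeline never touches the `--to-destination` address, only the `-d` match, which mirrors how the real rewrite keeps the resolver endpoint intact.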

kube-proxy avoids save | mutate | restore

BenTheElder avatar Apr 07 '23 16:04 BenTheElder

https://github.com/kubernetes-sigs/kind/issues/3054 tracks the KIND fix which wound up compromising on ~https://github.com/kubernetes/minikube/pull/15578

BenTheElder avatar Apr 07 '23 16:04 BenTheElder

https://github.com/kubernetes/minikube/issues/14631#issuecomment-1292420728 this one may help

linux019 avatar Jun 05 '23 15:06 linux019

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Jan 21 '24 22:01 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Feb 20 '24 22:02 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-triage-robot avatar Mar 21 '24 23:03 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Mar 21 '24 23:03 k8s-ci-robot