vault-consul-on-kube

UDP ping not working "network may be misconfigured and not allowing bidirectional UDP"

Open sirhopcount opened this issue 7 years ago • 10 comments

I have successfully deployed the consul part of your demo but I keep seeing the following error in logs of the consul containers:

2017/03/29 20:12:32 [WARN] memberlist: Was able to reach <HOSTNAME> via TCP but not UDP, network may be misconfigured and not allowing bidirectional UDP

Is this expected with this configuration?

I assume serflan-udp (8301) is used for this, which is defined in both the service and the deployment.

Others who had this issue (not k8s specific but docker related) solved it by using the --net=host option but that seems more of a hack to me.

Do you have any idea as to what could cause this error or how to solve it?
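
For anyone who wants to check whether UDP actually makes a round trip between two pods or hosts, independent of Consul, here is a minimal sketch. It is an assumption-laden test harness, not anything Consul or memberlist provides: `udp_echo_server` and `udp_round_trip` are hypothetical helper names, and the echo server has to be run on a spare UDP port of your choosing (not 8301, which Consul itself is bound to).

```python
import socket
import threading


def udp_echo_server(sock: socket.socket, stop: threading.Event) -> None:
    """Echo each datagram back to its sender until `stop` is set.

    `sock` must already be a bound UDP socket (hypothetical helper).
    """
    sock.settimeout(0.2)
    while not stop.is_set():
        try:
            data, addr = sock.recvfrom(2048)
        except socket.timeout:
            continue  # no traffic yet, check the stop flag again
        except OSError:
            break  # socket was closed underneath us
        sock.sendto(data, addr)


def udp_round_trip(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a datagram sent to host:port is echoed back in time."""
    payload = b"udp-probe"
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        client.settimeout(timeout)
        try:
            client.sendto(payload, (host, port))
            reply, _ = client.recvfrom(2048)
        except OSError:
            # Covers timeouts as well as ICMP "port unreachable" errors.
            return False
        return reply == payload
```

Run the echo server on one side and call `udp_round_trip` from the other. A `False` result while TCP to the same host works would point at the network path dropping UDP in one direction, which is what the warning describes.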

sirhopcount avatar Mar 30 '17 13:03 sirhopcount

Thanks, we have this as well (but it doesn't seem to do harm). AFAICT this is a result of a docker problem with UDP, but it looks like they're moving forward on it?

There have been some reports in issues I subscribe to that kubernetes implemented a workaround for the docker problem (see kubernetes links in that issue), but I haven't yet tried upgrading our prod environment as it's stable and this annoying issue seems to do no harm. (We're running Kube 1.5.4 - I'd be interested to know what version you're running.)

This is mentioned in the README, but the solution there is not optimal - it can also be solved (I think) by just adding a new kube server and draining the problem one. I have not tried that though. Unfortunately, the problem just comes back the next time you have to take down consul.

Thanks so much for your report and we look forward to hearing more about your experience with this, and of course PRs and other issues are welcome.

rfay avatar Mar 30 '17 14:03 rfay

Hi @rfay,

Thank you for your quick response. I followed the README but only glanced at the troubleshooting document, sorry for that.

It seems to be quite an elusive bug, seeing that they have been trying to fix it since 2014. The kubernetes team deployed their own (quick) fix in 1.6 (issue 32561) but upgrading to 1.6 is not an "easy" solution for us.

The warning is not that much of a problem, but because we log all container output to an ELK cluster it seems a bit of a waste.

You wouldn't know of a way to make consul less verbose or how to force it to only use TCP?
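
One stopgap that does not touch Consul at all is to drop this specific warning before the logs are shipped. The following is a minimal sketch, assuming you can place a filter step somewhere in the log pipeline. The names `UDP_WARNING` and `drop_udp_warnings` are made up for illustration, and the pattern is based only on the log line quoted in this thread.

```python
import re
import sys
from typing import Iterable, Iterator

# Matches the memberlist warning quoted in this thread; the hostname varies.
UDP_WARNING = re.compile(
    r"\[WARN\] memberlist: Was able to reach \S+ via TCP but not UDP"
)


def drop_udp_warnings(lines: Iterable[str]) -> Iterator[str]:
    """Yield every log line except the TCP-but-not-UDP memberlist warning."""
    for line in lines:
        if not UDP_WARNING.search(line):
            yield line


if __name__ == "__main__":
    # Use as a pipe filter: read log lines on stdin, write the rest to stdout.
    for line in drop_udp_warnings(sys.stdin):
        sys.stdout.write(line)
```

This only hides the noise. It does not change how Consul gossips, and other memberlist warnings still pass through.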

sirhopcount avatar Mar 31 '17 10:03 sirhopcount

I asked your question in https://github.com/hashicorp/consul/issues/2200#issuecomment-290711873 just now - you may want to chime in there or at least watch that issue. Thanks so much for your participation here and we hope to see more.

rfay avatar Mar 31 '17 13:03 rfay

If we don't see a response there in a day or two, the community on the consul mailing list may be a help.

rfay avatar Mar 31 '17 13:03 rfay

Docker claims to have solved this in https://github.com/docker/docker/pull/32505 - hopefully that will work its way through the system before too long.

rfay avatar Apr 11 '17 15:04 rfay

That's good to hear. I will close the issue as the solution is known. Thanks for all the help :smile:

sirhopcount avatar Apr 13 '17 13:04 sirhopcount

I'd like to leave this open for others until the docker fix makes it into a release and is easily available.

rfay avatar Apr 13 '17 14:04 rfay

Just a note that I upgraded our staging environment to Kubernetes 1.6.2 today and did not see any improvement with this problem, so it looks like it will remain an annoyance at least for the short term.

rfay avatar Apr 30 '17 19:04 rfay

I note in https://github.com/moby/moby/issues/8795 that docker 17.05-ce (currently at rc1) should include the fix. I haven't found where the docker dependency is managed for kubernetes nodes, but will be watching for the upgrade to come through.

rfay avatar May 08 '17 16:05 rfay

Tagging this w/hibernate until we have a clear direction/solution from Docker.

rickmanelius avatar Jun 20 '17 14:06 rickmanelius