vagrant-hostmanager
Vagrant's 127.0.0.1 hostname alias overrides hostmanager's IP
Recent versions of Vagrant set the VM name and hostname as aliases to 127.0.0.1 at the top of /etc/hosts on CentOS. This seems to change the behavior of hostname resolution performed by the Name Service Switch (NSS) such that 127.0.0.1 is returned instead of the IP added by vagrant-hostmanager.
Example:
In this dcos-vagrant example, there are several VMs, one of which is named a1 with the hostname a1.dcos.
$ cat /etc/hosts
127.0.0.1 a1.dcos a1
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
## vagrant-hostmanager-start
192.168.65.111 a1.dcos
192.168.65.50 boot.dcos
192.168.65.90 m1.dcos
## vagrant-hostmanager-end
Host resolution works fine for many tools, like host, nslookup and dig, but fails for tools that resolve names through NSS, like ping and curl.
ping hits 127.0.0.1 instead of 192.168.65.111:
$ ping a1.dcos
PING a1.dcos (127.0.0.1) 56(84) bytes of data.
64 bytes from a1.dcos (127.0.0.1): icmp_seq=1 ttl=64 time=0.022 ms
64 bytes from a1.dcos (127.0.0.1): icmp_seq=2 ttl=64 time=0.047 ms
^C
--- a1.dcos ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1027ms
rtt min/avg/max/mdev = 0.022/0.034/0.047/0.013 ms
curl a1.dcos tries 127.0.0.1 first and only tries 192.168.65.111 if 127.0.0.1 is refused:
$ curl -v a1.dcos
* About to connect() to a1.dcos port 80 (#0)
* Trying 127.0.0.1...
* Connection refused
* Trying 192.168.65.111...
* Connection refused
* Failed connect to a1.dcos:80; Connection refused
* Closing connection 0
curl: (7) Failed connect to a1.dcos:80; Connection refused
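To make the candidate list above easier to reason about, here is a minimal Ruby sketch that collects every address listed for a name in hosts-file text, in file order. It is a simplified model written for illustration, not glibc's actual lookup code; the helper name `addresses_for` and the constant `EXAMPLE_HOSTS` are hypothetical.

```ruby
# Simplified model of a hosts-file lookup (an assumption for illustration,
# not the real NSS implementation): return every address whose line lists
# the given name, in the order the lines appear.
def addresses_for(hosts_text, name)
  hosts_text.each_line.map do |line|
    entry = line.sub(/#.*/, '').split   # drop comments, split on whitespace
    next if entry.empty?                # skip blank and comment-only lines
    ip, *names = entry
    ip if names.include?(name)
  end.compact
end

# The /etc/hosts content from this report.
EXAMPLE_HOSTS = <<~HOSTS
  127.0.0.1 a1.dcos a1
  127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
  ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
  ## vagrant-hostmanager-start
  192.168.65.111 a1.dcos
  192.168.65.50 boot.dcos
  192.168.65.90 m1.dcos
  ## vagrant-hostmanager-end
HOSTS

p addresses_for(EXAMPLE_HOSTS, 'a1.dcos')   # => ["127.0.0.1", "192.168.65.111"]
```

Both addresses are candidates for a1.dcos, which matches the two attempts curl makes. The sketch does not model any reordering that the resolver applies to the candidates afterwards.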
Rearranging /etc/hosts to put hostmanager aliases on top fixes ping:
$ cat /etc/hosts
## vagrant-hostmanager-start
192.168.65.111 a1.dcos
192.168.65.50 boot.dcos
192.168.65.90 m1.dcos
## vagrant-hostmanager-end
127.0.0.1 a1.dcos a1
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
$ ping a1.dcos
PING a1.dcos (192.168.65.111) 56(84) bytes of data.
64 bytes from a1.dcos (192.168.65.111): icmp_seq=1 ttl=64 time=0.023 ms
64 bytes from a1.dcos (192.168.65.111): icmp_seq=2 ttl=64 time=0.036 ms
^C
--- a1.dcos ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1045ms
rtt min/avg/max/mdev = 0.023/0.029/0.036/0.008 ms
But curl still tried 127.0.0.1 first:
$ curl -v a1.dcos
* About to connect() to a1.dcos port 80 (#0)
* Trying 127.0.0.1...
* Connection refused
* Trying 192.168.65.111...
* Connection refused
* Failed connect to a1.dcos:80; Connection refused
* Closing connection 0
curl: (7) Failed connect to a1.dcos:80; Connection refused
The above report uses a CentOS 7.2 guest on VirtualBox.
Curl version:
$ curl --version
curl 7.29.0 (x86_64-redhat-linux-gnu) libcurl/7.29.0 NSS/3.19.1 Basic ECC zlib/1.2.7 libidn/1.28 libssh2/1.4.3
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtsp scp sftp smtp smtps telnet tftp
Features: AsynchDNS GSS-Negotiate IDN IPv6 Largefile NTLM NTLM_WB SSL libz
Removing the Vagrant-added alias results in the desired behavior:
$ curl -v a1.dcos
* About to connect() to a1.dcos port 80 (#0)
* Trying 192.168.65.111...
* Connection refused
* Failed connect to a1.dcos:80; Connection refused
* Closing connection 0
curl: (7) Failed connect to a1.dcos:80; Connection refused
FYI, I've run into the same issue. I get around it by automatically stripping out the offending entries with a shell provisioner. Something like this:
hostfix = 'sed "s/\\(127.0.0.1.*\\)$(hostname)\\(.*\\)/\\1\\2/" < /etc/hosts > /tmp/hosts && mv /tmp/hosts /etc/hosts'
...
node.vm.provision :shell, inline: hostfix
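For readers who find the sed expression hard to read, the same idea can be sketched as a pure Ruby function over the file's text. This is a variant, not a line-for-line port: it removes every listed name from 127.0.0.1 lines and drops a line that ends up with no names. The function name `strip_loopback_aliases` is hypothetical.

```ruby
# Remove the given names from 127.0.0.1 lines of hosts-file text.
# Lines that do not mention any of the names are returned unchanged;
# a loopback line left with no names at all is dropped.
def strip_loopback_aliases(hosts_text, names)
  hosts_text.each_line.map do |line|
    fields = line.split
    next line unless fields.first == '127.0.0.1'
    aliases = fields[1..-1]
    next line if (aliases & names).empty?   # nothing to strip on this line
    kept = aliases - names
    kept.empty? ? nil : "127.0.0.1 #{kept.join(' ')}\n"
  end.compact.join
end

before = <<~HOSTS
  127.0.0.1 a1.dcos a1
  127.0.0.1 localhost localhost.localdomain
  192.168.65.111 a1.dcos
HOSTS

puts strip_loopback_aliases(before, ['a1.dcos', 'a1'])
```

Running the function a second time on its own output leaves the text unchanged, so it should be safe to apply on every provisioning run.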
Yeah, I've had to use something similar. Would be nice if vagrant-hostmanager did it for us tho.
I lost all of yesterday to this issue. I couldn't figure out why rancher/docker weren't working properly!
IMO, the correct thing to do here is for vagrant to not add that line to /etc/hosts. I wonder if it can be disabled?
I've been running into the same issue. I'm using Vagrant 1.8.6 and CentOS 7.2 based vagrant boxes.
FWIW, I use a provisioning step to work around this issue:
machine_types.each do |name, machine_type|
  config.vm.define name do |machine|
    machine.vm.provision :shell, inline: "sed -i'' '/^127.0.0.1\\t#{machine.vm.hostname}\\t#{name}$/d' /etc/hosts"
  end
end
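One caveat with interpolating the hostname straight into the sed pattern is that the dots match any character. A minimal sketch of a helper that builds the same kind of command with the regex metacharacters escaped is below; `sed_escape` and `loopback_delete_command` are hypothetical names, and the sketch assumes the hostname contains no single quotes.

```ruby
# Escape characters that are special in a sed basic regular expression,
# plus the "/" used here as the pattern delimiter.
def sed_escape(text)
  text.gsub(%r{[.\[\]*^$/\\]}) { |ch| "\\#{ch}" }
end

# Build a sed command that deletes the exact "127.0.0.1 <hostname> <name>"
# line, tolerating tabs or spaces between the fields.
def loopback_delete_command(hostname, name)
  ws = '[[:space:]][[:space:]]*'
  pattern = "^127\\.0\\.0\\.1#{ws}#{sed_escape(hostname)}#{ws}#{sed_escape(name)}$"
  "sed -i '/#{pattern}/d' /etc/hosts"
end

puts loopback_delete_command('a1.dcos', 'a1')
```

In a Vagrantfile, the returned string could be passed as the `inline:` argument of the shell provisioner in place of the hand-written sed command.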
Handy workaround. Thank you.
I ran into this problem and wasted nearly a day trying to figure out why a couple of my services (e.g., Spark and ZooKeeper) failed to work. In my case, I use Vagrant with Ansible as my provisioner. Here's the equivalent workaround for Ansible. Simply add this to one of your Ansible roles:
- name: prevent hostname from binding to the loopback address
  command: sed -i '/127.0.0.1\t{{ansible_hostname}}\t{{ansible_hostname}}/d' /etc/hosts
  ignore_errors: true
  changed_when: true