I get this after Ansible starts running:
PLAY [all] ********************************************************************
GATHERING FACTS ***************************************************************
ESTABLISH CONNECTION FOR USER: vagrant
ESTABLISH CONNECTION FOR USER: vagrant
REMOTE_MODULE setup
REMOTE_MODULE setup
EXEC ['ssh', '-C', '-tt', '-vvv', '-o', 'UserKnownHostsFile=/dev/null', '-o', 'ControlMaster=auto', '-o', 'ControlPersist=60s', '-o', 'ControlPath=/Users/hsunami/.ansible/cp/ansible-ssh-%h-%p-%r', '-o', 'StrictHostKeyChecking=no', '-o', 'IdentityFile="/Users/hsunami/dev/Foundry-vagrant-mesos-kafka-cluster/nonHA/.vagrant/machines/mesos-slave/virtualbox/private_key"', '-o', 'KbdInteractiveAuthentication=no', '-o', 'PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey', '-o', 'PasswordAuthentication=no', '-o', 'User=vagrant', '-o', 'ConnectTimeout=10', 'mesos-master', "/bin/sh -c 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1423134787.37-127624957695043 && chmod a+rx $HOME/.ansible/tmp/ansible-tmp-1423134787.37-127624957695043 && echo $HOME/.ansible/tmp/ansible-tmp-1423134787.37-127624957695043'"]
EXEC ['ssh', '-C', '-tt', '-vvv', '-o', 'UserKnownHostsFile=/dev/null', '-o', 'ControlMaster=auto', '-o', 'ControlPersist=60s', '-o', 'ControlPath=/Users/hsunami/.ansible/cp/ansible-ssh-%h-%p-%r', '-o', 'StrictHostKeyChecking=no', '-o', 'IdentityFile="/Users/hsunami/dev/Foundry-vagrant-mesos-kafka-cluster/nonHA/.vagrant/machines/mesos-slave/virtualbox/private_key"', '-o', 'KbdInteractiveAuthentication=no', '-o', 'PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey', '-o', 'PasswordAuthentication=no', '-o', 'User=vagrant', '-o', 'ConnectTimeout=10', 'mesos-slave', "/bin/sh -c 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1423134787.37-187272507066113 && chmod a+rx $HOME/.ansible/tmp/ansible-tmp-1423134787.37-187272507066113 && echo $HOME/.ansible/tmp/ansible-tmp-1423134787.37-187272507066113'"]
fatal: [mesos-slave] => SSH encountered an unknown error. The output was:
OpenSSH_6.2p2, OSSLShim 0.9.8r 8 Dec 2011
debug1: Reading configuration data /Users/hsunami/.ssh/config
debug1: Reading configuration data /etc/ssh_config
debug1: /etc/ssh_config line 20: Applying options for *
debug1: /etc/ssh_config line 102: Applying options for *
debug1: auto-mux: Trying existing master
debug1: Control socket "/Users/hsunami/.ansible/cp/ansible-ssh-mesos-slave-22-vagrant" does not exist
debug2: ssh_connect: needpriv 0
debug1: Connecting to mesos-slave [100.0.10.101] port 22.
debug2: fd 3 setting O_NONBLOCK
debug1: fd 3 clearing O_NONBLOCK
debug1: Connection established.
debug3: timeout: 9919 ms remain after connect
debug3: Incorrect RSA1 identifier
debug3: Could not load "/Users/hsunami/dev/Foundry-vagrant-mesos-kafka-cluster/nonHA/.vagrant/machines/mesos-slave/virtualbox/private_key" as a RSA1 public key
debug1: identity file /Users/hsunami/dev/Foundry-vagrant-mesos-kafka-cluster/nonHA/.vagrant/machines/mesos-slave/virtualbox/private_key type -1
debug1: identity file /Users/hsunami/dev/Foundry-vagrant-mesos-kafka-cluster/nonHA/.vagrant/machines/mesos-slave/virtualbox/private_key-cert type -1
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_6.2
ssh_exchange_identification: read: Connection reset by peer
fatal: [mesos-master] => SSH encountered an unknown error. The output was:
OpenSSH_6.2p2, OSSLShim 0.9.8r 8 Dec 2011
debug1: Reading configuration data /Users/hsunami/.ssh/config
debug1: Reading configuration data /etc/ssh_config
debug1: /etc/ssh_config line 20: Applying options for *
debug1: /etc/ssh_config line 102: Applying options for *
debug1: auto-mux: Trying existing master
debug1: Control socket "/Users/hsunami/.ansible/cp/ansible-ssh-mesos-master-22-vagrant" does not exist
debug2: ssh_connect: needpriv 0
debug1: Connecting to mesos-master [100.0.10.11] port 22.
debug2: fd 3 setting O_NONBLOCK
debug1: connect to address 100.0.10.11 port 22: Operation timed out
ssh: connect to host mesos-master port 22: Operation timed out
TASK: [common | create mesosphere repo] ***************************************
FATAL: no hosts matched or all hosts have already failed -- aborting
PLAY RECAP ********************************************************************
to retry, use: --limit @/Users/hsunami/cluster.retry
mesos-master : ok=0 changed=0 unreachable=1 failed=0
mesos-slave : ok=0 changed=0 unreachable=1 failed=0
It has to do with Ansible picking the wrong key to provision the mesos-master instance: in the log above, the connection to mesos-master is attempted with the mesos-slave private key.
In the Vagrantfile, add
ANSIBLE_RAW_SSH_ARGS = []
cluster.each_with_index do |(hostname, info), index|
  ANSIBLE_RAW_SSH_ARGS << "-o IdentityFile=.vagrant/machines/#{hostname}/virtualbox/private_key"
end
just above the existing line
cluster.each_with_index do |(hostname, info), index|
and add
ansible.raw_ssh_args = ANSIBLE_RAW_SSH_ARGS
to the provision loop. With every machine's key passed to ssh, provisioning should work.
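To see what that loop produces, here is a minimal standalone sketch in plain Ruby. The `cluster` hash is a hypothetical stand-in (hostnames and IPs taken from the log above); the real Vagrantfile defines its own. A local variable is used here instead of the `ANSIBLE_RAW_SSH_ARGS` constant only to keep the sketch self-contained.

```ruby
# Hypothetical stand-in for the Vagrantfile's cluster definition.
cluster = {
  "mesos-master" => { ip: "100.0.10.11" },
  "mesos-slave"  => { ip: "100.0.10.101" },
}

# Build one "-o IdentityFile=..." option per machine so that ssh can
# try each machine's private key instead of only a single one.
ansible_raw_ssh_args = cluster.keys.map do |hostname|
  "-o IdentityFile=.vagrant/machines/#{hostname}/virtualbox/private_key"
end

puts ansible_raw_ssh_args
```

The resulting list holds one entry per host, which is what gets assigned to ansible.raw_ssh_args in the provision loop.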
This might be due to Vagrant's SSH key replacement (the default in newer versions), where each machine gets its own generated private key.
Setting
config.ssh.insert_key = false
in the Vagrantfile should fix it.
See https://github.com/mitchellh/vagrant/issues/5059
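For placement, a rough sketch of where that setting would sit, assuming the Vagrantfile uses the usual Vagrant.configure("2") block (the rest of the file is elided here and will differ in the actual project):

```ruby
Vagrant.configure("2") do |config|
  # Keep the default insecure key on every machine instead of
  # generating a separate private key per machine.
  config.ssh.insert_key = false

  # ... existing machine definitions and Ansible provisioning ...
end
```

Note that this trades per-machine keys for one shared, publicly known key, which is usually acceptable only for local development boxes.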