Foundry-vagrant-mesos-kafka-cluster

error during ansible install

Open hsunami opened this issue 10 years ago • 2 comments

I get this after ansible starts running.

PLAY [all] ********************************************************************

GATHERING FACTS ***************************************************************
ESTABLISH CONNECTION FOR USER: vagrant
ESTABLISH CONNECTION FOR USER: vagrant
REMOTE_MODULE setup
REMOTE_MODULE setup
EXEC ['ssh', '-C', '-tt', '-vvv', '-o', 'UserKnownHostsFile=/dev/null', '-o', 'ControlMaster=auto', '-o', 'ControlPersist=60s', '-o', 'ControlPath=/Users/hsunami/.ansible/cp/ansible-ssh-%h-%p-%r', '-o', 'StrictHostKeyChecking=no', '-o', 'IdentityFile="/Users/hsunami/dev/Foundry-vagrant-mesos-kafka-cluster/nonHA/.vagrant/machines/mesos-slave/virtualbox/private_key"', '-o', 'KbdInteractiveAuthentication=no', '-o', 'PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey', '-o', 'PasswordAuthentication=no', '-o', 'User=vagrant', '-o', 'ConnectTimeout=10', 'mesos-master', "/bin/sh -c 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1423134787.37-127624957695043 && chmod a+rx $HOME/.ansible/tmp/ansible-tmp-1423134787.37-127624957695043 && echo $HOME/.ansible/tmp/ansible-tmp-1423134787.37-127624957695043'"]
EXEC ['ssh', '-C', '-tt', '-vvv', '-o', 'UserKnownHostsFile=/dev/null', '-o', 'ControlMaster=auto', '-o', 'ControlPersist=60s', '-o', 'ControlPath=/Users/hsunami/.ansible/cp/ansible-ssh-%h-%p-%r', '-o', 'StrictHostKeyChecking=no', '-o', 'IdentityFile="/Users/hsunami/dev/Foundry-vagrant-mesos-kafka-cluster/nonHA/.vagrant/machines/mesos-slave/virtualbox/private_key"', '-o', 'KbdInteractiveAuthentication=no', '-o', 'PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey', '-o', 'PasswordAuthentication=no', '-o', 'User=vagrant', '-o', 'ConnectTimeout=10', 'mesos-slave', "/bin/sh -c 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1423134787.37-187272507066113 && chmod a+rx $HOME/.ansible/tmp/ansible-tmp-1423134787.37-187272507066113 && echo $HOME/.ansible/tmp/ansible-tmp-1423134787.37-187272507066113'"]
fatal: [mesos-slave] => SSH encountered an unknown error.
The output was: OpenSSH_6.2p2, OSSLShim 0.9.8r 8 Dec 2011
debug1: Reading configuration data /Users/hsunami/.ssh/config
debug1: Reading configuration data /etc/ssh_config
debug1: /etc/ssh_config line 20: Applying options for *
debug1: /etc/ssh_config line 102: Applying options for *
debug1: auto-mux: Trying existing master
debug1: Control socket "/Users/hsunami/.ansible/cp/ansible-ssh-mesos-slave-22-vagrant" does not exist
debug2: ssh_connect: needpriv 0
debug1: Connecting to mesos-slave [100.0.10.101] port 22.
debug2: fd 3 setting O_NONBLOCK
debug1: fd 3 clearing O_NONBLOCK
debug1: Connection established.
debug3: timeout: 9919 ms remain after connect
debug3: Incorrect RSA1 identifier
debug3: Could not load "/Users/hsunami/dev/Foundry-vagrant-mesos-kafka-cluster/nonHA/.vagrant/machines/mesos-slave/virtualbox/private_key" as a RSA1 public key
debug1: identity file /Users/hsunami/dev/Foundry-vagrant-mesos-kafka-cluster/nonHA/.vagrant/machines/mesos-slave/virtualbox/private_key type -1
debug1: identity file /Users/hsunami/dev/Foundry-vagrant-mesos-kafka-cluster/nonHA/.vagrant/machines/mesos-slave/virtualbox/private_key-cert type -1
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_6.2
ssh_exchange_identification: read: Connection reset by peer

fatal: [mesos-master] => SSH encountered an unknown error.
The output was: OpenSSH_6.2p2, OSSLShim 0.9.8r 8 Dec 2011
debug1: Reading configuration data /Users/hsunami/.ssh/config
debug1: Reading configuration data /etc/ssh_config
debug1: /etc/ssh_config line 20: Applying options for *
debug1: /etc/ssh_config line 102: Applying options for *
debug1: auto-mux: Trying existing master
debug1: Control socket "/Users/hsunami/.ansible/cp/ansible-ssh-mesos-master-22-vagrant" does not exist
debug2: ssh_connect: needpriv 0
debug1: Connecting to mesos-master [100.0.10.11] port 22.
debug2: fd 3 setting O_NONBLOCK
debug1: connect to address 100.0.10.11 port 22: Operation timed out
ssh: connect to host mesos-master port 22: Operation timed out

TASK: [common | create mesosphere repo] ***************************************
FATAL: no hosts matched or all hosts have already failed -- aborting

PLAY RECAP ********************************************************************
to retry, use: --limit @/Users/hsunami/cluster.retry

mesos-master : ok=0 changed=0 unreachable=1 failed=0
mesos-slave : ok=0 changed=0 unreachable=1 failed=0

hsunami, Feb 05 '15 11:02

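It has to do with Ansible picking the wrong key to provision the mesos-master instance: in the log above, the connection to mesos-master is given the mesos-slave private_key as its IdentityFile. If you add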

# Collect one IdentityFile option per machine so SSH can offer every machine's key.
ANSIBLE_RAW_SSH_ARGS = []

cluster.each_with_index do |(hostname, info), index|
  ANSIBLE_RAW_SSH_ARGS << "-o IdentityFile=.vagrant/machines/#{hostname}/virtualbox/private_key"
end

just above the existing cluster.each_with_index do |(hostname, info), index| loop in the Vagrantfile, and add ansible.raw_ssh_args = ANSIBLE_RAW_SSH_ARGS inside the provision loop, everything should be fixed.
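
To see what that workaround ends up passing to SSH, here is a minimal standalone sketch in plain Ruby. The `identity_file_args` helper and the shape of the `cluster` hash are assumptions made for illustration (the real Vagrantfile defines its own `cluster`); the host names and IPs are the ones from the log.

```ruby
# Hypothetical helper: builds one "-o IdentityFile=..." option per machine name,
# using the same relative path as the snippet in this comment.
def identity_file_args(hostnames)
  hostnames.map do |hostname|
    "-o IdentityFile=.vagrant/machines/#{hostname}/virtualbox/private_key"
  end
end

# Assumed cluster shape; the real Vagrantfile may structure this differently.
cluster = {
  "mesos-master" => { ip: "100.0.10.11" },
  "mesos-slave"  => { ip: "100.0.10.101" },
}

# In the Vagrantfile this list would be what ends up in ANSIBLE_RAW_SSH_ARGS
# and is handed to ansible.raw_ssh_args in the provision loop.
raw_ssh_args = identity_file_args(cluster.keys)
puts raw_ssh_args
```

Because every machine's key is listed, SSH can try each one in turn, so it no longer matters which single key Ansible would otherwise have picked.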

bouke-nederstigt, Oct 09 '15 11:10

Might be due to Vagrant's SSH key replacement (default in newer versions), where each machine gets its own generated key pair instead of the shared insecure key. Setting

config.ssh.insert_key = false

in the Vagrantfile should fix it. See https://github.com/mitchellh/vagrant/issues/5059
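
For placement, a minimal sketch of where that line would sit, assuming the Vagrantfile uses the usual version "2" configuration block (the project's actual Vagrantfile may be laid out differently):

```ruby
# Sketch only: assumes a standard Vagrant.configure("2") block.
Vagrant.configure("2") do |config|
  # Keep Vagrant's shared default key on every machine instead of
  # generating a separate private_key per machine.
  config.ssh.insert_key = false

  # ... existing machine definitions and the Ansible provision loop ...
end
```

With this setting all machines accept the same key, so a single IdentityFile works for both mesos-master and mesos-slave.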

rasputnik, Oct 09 '15 12:10