kitchen-ec2

SSH access failing to EC2 instance through kitchen

Open saidmasoud opened this issue 6 years ago • 15 comments

OS: macOS 10.12.4

When creating an EC2 instance via kitchen, I cannot SSH into the host. I can, however, manually SSH into the host from the command line. I have looked through related issues in this repo, and none of the solutions listed there helped. My ~/.ssh/config file has nothing special in it.

Software versions:

  • Chef Development Kit Version: 2.5.3
  • chef-client version: 13.8.5
  • delivery version: master (73ebb72a6c42b3d2ff5370c476be800fee7e5427)
  • berks version: 6.3.1
  • kitchen version: 1.20.0
  • inspec version: 1.51.21
  • kitchen-ec2 (2.2.1)

.kitchen.yml:

---
driver:
  name: ec2
  aws_ssh_key_id: kubernetes.dev.example.com-36:e8:0a:34:0a:e8:86:68:37:1f:52:53:1e:1b:91:bd
  security_group_ids: sg-91e8f2fa
  region: us-east-2
  availability_zone: b
  subnet_id: subnet-478d283d
  instance_type: t2.micro
  associate_public_ip: true
  interface: dns

transport:
  ssh_key: /Users/said/Downloads/dev-fluentd.pem
  username: admin

provisioner:
  name: chef_zero
  # You may wish to disable always updating cookbooks in CI or other testing environments.
  # For example:
  #   always_update_cookbooks: <%= !ENV['CI'] %>
  always_update_cookbooks: true

verifier:
  name: inspec

platforms:
  - name: debian-k8s
    driver:
      image_id: ami-a45064c1 #Default kops image
    transport:
      username: admin


suites:
  - name: default
    run_list:
      - recipe[kops-nodes::default]
    verifier:
      inspec_tests:
        - test/integration/default
    attributes:

kitchen create output:

-----> Starting Kitchen (v1.20.0)
-----> Creating <default-debian-k8s>...
       Detected platform: debian version 1 on x86_64. Instance Type: t2.micro. Default username: admin (default).
       If you are not using an account that qualifies under the AWS
free-tier, you may be charged to run these suites. The charge
should be minimal, but neither Test Kitchen nor its maintainers
are responsible for your incurred costs.

       Instance <i-0377cbb72560e6e7d> requested.
       Polling AWS for existence, attempt 0...
       Attempting to tag the instance, 0 retries
       EC2 instance <i-0377cbb72560e6e7d> created.
       Waited 0/300s for instance <i-0377cbb72560e6e7d> volumes to be ready.
       Waited 0/300s for instance <i-0377cbb72560e6e7d> to become ready.
       Waited 5/300s for instance <i-0377cbb72560e6e7d> to become ready.
       Waited 10/300s for instance <i-0377cbb72560e6e7d> to become ready.
       Waited 15/300s for instance <i-0377cbb72560e6e7d> to become ready.
       Waited 20/300s for instance <i-0377cbb72560e6e7d> to become ready.
       EC2 instance <i-0377cbb72560e6e7d> ready (hostname: ec2-18-216-126-50.us-east-2.compute.amazonaws.com).
       Waiting for SSH service on ec2-18-216-126-50.us-east-2.compute.amazonaws.com:22, retrying in 3 seconds
       Waiting for SSH service on ec2-18-216-126-50.us-east-2.compute.amazonaws.com:22, retrying in 3 seconds
       Waiting for SSH service on ec2-18-216-126-50.us-east-2.compute.amazonaws.com:22, retrying in 3 seconds
       Waiting for SSH service on ec2-18-216-126-50.us-east-2.compute.amazonaws.com:22, retrying in 3 seconds
       Waiting for SSH service on ec2-18-216-126-50.us-east-2.compute.amazonaws.com:22, retrying in 3 seconds

kitchen converge output:

-----> Starting Kitchen (v1.20.0)
-----> Creating <default-debian-k8s>...
       Finished creating <default-debian-k8s> (0m0.00s).
-----> Converging <default-debian-k8s>...
       Preparing files for transfer
       Preparing dna.json
       Resolving cookbook dependencies with Berkshelf 6.3.1...
       Removing non-cookbook files before transfer
       Preparing validation.pem
       Preparing client.rb
       [SSH] connection failed, retrying in 1 seconds (#<Net::SSH::AuthenticationFailed: Authentication failed for user admin@ec2-18-216-126-50.us-east-2.compute.amazonaws.com>)
       [SSH] connection failed, retrying in 1 seconds (#<Net::SSH::AuthenticationFailed: Authentication failed for user admin@ec2-18-216-126-50.us-east-2.compute.amazonaws.com>)
       [SSH] connection failed, retrying in 1 seconds (#<Net::SSH::AuthenticationFailed: Authentication failed for user admin@ec2-18-216-126-50.us-east-2.compute.amazonaws.com>)
       [SSH] connection failed, retrying in 1 seconds (#<Net::SSH::AuthenticationFailed: Authentication failed for user admin@ec2-18-216-126-50.us-east-2.compute.amazonaws.com>)
$$$$$$ [SSH] connection failed, terminating (#<Net::SSH::AuthenticationFailed: Authentication failed for user admin@ec2-18-216-126-50.us-east-2.compute.amazonaws.com>)
>>>>>> ------Exception-------
>>>>>> Class: Kitchen::ActionFailed
>>>>>> Message: 1 actions failed.
>>>>>>     Converge failed on instance <default-debian-k8s>.  Please see .kitchen/logs/default-debian-k8s.log for more details
>>>>>> ----------------------
>>>>>> Please see .kitchen/logs/kitchen.log for more details
>>>>>> Also try running `kitchen diagnose --all` for configuration

~/.ssh/config:

Host *
 AddKeysToAgent yes
 UseKeychain yes
 IdentityFile ~/.ssh/id_rsa

Manual SSH attempt:

$ ssh -i ~/Downloads/dev-fluentd.pem admin@ec2-18-216-126-50.us-east-2.compute.amazonaws.com
The authenticity of host 'ec2-18-216-126-50.us-east-2.compute.amazonaws.com (18.216.126.50)' can't be established.
ECDSA key fingerprint is SHA256:/SiX/f6gGJATCJbpWZqVNZBzJcWTxKPMeN8ubfHSp3o.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'ec2-18-216-126-50.us-east-2.compute.amazonaws.com,18.216.126.50' (ECDSA) to the list of known hosts.

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
admin@ip-172-20-39-233:~$

saidmasoud avatar May 07 '18 14:05 saidmasoud

I had a very similar issue. I was able to login to SSH by using the auto-generated key.

I removed references to

aws_ssh_key_id in driver

and

ssh_key in transport

That seemed to work for me.
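
A rough sketch of that workaround, assuming the config is a single YAML file with one setting per line. The helper name `strip_key_settings` is made up for illustration and is not part of Test Kitchen. It only prints a copy of the config without the two settings, so the original file is left untouched.

```shell
#!/bin/sh
# Hypothetical helper (not part of Test Kitchen): print a kitchen config
# without the aws_ssh_key_id and ssh_key settings, so that kitchen-ec2
# falls back to its auto-generated key pair.
# Assumes each setting sits on its own line, as in the config above.
strip_key_settings() {
  grep -Ev '^[[:space:]]*(aws_ssh_key_id|ssh_key):' "$1"
}
```

One way to try it without editing the real config: `strip_key_settings .kitchen.yml > kitchen.nokeys.yml`, then run `KITCHEN_YAML=kitchen.nokeys.yml kitchen create`. The output file name here is arbitrary.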

dancfox avatar May 08 '18 17:05 dancfox

oh wow that actually worked, thanks @dancfox !

I still believe this is an issue that needs to be addressed, as users should be able to use an already-created SSH key if the option is available to them.

saidmasoud avatar May 08 '18 17:05 saidmasoud

The intention in current versions is that if one does not provide aws_ssh_key_id, then we will auto-generate and use one, but otherwise we still respect one being set. This could be a bug or misconfiguration but we can certainly try to repro.

cheeseplus avatar May 08 '18 17:05 cheeseplus

@cheeseplus yeah that makes sense, it seems like there may indeed be a bug when users provide an SSH key, but at least I can work with auto-gened SSH keys for now. Let me know if I can help repro the issue in any way!

saidmasoud avatar May 08 '18 17:05 saidmasoud

What kind of SSH key is it? It's possible you've got some newer algorithm that either net-ssh doesn't support in the version we use or that we haven't set things up correctly for.

coderanger avatar May 08 '18 18:05 coderanger

@coderanger is this what you're looking for? The key was generated in AWS EC2 by kops:

$ openssl rsa -text -noout -in ~/Downloads/dev-fluentd.pem
Private-Key: (2048 bit)
<REDACTED>
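
Another way to answer the key-type question, sketched here on the assumption that OpenSSH's `ssh-keygen` is installed. The function name `key_algorithm` is invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the algorithm name of a private key
# (for example ssh-rsa or ssh-ed25519).
# `ssh-keygen -y` derives the public key from the private key file; the
# first space-separated field of that public key is the key type.
# Assumes the key has no passphrase and is readable only by its owner.
key_algorithm() {
  ssh-keygen -y -f "$1" | cut -d ' ' -f 1
}
```

For example, `key_algorithm ~/Downloads/dev-fluentd.pem` should report the same RSA key type that the openssl output above implies.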

saidmasoud avatar May 08 '18 19:05 saidmasoud

I just ran into a similar error. It seemed that I had an orphaned instance that kitchen thought was still there when running kitchen converge. I deleted the *.yml file associated with the build (.kitchen/default-.yml) and everything started working for me again.
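
A minimal sketch of that cleanup, assuming the default layout where Test Kitchen keeps per-instance state in .kitchen/<instance>.yml under the project root. The function name `clear_kitchen_state` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: remove the saved state file for one Test Kitchen
# instance so the next run starts from scratch.
# Arguments: instance name, then an optional project root (default: .).
# Assumes state lives in <root>/.kitchen/<instance>.yml.
clear_kitchen_state() {
  instance="$1"
  root="${2:-.}"
  state_file="$root/.kitchen/$instance.yml"
  if [ -f "$state_file" ]; then
    rm -- "$state_file"
    echo "removed $state_file"
  else
    echo "no state file for $instance" >&2
    return 1
  fi
}
```

Usage would look like `clear_kitchen_state default-debian-k8s`. Note that this only makes kitchen forget the instance; it does not terminate anything in EC2, so an orphaned instance may still need to be cleaned up separately.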

paul1994 avatar Nov 21 '18 16:11 paul1994

Same issue as OP and dancfox's workaround (https://github.com/test-kitchen/kitchen-ec2/issues/398#issuecomment-387475145) still solves it.

Running Test Kitchen version 2.3.3

g0to avatar Oct 09 '19 16:10 g0to

Quoting @dancfox: "I had a very similar issue. I was able to login to SSH by using the auto-generated key. I removed references to aws_ssh_key_id in driver and ssh_key in transport. That seemed to work for me."

I am experiencing the same issue with a CentOS 8.2 AMI.

---
driver:
  name: ec2
  aws_ssh_key_id: <%= ENV['AWS_SSH_KEYNAME'] %>
  region: us-east-1
  instance_type: <%= ENV['AWS_INSTANCE_TYPE'] %>
  spot_price:  <%= ENV['AWS_SPOT_PRICE'] %>
  associate_public_ip: true
  interface: public
  subnet_id: <%= ENV['AWS_SUBNET_ID']  %>
  security_group_ids: <%= ENV['AWS_SG_ID'] %>
  retryable_tries: 200
  shared_credentials_profile: demo
  user_data: user_data_centos_8.sh
    
provisioner:
  name: shell
  log_level: 5
  max_retries: 3
  wait_for_retry: 30
  retry_on_exit_code: # will retry if winrm is unable to connect to the ec2 instance
    - -1 #Generic error during Chef execution 
    - 1 #Generic error during Chef execution
  #script: 'bootstrap.sh'

verifier:
  name: inspec
  format: documentation
  reporter: 
    - cli
    - html:./inspec_output.html

transport:
    name: ssh
    ssh_key: ~/.ssh/<%= ENV['AWS_SSH_KEYNAME'] %>.pem
    max_wait_until_ready: 900
    connect_timeout: 60
    connection_retries: 10
    connection_retry_sleep: 10
    username: centos

platforms:
  - name: centos-8
    driver:
      image_id: <%= ENV['AWS_AMI_ID'] %>

suites:
  - name: default
    verifier:
      inspec_tests:
        - test/os_spec.rb

SSH connection log:

[SSH] opening connection to [email protected]<{:user_known_hosts_file=>"/dev/null", :port=>22, :compression=>false, :compression_level=>0, :keepalive=>true, :keepalive_interval=>60, :keepalive_maxcount=>3, :timeout=>15, :keys_only=>true, :keys=>["/Users/user1/.ssh/demo.pem"], :auth_methods=>["publickey"], :verify_host_key=>:never, :logger=>#<Logger:0x00007fc2b0b18fd0 @level=4, @progname=nil, @default_formatter=#<Logger::Formatter:0x00007fc2b0b18f80 @datetime_format=nil>, @formatter=nil, @logdev=#<Logger::LogDevice:0x00007fc2b0b18f30 @shift_period_suffix=nil, @shift_size=nil, @shift_age=nil, @filename=nil, @dev=#<IO:<STDERR>>, @mon_mutex=#<Thread::Mutex:0x00007fc2b0b18ee0>, @mon_mutex_owner_object_id=70237082404760, @mon_owner=nil, @mon_count=0>>, :password_prompt=>#<Net::SSH::Prompt:0x00007fc2b0b18eb8>, :user=>"centos"}>

I was able to launch the same AMI manually and I can manually SSH into the box. So, kitchen-ec2 must be doing something wrong or there is a misconfiguration somewhere.

lmayorga1980 avatar Jan 12 '21 14:01 lmayorga1980

any news about this one?

lmayorga1980 avatar Jan 17 '21 14:01 lmayorga1980

It looks like this has been resolved in net-ssh 7.0.0+, but Chef requires <7.0. Found some breadcrumbs here. As I understand it, the proper solution is to bump the net-ssh requirement in Chef to 7.0+. Alternatively, kitchen can use the same net-ssh patch as Vagrant does: https://github.com/hashicorp/vagrant/blob/main/lib/vagrant/patches/net-ssh.rb
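
To check whether an installed net-ssh falls on the affected side of that boundary, something like the following could be used. It is only a sketch: the function name `net_ssh_older_than_7` is made up, and the 7.0.0 cutoff is taken from the comment above rather than verified independently.

```shell
#!/bin/sh
# Hypothetical helper: succeed when a net-ssh version string is older than
# 7.0.0, the release in which, per the comment above, the problem is fixed.
# Only the major version number is compared.
net_ssh_older_than_7() {
  major="${1%%.*}"
  [ "$major" -lt 7 ]
}
```

The version string itself can be read from `gem list net-ssh`, run with the same Ruby that Test Kitchen uses (with ChefDK that would presumably be `chef gem list net-ssh`).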

gp42 avatar Jun 28 '22 06:06 gp42

Ran into this as well; on our platform we're stuck with RSA keys for now. It would make a lot of sense to bump net-ssh to 7.0 instead of backporting all kinds of patches.

sspans-sbp avatar Jul 19 '22 11:07 sspans-sbp

In my case, during our validation we create a temporary SSH key:

aws ec2 create-key-pair --key-name tmp-kitchen --key-type ed25519 | jq -r ".KeyMaterial" > tmp-kitchen.pem
export AWS_SSH_KEYNAME=tmp-kitchen

I was able to get CentOS 8.x to work with this setup.

lmayorga1980 avatar Aug 01 '22 22:08 lmayorga1980

The above works with a different environment variable for me:

export AWS_SSH_KEY_ID=tmp-kitchen

Then adding the transport config to the kitchen config:

transport:
  ssh_key: tmp-kitchen.pem

ymmv ;)

idnorton avatar Sep 29 '22 16:09 idnorton