`aws ec2-instance-connect ssh` fails on Windows
Describe the bug
On Windows, the following command
aws ec2-instance-connect ssh --os-user XXXX --instance-id i-XXXXXXXXX
fails because of the (temporary) key file permissions.
Regression Issue
- [X] Select this option if this issue appears to be a regression.
Expected Behavior
Connect to the remote instance.
Current Behavior
The complete error (with the user and instance id anonymized) is:
Bad permissions. Try removing permissions for user: \\OWNER RIGHTS (S-1-3-4) on file C:/Users/XXXX/AppData/Local/Temp/tmp3cja4v_s/private-key.
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: UNPROTECTED PRIVATE KEY FILE! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Permissions for 'C:\\Users\\XXXX\\AppData\\Local\\Temp\\tmp3cja4v_s\\private-key' are too open.
It is required that your private key files are NOT accessible by others.
This private key will be ignored.
Load key "C:\\Users\\XXXX\\AppData\\Local\\Temp\\tmp3cja4v_s\\private-key": bad permissions
Note that after the command fails, the key file (private-key) and its folder (C:\Users\XXXX\AppData\Local\Temp\tmp3cja4v_s) are gone. I presume they are created on the fly by the command and then removed, so there is no easy way to examine the file and its permissions.
Reproduction Steps
- Create a VPC with public and private subnets
- Create an EC2 Instance Connect Endpoint
- Create an EC2 instance in the private subnet
- Get the EC2 instance ID
- Issue the following command:
aws ec2-instance-connect ssh --os-user XXXX --instance-id i-XXXXXXXXX
Possible Solution
I suspect that the temporary private key created for the connection does not have the correct permissions. As a workaround, downgrade to an earlier version of the CLI that does not have the regression.
Additional Information/Context
This appears to be a regression: versions up to 2.17.0 work as expected, while versions 2.17.65, 2.18.0, 2.20.0 and 2.22.0 fail with the above error. I did not check all the 2.17.x versions, but the regression apparently appeared between 2.17.0 and 2.17.65.
CLI version used
aws-cli/2.17.65 Python/3.12.6 Windows/11 exe/AMD64
Environment details (OS name and version, etc.)
Windows 11 Pro, version 23H2 - OS Build 22631.4460
Hi @fabiomoratti, thanks for reaching out. I wasn't able to reproduce the behavior you've described on CLI version 2.17.35. Could you provide full debug logs? You can get debug logs by adding --debug to your command, and redacting any sensitive information. Thanks!
Hello @RyanFitzSimmonsAK, I confirm that version 2.17.35 works as expected, so I tried all 2.17.x versions to see where the bug emerged (that is 18 versions..., I hope the effort is appreciated...):
- versions from 2.17.35 up to 2.17.51 work as expected
- version 2.17.52 fails with the error reported above.
As requested find below the output of the aws ec2-instance-connect ssh --debug --os-user XXXX --instance-id i-XXXXXXXXX command with the --debug option turned on.
I removed or redacted all possibly sensitive information and homogenized dates and other request-specific data so you can easily diff the two files to see where the command fails.
My guess is that between version 2.17.51 and version 2.17.52 the code that generates the temp key changed and somehow no longer sets the correct permissions on the newly created temp key file.
I also tried to find the code where the log "Generated temporary key file:" (line 53 in the attached file) is printed, to inspect the code there, but I failed. Maybe I was looking in the wrong place.
Thank you for the kind assistance.
ec2-instance-connect-out--2.17.55 (success).txt ec2-instance-connect-out-2.17.51 (fail).txt
Hi, I see the same behavior with the temp pem file being too open; I'm using 2.22.14. If I downgrade to 2.17.35, it works well.
@ WARNING: UNPROTECTED PRIVATE KEY FILE! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Permissions for 'C:\Users\thien\AppData\Local\Temp\tmpssmpv6wf\private-key' are too open.
It is required that your private key files are NOT accessible by others.
This private key will be ignored.
Load key "C:\Users\thien\AppData\Local\Temp\tmpssmpv6wf\private-key": bad permissions
[email protected]: Permission denied (publickey,gssapi-keyex,gssapi-with-mic).
PS C:\Users\thien\OneDrive\AWS\E6K> msiexec.exe /i https://awscli.amazonaws.com/AWSCLIV2-2.17.35.msi
PS C:\Users\thien\OneDrive\AWS\E6K> aws ec2-instance-connect ssh --instance-id i-064d8a39ad5c019ba
, #_
~\_ ####_ Amazon Linux 2023
~~ \_#####\
~~ \###|
~~ \#/ ___ https://aws.amazon.com/linux/amazon-linux-2023
~~ V~' '->
~~~ /
~~._. _/
_/ _/
_/m/'
Last login: Wed Dec 11 19:17:01 2024 from 10.0.137.130
Hey, thanks for following up. I was able to reproduce this behavior. While we look into this, you can specify your private key as a workaround. In my testing, using aws ec2-instance-connect ssh --instance-id i-xxx --private-key-file mykey.pem worked successfully, while omitting it failed.
Indeed, that's what I use too.
Thanks for raising this issue with us. The root cause is that the generated key file inherits permissions from the directory created by Python's tempfile.TemporaryDirectory, which recently changed. I opened an issue with CPython to track this: https://github.com/python/cpython/issues/128038. In the meantime, we recommend using the workarounds suggested by @RyanFitzSimmonsAK.
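For anyone who wants to look at the temporary directory and key file before they are removed, here is a minimal sketch of the kind of print debugging that appears later in this thread. It is not the aws-cli code: the function name `inspect_temp_key_modes` and the placeholder key contents are assumptions for illustration. Keep in mind that the mode bits reported by `os.stat` on Windows do not reflect ACL entries such as the OWNER RIGHTS entry named in the error, so identical mode bits can still hide different effective permissions.

```python
import os
import stat
import tempfile


def inspect_temp_key_modes(key_bytes=b"placeholder, not a real key"):
    """Create a temp dir and a key file inside it, and return the mode bits
    observed at each step. Hypothetical helper, not part of aws-cli."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        dir_mode = stat.S_IMODE(os.stat(tmp_dir).st_mode)

        key_path = os.path.join(tmp_dir, "private-key")
        with open(key_path, "wb") as f:
            f.write(key_bytes)
        created_mode = stat.S_IMODE(os.stat(key_path).st_mode)

        # Restrict the file to owner read-only. On Windows, os.chmod can only
        # toggle the read-only flag; it does not edit the file's ACL.
        os.chmod(key_path, 0o400)
        chmod_mode = stat.S_IMODE(os.stat(key_path).st_mode)

    return {
        "dir": oct(dir_mode),
        "created": oct(created_mode),
        "after_chmod": oct(chmod_mode),
    }


print(inspect_temp_key_modes())
```

Running this under different Python versions on the same Windows machine is one way to compare the reported modes, though the ACL itself would need a tool such as icacls to inspect.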
I'm trying to take a look at how we might go about fixing this issue within the aws-cli (in the case that changes on the CPython and OpenSSH side don't come), and I'm having trouble reproducing the issue now. I added some print debug logs in anticipation of hitting the bug, but I'm successfully connecting to my instance from a Windows machine.
Administrator@EC2AMAZ-SA644HT MINGW64 ~/workplace/aws-cli (v2)
$ aws --version
aws-cli/2.22.30 Python/3.11.0b4 Windows/10 exec-env/EC2 source/AMD64
Administrator@EC2AMAZ-SA644HT MINGW64 ~/workplace/aws-cli (v2)
$ aws ec2-instance-connect ssh --instance-id i-08bfef89a0a1c928b --region us-east-1
Created temporary directory: %s C:\Users\ADMINI~1\AppData\Local\Temp\3\tmptmry7pqg
Permissions on temporary directory: %s 0o40777
Generated temporary key file: %s C:\Users\ADMINI~1\AppData\Local\Temp\3\tmptmry7pqg\private-key
Temp key file has permissions: %s 0o100666
keyfile has updated permissions after chmod: %s 0o100444
, #_
~\_ ####_ Amazon Linux 2023
~~ \_#####\
~~ \###|
~~ \#/ ___ https://aws.amazon.com/linux/amazon-linux-2023
~~ V~' '->
~~~ /
~~._. _/
_/ _/
_/m/'
Last login: Wed Jan 8 17:23:46 2025 from 54.147.35.35
[ec2-user@ip-172-31-39-54 ~]$
I'm not experienced with Windows, so my test environment may be too permissive to reproduce the bug. I had to spin up a Windows EC2 instance, connect to it using Fleet Manager RDP, set up a local aws-cli, and run aws ec2-instance-connect ssh against my target instance from there.
Are other folks able to continue reproducing this issue?
EDIT: I'm not able to reproduce the issue with a locally built aws cli, but I can reproduce it when I connect to my Windows test environment using SSM Session Manager (Windows system permissions instead of Admin permissions).
PS C:\Windows\system32> aws --version
aws-cli/2.22.30 Python/3.12.6 Windows/2022Server exec-env/EC2 exe/AMD64
PS C:\Windows\system32> aws ec2-instance-connect ssh --instance-id i-08bfef89a0a1c928b --region us-east-1
The authenticity of host '54.210.3.128 (54.210.3.128)' can't be established.
ED25519 key fingerprint is SHA256:tW6P5b/KbSUvDOHvY+75+VIiAJuS5ddLtQva33/x7Eo.
This key is not known by any other names.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '54.210.3.128' (ED25519) to the list of known hosts.
Bad permissions. Try removing permissions for user: \\OWNER RIGHTS (S-1-3-4) on file C:/Windows/TEMP/tmpg2dtrb_z/private-key.
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: UNPROTECTED PRIVATE KEY FILE! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Permissions for 'C:\\Windows\\TEMP\\tmpg2dtrb_z\\private-key' are too open.
It is required that your private key files are NOT accessible by others.
This private key will be ignored.
Load key "C:\\Windows\\TEMP\\tmpg2dtrb_z\\private-key": bad permissions
[email protected]: Permission denied (publickey,gssapi-keyex,gssapi-with-mic).
More experimenting with the test environment to be done.
@CharlesReinhardt It looks like you're using an AWS CLI v2 binary built locally (something like pip install -e .) using Python 3.11.0b4. You wouldn't be able to reproduce this issue in this Python version.
Thanks @hssyoo, that was exactly my issue. I was able to set up my local environment for testing using python 3.12.6 and reproduce the issue.
I was playing around with the pywin32 module (to give us more windows-specific control over file permissions) and was able to modify the generated key file so that it doesn't upset PowerShell/Win32-OpenSSH (in the logic here).
If other fixes for this issue don't pan out (https://github.com/PowerShell/Win32-OpenSSH/issues/2317 and https://github.com/python/cpython/issues/128038), would we consider adding a dependency on pywin32 for the aws-cli? I assume we'd need to restrict the dependency to when the cli is running on Windows machines, and I'm not even sure how we would do that if we can.
@CharlesReinhardt In general, we avoid adding dependencies since they can compound existing technical debt (e.g., version conflicts).
If we can't get a long-term fix upstream, I'd prefer finding a simpler workaround. One possibility is to vend our own version of TemporaryDirectory that uses the 0o400 mode instead of 0o700 when calling os.mkdir. This was an idea I had off the top of my head and would be interested in other possibilities.
It would be nice if tempfile.mkdtemp() had an argument for permission mode, but that issue has been discussed and closed on the Python side (https://github.com/python/cpython/issues/86050), so I like trying to mirror the TemporaryDirectory/mkdtemp() implementation in the aws-cli for our use.
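To make the "vend our own TemporaryDirectory" idea concrete, here is a rough sketch under stated assumptions. The name `temp_dir_with_mode`, the retry count, and the random-name scheme are all hypothetical; this is not how `tempfile` names its directories, only an approximation of the same create-then-clean-up pattern with the mode exposed as a parameter. The default stays at 0o700 because on POSIX a directory needs owner write and execute bits before files can be created in it; the value to pass on Windows (0o400 was floated above) is left to the caller and is untested here.

```python
import contextlib
import os
import secrets
import shutil
import tempfile


@contextlib.contextmanager
def temp_dir_with_mode(mode=0o700, prefix="tmp"):
    """Hypothetical stand-in for tempfile.TemporaryDirectory that lets the
    caller choose the mode passed to os.mkdir."""
    parent = tempfile.gettempdir()
    for _ in range(100):
        # Random suffix so a concurrent process is unlikely to pick the same name.
        path = os.path.join(parent, prefix + secrets.token_hex(8))
        try:
            os.mkdir(path, mode)
        except FileExistsError:
            continue  # name collision, try another one
        break
    else:
        raise FileExistsError("could not create a unique temporary directory")
    try:
        yield path
    finally:
        # Remove the directory and anything written into it.
        shutil.rmtree(path, ignore_errors=True)
```

Usage would mirror the standard class: `with temp_dir_with_mode() as d:` followed by writing the key file under `d`.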
It might be possible to ditch the temporary directory completely and instead just create a temporary file for the private key using tempfile.mkstemp() which uses a hardcoded mode 0o600, which seems to allow OpenSSH to connect in my limited testing.
Possible downsides that come to mind are:
- If we're putting anything else in that temporary directory before it closes, this approach might just not work.
- We're kind of leaving ourselves at the whim of how tempfile.mkstemp() (mode 0o600) behaves on Windows. If it gets modified to OWNER_RIGHTS in the future, we'll just be back where we are now.
- This approach might result in the key living on the client machine longer or shorter than our current TemporaryDirectory approach. We would have to look into whether that matters.
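The mkstemp idea could look roughly like the sketch below. It is an illustration only, not a proposed patch: the context manager name `temp_private_key_file` and the file prefix are assumptions, and whether Win32-OpenSSH accepts the resulting file on every Windows setup has not been verified beyond the limited testing mentioned above. Deleting the file when the context exits is one way to keep the key's lifetime close to what the TemporaryDirectory approach gives.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def temp_private_key_file(key_bytes):
    """Write key_bytes to a file created by tempfile.mkstemp and delete the
    file on exit. Hypothetical helper, not part of aws-cli."""
    # mkstemp returns an already-open file descriptor plus the file's path.
    fd, path = tempfile.mkstemp(prefix="private-key-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(key_bytes)
        yield path
    finally:
        # The caller may already have removed the file; ignore that case.
        with contextlib.suppress(FileNotFoundError):
            os.remove(path)
```

The yielded path is what would be handed to ssh via its -i option while the context is still open.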
Still having this issue with aws-cli/2.24.2 Python/3.12.6 Windows/11 exe/AMD64
I don't understand the workaround for cases where I'm not using my own long-lived keys. The whole reason I want to use EC2 Instance Connect is to take advantage of the ephemeral, automatically managed keys.
@infrahead yes, unfortunately this issue still persists. Waiting to hear back from Win32-OpenSSH on this open issue (https://github.com/PowerShell/Win32-OpenSSH/issues/2317) regarding a fix.
If you'd like to avoid using long-lived keys on the instance, you can take a look at using EC2 Instance Connect on the AWS Console, with instructions in Connect to your Linux instance using EC2 Instance Connect.
If you'd like to connect via the CLI, you can replicate what the ssh plugin does behind the scenes: use aws ec2-instance-connect send-ssh-public-key to push a public key onto your EC2 instance (where it will live for 60 seconds), then use the workaround of providing your own private key with aws ec2-instance-connect ssh --instance-id i-xxx --private-key-file mykey.pem. This does require you to maintain a long-lived private key on your personal machine or generate new keys every time you connect, but it avoids long-lived public keys on your EC2 instance and is currently the best workaround we have available.
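For the "generate new keys every time you connect" variant, the steps could be scripted along these lines. This is a sketch under assumptions, not a tested recipe: the helper names, the `eic-key` file name, and the choice of a 4096-bit RSA key are illustrative, `key_dir` is assumed to be a directory whose permissions OpenSSH already accepts, and depending on the CLI version `send-ssh-public-key` may also need `--availability-zone`. The code only builds the command lines; `run_workaround` executes them in order.

```python
import os
import subprocess


def build_workaround_commands(instance_id, os_user, key_dir, key_name="eic-key"):
    """Return the three command lines for the manual workaround:
    generate a key pair, push the public key, connect with the private key."""
    private_key = os.path.join(key_dir, key_name)
    public_key = private_key + ".pub"

    # ssh-keygen prompts before overwriting, so private_key should not exist yet.
    keygen = ["ssh-keygen", "-q", "-t", "rsa", "-b", "4096", "-N", "", "-f", private_key]

    send_key = [
        "aws", "ec2-instance-connect", "send-ssh-public-key",
        "--instance-id", instance_id,
        "--instance-os-user", os_user,
        "--ssh-public-key", "file://" + public_key,
    ]

    connect = [
        "aws", "ec2-instance-connect", "ssh",
        "--instance-id", instance_id,
        "--os-user", os_user,
        "--private-key-file", private_key,
    ]
    return [keygen, send_key, connect]


def run_workaround(instance_id, os_user, key_dir):
    """Run the commands in order, stopping at the first failure."""
    for cmd in build_workaround_commands(instance_id, os_user, key_dir):
        subprocess.run(cmd, check=True)
```

Deleting the generated key pair after the session ends would keep the key short-lived on the client side as well.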
@CharlesReinhardt thanks for the detailed follow up! With some more understanding of the pieces involved on my part, I already sort of arrived at your later recommendation.
I can confirm this is still an issue on Windows 11 with 2.27.36, but downgrading to 2.17.35.0 "resolves" the issue.