
Insecure advice in ssh/rsync GitHub Action post

Open brunchboy opened this issue 2 years ago • 8 comments

First off, thanks so much for this post, it saved me a bunch of time trying to figure out how to do this exact thing.

The specific post I am discussing is: https://github.com/zellwk/zellwk.com/blob/master/src/posts/2021-03-17-github-actions-deploy.md

The only issue I ran into with it was in how your instructions say to set up a bogus known_hosts entry, and then run a script to populate it based on whatever is responding at that domain name when the script runs. This completely defeats the purpose of the known_hosts security mechanism: it allows anyone who is masquerading as that host when the script runs to steal the connection.

What you actually want to do is search the known_hosts file on your local machine for the line that begins with the domain name of the host you are deploying to. Because you have already connected to that host successfully with ssh from your local machine, you know that the known_hosts entry is correct. There should be only one such line in the file, and you just copy that entire line into your Action. Here’s an example of what that looks like in the Action I just got working thanks to your help: https://github.com/Deep-Symmetry/bytefield-svg/blob/58e64fbb71c482abbbc85d15ee61024dfd9151ca/.github/workflows/guide.yml#L21
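For readers who want to script that lookup, here is a minimal sketch in POSIX shell. The function name `find_known_host` and the host `example.com` are made up for illustration. It assumes the entry is stored in plain text (not hashed via `HashKnownHosts`) and uses the default port; for hashed files, `ssh-keygen -F <hostname>` searches known_hosts instead.

```shell
#!/bin/sh
# find_known_host HOST [FILE]
# Hypothetical helper: print every line of a known_hosts file whose first
# field (a comma-separated list of host names/addresses) contains HOST.
# Assumes plain-text entries; hashed entries will not match.
find_known_host() {
  host=$1
  file=${2:-"$HOME/.ssh/known_hosts"}
  awk -v host="$host" '
    NF == 0 || $1 ~ /^#/ { next }   # skip blank lines and comments
    {
      n = split($1, names, ",")     # first field may list several names
      for (i = 1; i <= n; i++) {
        if (names[i] == host) { print; break }
      }
    }
  ' "$file"
}

# Example (not run here): find_known_host example.com
```

The line it prints is the value to paste into the Action's known_hosts setting or secret, after checking that it really belongs to a server you have connected to before.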

brunchboy avatar Feb 27 '23 00:02 brunchboy

@brunchboy Thanks for the message. But I'm not sure why you said I mentioned using a bogus known_hosts file?

In the article, I specifically mentioned a step to use the correct known_hosts value here: https://zellwk.com/blog/github-actions-deploy/#step-5%3A-adding-a-correct-known_hosts-value

zellwk avatar Mar 07 '23 07:03 zellwk

Hi, @zellwk! The concern I am raising is that you start out with a fake known_hosts value (which is not the problem) and then use ssh-keyscan to add one. It is this use of ssh-keyscan that is the security risk: it will accept the key of any host that is currently at that IP address. It is better to put the actual known_hosts value in at step 4, and not use ssh-keyscan. You can find the correct value manually by looking in your own known_hosts file as I described above.

brunchboy avatar Mar 08 '23 00:03 brunchboy

@brunchboy Alright I kinda get what you're talking about, but I'm not sure what you mean by "will accept the key of any host that is currently at that IP address"

I thought ssh-keyscan would require the IP address of the server we're trying to ssh into. If that's the case, I don't see a problem with it, especially since the server still needs to verify whether your key is correct.

zellwk avatar Mar 08 '23 03:03 zellwk

In working with cryptographic software it is always important to think about what the threat model is, and what the most difficult thing is for an attacker to subvert. The purpose of known_hosts is to prevent someone from tricking you into thinking you are talking to your own server, when it is actually a hostile server. That is why the first time you connect to a machine, ssh wants you to manually verify the host key. There’s nothing preventing someone reading these instructions from giving a domain name rather than an IP address to ssh-keyscan and then it will try to use that domain name. DNS is fairly easy to subvert, especially since so few people are using DNSSEC. You are correct that IP addresses are more secure than domain names, but IP squatting is still possible. ssh is more secure than either domain names or IP addresses if used correctly, but using ssh-keyscan in an automated manner like this weakens one of the protections offered by ssh.

In most cases, worrying about this is probably overkill. But it is easy to fix your instructions so they explain how to manually determine the host key of your own server and use it from the start with ssh-key-action, in the way that the author of that action intended. I am suggesting this would improve the security posture of your instructions, which would benefit readers by showing them best practices and avoid putting bad habits into their heads.

brunchboy avatar Mar 08 '23 04:03 brunchboy

@brunchboy I'm not disagreeing with you. I'm also not unwilling to change my post. It's my responsibility to understand exactly what the issue is when I write articles, which is why I'm asking more before I change anything.

Your message isn't showing me why writing a known_hosts variable with Github Actions is the problem.

You're saying the purpose is to prevent someone from tricking you into thinking you're talking to your own server. That's fair. But the known_hosts value comes from you, and the IP address also comes from you in this case. Where is the malicious trickster?

zellwk avatar Mar 15 '23 02:03 zellwk

The idea is to retain control of the known_hosts value. Using ssh-keyscan lets whatever host happens to be responding to that IP address when the script runs control the known_hosts value. IP squatting would allow the trickster to have their host respond, rather than the one you think should own that IP address. As I said, this is a difficult attack, but it is exactly the attack that known_hosts is designed to prevent. If you can retain the security of using an explicit value that you control for known_hosts, you can protect against IP squatting. If you don’t care about IP squatting, then you can use ssh-keyscan as you do, but I’d recommend being explicit about that tradeoff. And it’s not hard to find the right value to put in known_hosts without using ssh-keyscan. That’s all I am saying.
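As a rough sketch of what "an explicit value that you control" could look like in a deploy script: the function name `install_pinned_known_hosts`, the `KNOWN_HOSTS_ENTRY` variable, and `deploy@example.com` are all hypothetical, and this is not taken from the original post or from ssh-key-action. The point is only that the entry comes from a value you stored yourself, not from ssh-keyscan at run time.

```shell
#!/bin/sh
# install_pinned_known_hosts ENTRY [DIR]
# Hypothetical helper: write a known_hosts entry that you verified yourself
# into DIR/known_hosts, and refuse to continue if no entry was supplied.
install_pinned_known_hosts() {
  entry=$1
  dir=${2:-"$HOME/.ssh"}
  if [ -z "$entry" ]; then
    echo "refusing to continue: no pinned known_hosts entry provided" >&2
    return 1
  fi
  mkdir -p "$dir" || return 1
  chmod 700 "$dir" || return 1
  printf '%s\n' "$entry" > "$dir/known_hosts" || return 1
  chmod 600 "$dir/known_hosts"
}

# Example (not run here), assuming KNOWN_HOSTS_ENTRY holds the pinned line:
#   install_pinned_known_hosts "$KNOWN_HOSTS_ENTRY"
#   ssh -o StrictHostKeyChecking=yes deploy@example.com true
```

With `StrictHostKeyChecking=yes`, ssh refuses to connect when the server's key is missing from known_hosts or differs from the pinned entry, so a host squatting on the address fails the check instead of being trusted.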

brunchboy avatar Mar 15 '23 03:03 brunchboy

Cloud squatting happens when a company leases space and IP addresses on a public server, uses them, releases the space, and sends them back to the cloud vendors. The server space providers such as Amazon, Google, or Microsoft assign the same addresses to another company.

If you hold and retain control of the server, there's no way the IP value can be squatted — so there's really no problem using ssh-keyscan at all.

zellwk avatar Mar 18 '23 03:03 zellwk

We disagree, which is why I implemented my CI more securely. I will make one last effort and leave. If it is not an issue, why was known_hosts even implemented? Your situation may be sufficiently secure, but can you be certain everyone following your advice is in the same situation? Can you think of no way of manipulating the network to route an IP address to the wrong host? Consider corrupt network providers and state actors. This is why I suggested revising your example to follow ssh best practices.

Regardless, thank you again for the example; it helped me solve a real problem I faced.

Cheers,

-James

brunchboy avatar Mar 18 '23 06:03 brunchboy