zellwk.com
Insecure advice in ssh/rsync Github Action post
First off, thanks so much for this post, it saved me a bunch of time trying to figure out how to do this exact thing.
The specific post I am discussing is: https://github.com/zellwk/zellwk.com/blob/master/src/posts/2021-03-17-github-actions-deploy.md
The only issue I ran into was that your instructions say to set up a bogus known_hosts entry and then run a script to populate it based on whatever is responding at that domain name when the script runs. This completely defeats the purpose of the known_hosts security mechanism: it allows anyone who is masquerading as that host when the script runs to steal the connection.
What you actually want to do is search through the known_hosts file on your local machine for the line that begins with the domain name of the host you are trying to deploy to. Because you have already connected to that host successfully with ssh from your local machine, you know that the known_hosts signature is correct. There should be only one such line in the file, and you just copy that entire line into your Action. Here’s an example of what that looks like in the Action I just got working thanks to your help: https://github.com/Deep-Symmetry/bytefield-svg/blob/58e64fbb71c482abbbc85d15ee61024dfd9151ca/.github/workflows/guide.yml#L21
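As a rough illustration of that lookup, here is a minimal Python sketch. The function name `find_known_hosts_entries` and the sample file contents are hypothetical, and the key material is a placeholder rather than a real key. It assumes plain (unhashed) entries on the default port; hashed entries (lines starting with `|1|`) and `[host]:port` entries would need different handling. A host may also appear on more than one line if it offers several key types.

```python
def find_known_hosts_entries(known_hosts_text, host):
    """Return the known_hosts lines whose host field names `host`."""
    matches = []
    for line in known_hosts_text.splitlines():
        line = line.strip()
        # Ignore blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # This sketch skips marker lines (@cert-authority, @revoked)
        # and anything that is not "hosts keytype key".
        if fields[0].startswith("@") or len(fields) < 3:
            continue
        # The first field is a comma-separated list of names and addresses.
        if host in fields[0].split(","):
            matches.append(line)
    return matches


# Hypothetical file contents; the keys are placeholders, not real keys.
sample = """\
# my local known_hosts
example.com,203.0.113.10 ssh-ed25519 AAAAPLACEHOLDERKEYONE
other.example.org ssh-rsa AAAAPLACEHOLDERKEYTWO
"""

entries = find_known_hosts_entries(sample, "example.com")
for entry in entries:
    print(entry)
```

Each returned line is what would be pasted whole into the Action's known_hosts setting.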
@brunchboy Thanks for the message. But I'm not sure why you said I mentioned using a bogus known_hosts file?
In the article, I specifically mentioned a step to use the correct known_hosts value here: https://zellwk.com/blog/github-actions-deploy/#step-5%3A-adding-a-correct-known_hosts-value
Hi, @zellwk! The concern I am raising is that you start out with a fake known_hosts value (which is not the problem) and then use ssh-keyscan to add one. It is this use of ssh-keyscan which is the security risk: it will accept the key of any host that is currently at that IP address. It is better to put the actual known_hosts value in at step 4 and not use ssh-keyscan. You can find the correct value manually by looking in your own known_hosts file as I described above.
@brunchboy Alright I kinda get what you're talking about, but I'm not sure what you mean by "will accept the key of any host that is currently at that IP address".
I thought ssh-keyscan would require the IP address of the server we're trying to ssh into. If that's the case, I don't see a problem with it, especially since the server still needs to verify whether your key is correct.
In working with cryptographic software it is always important to think about what the threat model is, and what the most difficult thing is for an attacker to subvert. The purpose of known_hosts is to prevent someone from tricking you into thinking you are talking to your own server, when it is actually a hostile server. That is why the first time you connect to a machine, ssh wants you to manually verify the host key. There’s nothing preventing someone reading these instructions from giving a domain name rather than an IP address to ssh-keyscan, and then it will try to use that domain name. DNS is fairly easy to subvert, especially since so few people are using DNSSEC. You are correct that IP addresses are more secure than domain names, but IP squatting is still possible. ssh is more secure than either domain names or IP addresses if used correctly, but using ssh-keyscan in an automated manner like this weakens one of the protections offered by ssh.
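To make the manual verification step more concrete: what ssh shows on a first connection is a fingerprint of the host key, and recent OpenSSH versions print it as `SHA256:` followed by the unpadded base64 of a SHA-256 digest over the decoded key blob. The sketch below recomputes that form from a known_hosts-style line. The function name `sha256_fingerprint` is hypothetical, the key bytes are a placeholder rather than a real host key, and the line is assumed to have the plain `host keytype base64key` layout.

```python
import base64
import hashlib


def sha256_fingerprint(known_hosts_line):
    """Compute an OpenSSH-style SHA256 fingerprint for one known_hosts line."""
    fields = known_hosts_line.split()
    # The third field is the base64-encoded public key blob.
    key_blob = base64.b64decode(fields[2])
    digest = hashlib.sha256(key_blob).digest()
    # OpenSSH prints the digest as base64 with the trailing "=" removed.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


# Placeholder bytes stand in for a real key blob in this sketch.
placeholder_blob = base64.b64encode(b"not a real host key").decode("ascii")
line = f"example.com ssh-ed25519 {placeholder_blob}"

fingerprint = sha256_fingerprint(line)
print(fingerprint)
```

Comparing this value with a fingerprint obtained from the server through a trusted channel is one way to confirm that a known_hosts line is the right one before pinning it.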
Probably in most cases, worrying about this is overkill. But it is easy to fix your instructions to explain how people can manually determine the host key of their own server and use that in the first place with ssh-key-action, in the way that the author of that action intended. I am suggesting that doing so would improve the security posture of your instructions, benefit readers by showing them best practices, and avoid putting bad habits into their heads.
@brunchboy I'm not disagreeing with you. I'm also not unwilling to change my post. It's my responsibility to understand exactly what the issue is when I write articles, which is why I'm asking more before I change anything.
Your message isn't showing me why writing a known_hosts variable with Github Actions is the problem.
You're saying the purpose is to prevent someone from tricking you into thinking you're talking to your own server, and that's fair. But the known_hosts value comes from you. The IP address also comes from you in this case. Where is the malicious trickster?
The idea is to retain control of the known_hosts value. Using ssh-keyscan lets whatever host happens to be responding to that IP address when the script runs control the known_hosts value. IP squatting would allow the trickster to have their host respond, rather than the one you think should own that IP address. As I said, this is a difficult attack, but it is exactly the attack that known_hosts is designed to prevent. If you can retain the security of using an explicit value that you control for known_hosts, you can protect against IP squatting. If you don’t care about IP squatting, then you can use ssh-keyscan as you do, but I’d recommend being explicit about that tradeoff. And it’s not hard to find the right value to put in known_hosts without using ssh-keyscan. That’s all I am saying.
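The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration of the idea, not how ssh itself is implemented. The function name `key_matches_pinned` and all key strings are hypothetical placeholders, and both lines are assumed to use the `host keytype base64key` layout that known_hosts and ssh-keyscan output share.

```python
def key_matches_pinned(pinned_line, presented_line):
    """Return True only if the presented key type and key data equal the pinned ones."""
    pinned = pinned_line.split()
    presented = presented_line.split()
    if len(pinned) < 3 or len(presented) < 3:
        return False
    # Compare key type and key data; the host field may legitimately differ
    # (for example a domain name on one side and an IP address on the other).
    return pinned[1:3] == presented[1:3]


# Value stored in the repository secret ahead of time (placeholder key).
pinned = "example.com ssh-ed25519 AAAAPINNEDPLACEHOLDER"

# What two different hosts answering at the same address might present.
real_server = "203.0.113.10 ssh-ed25519 AAAAPINNEDPLACEHOLDER"
squatter = "203.0.113.10 ssh-ed25519 AAAAOTHERPLACEHOLDER"

print(key_matches_pinned(pinned, real_server))  # the genuine host
print(key_matches_pinned(pinned, squatter))     # a host squatting on the address
```

With a pinned value, the squatter's key fails the comparison and the connection is refused. When known_hosts is filled from ssh-keyscan at run time, the pinned value is whatever the responding host presented, so the comparison always succeeds.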
Cloud squatting happens when a company leases space and IP addresses on a public server, uses them, releases the space, and sends them back to the cloud vendors. The server space providers such as Amazon, Google, or Microsoft assign the same addresses to another company.
If you hold and retain control of the server, there's no way the IP value can be squatted, so there's really no problem using ssh-keyscan at all.
We disagree, which is why I implemented my CI more securely. I will make one last effort and leave. If it is not an issue, why was known_hosts even implemented? Your situation may be sufficiently secure, but can you be certain everyone following your advice is in the same situation? Can you think of no way of manipulating the network to route an IP address to the wrong host? Consider corrupt network providers and state actors. This is why I suggested revising your example to follow ssh best practices.
Regardless, thank you again for the example, it helped me solve a real problem I faced.
Cheers,
-James