pyinfra icon indicating copy to clipboard operation
pyinfra copied to clipboard

Known_hosts file loses content when adding new host

Open pikeas opened this issue 1 year ago • 5 comments

Describe the bug

Pyinfra deletes lines from ~/.ssh/known_hosts.

To Reproduce

https://github.com/pyinfra-dev/pyinfra/blob/3.x/pyinfra/connectors/sshuserclient/client.py#L43-L49:

with HOST_KEYS_LOCK:
    host_keys = client.get_host_keys()
    host_keys.add(hostname, key.get_name(), key)
    # The paramiko client saves host keys incorrectly whereas the host keys object does
    # this correctly, so use that with the client filename variable.
    # See: https://github.com/paramiko/paramiko/pull/1989
    host_keys.save(client._host_keys_filename)

This happens because of the way Paramiko parses the file - it throws away comment lines, newlines, and lines it doesn't understand.

https://github.com/paramiko/paramiko/blob/main/paramiko/hostkeys.py#L89-L97:

with open(filename, "r") as f:
    for lineno, line in enumerate(f, 1):
        line = line.strip()
        if (len(line) == 0) or (line[0] == "#"):
            continue
        try:
            entry = HostKeyEntry.from_line(line, lineno)
        except SSHException:
            continue

So, when Pyinfra calls hosts_keys.save(...), this doesn't just add a new entry, it actually overwrites the file with the contents as parsed by Paramiko. This is unexpected - I've been fighting disappearing contents in known_hosts for years and had no idea it was caused by PyInfra.

Expected behavior

Pyinfra should either never modify known_hosts, or only add new hosts.

I understand the intent here, to make running PyInfra as convenient as possible. But IMO this behavior is a foot-gun for infrastructure maintainers and should be changed. It's frustrating to lose formatting, painful to lose comments, and breaks general SSH usage when @cert-authority (not yet support by Paramiko) silently disappears.

pikeas avatar Sep 19 '24 17:09 pikeas

Hi,

I created a possible fix for this, making it so that pyinfra will only append to the known_hosts file. Do you have the possibility of testing it with a known_hosts file that is known to break? Alternatively providing an example that demonstrates the behavior. I would like to add a test for this to ensure we don't repeat the behavior, as I agree that this has a very negative impact for certain users.

evoldstad avatar Oct 01 '24 19:10 evoldstad

Here's a reproduction:

$ mv ~/.ssh/known_hosts ~/.ssh/know_hosts.bak
$ cd (mktemp -d)
$ rye init # or other project setup, such as pip
$ rye add pyinfra

$ cat <<EOF > ~/.ssh/known_hosts
# this is an important comment

# another comment after the newline

@cert-authority *.my-other-domain.com <key type> <key>
EOF

$ rye run pyinfra server.example.com exec -- echo "hello world"
--> Loading config...
--> Loading inventory...
--> Connecting to hosts...
    No host key for server.example.com found in known_hosts, accepting & adding to host keys file
    [server.example.com] Connected

<work happens>

--> Disconnecting from hosts...

$ cat ~/.ssh/known_hosts
server.example.com <key type> <key>

The known_hosts file is overwritten, which silently removes the comments, newlines, and cert-authority lines.

pikeas avatar Oct 02 '24 16:10 pikeas

https://github.com/pyinfra-dev/pyinfra/compare/3.x...evoldstad:pyinfra:known_hosts_fix#diff-e12fd39e569809c21098f098715232f682ab3fd99ed42a24b266c75f8c5552acR65-R76

In the fix under AskPolicy, the host addition should likely be moved inside the "y" confirmation.

pikeas avatar Oct 02 '24 19:10 pikeas

(Sorry for the late response, got more busy than expected)

Do you mean this part?

        if should_continue.lower() != "y":
            raise SSHException(
                "AskPolicy: No host key for {0} found in known_hosts".format(hostname),
            )

As it will raise an exception if the answer is not "y" I think it is correct (unless I'm missing something?)

(I'm currently finishing writing the test-case so that future commits won't accidentally reintroduce the bug, thanks for your feedback!)

evoldstad avatar Oct 08 '24 17:10 evoldstad

(Sorry for the late response, got more busy than expected)

No apology needed, your hard work as an OSS maintainer is appreciated!

As it will raise an exception if the answer is not "y" I think it is correct (unless I'm missing something?)

You're right, this appears to be correct. Thanks again!

pikeas avatar Oct 08 '24 19:10 pikeas

After some (dumb) mistakes with linting and commits, I finally have a working pull request submitted. It seems that there is an issue with the spell-check linter though, as it thinks a random part of the example ssh-key I generated is misspelled... Hopefully someone can make an exception or show how to work around this, because I'm not sure I can do much about it except generate a new random key and hope that one doesn't get flagged by the spell-checker?

evoldstad avatar Dec 02 '24 21:12 evoldstad

Issues with pull-request was finished yesterday and the fix is now waiting to be merged :) Request: https://github.com/pyinfra-dev/pyinfra/pull/1252

evoldstad avatar Dec 09 '24 11:12 evoldstad

@pikeas , the request was successfully merged and this issue should now be fixed :)

evoldstad avatar Dec 15 '24 15:12 evoldstad

Thanks for fixing this!

pikeas avatar Dec 16 '24 01:12 pikeas

Is this fix in the latest release? I just had my known_hosts file eaten alive. Thank goodness for backups

stratacast avatar Dec 31 '24 01:12 stratacast

Released in v3.2 (https://github.com/pyinfra-dev/pyinfra/releases/tag/v3.2).

Fizzadar avatar Apr 22 '25 14:04 Fizzadar