binenv

Add checksum validation of downloaded archives

Open ppetr opened this issue 2 years ago • 9 comments

While downloading from GitHub via HTTPS gives a reasonable level of security, I'd still prefer to have the binaries verified against their respective checksum files.

I propose adding a field with the URL of a checksum file, together with a checksum of that file itself. Example:

  foo_binary:
    fetch:
      url: https://github.com/...
    checksums:
      type: sha256  # Hash type used in checksums.txt as well as in `hash` below.
      url: https://github.com/foo/bar/releases/.../checksums.txt
      hash: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855

After downloading an archive for foo_binary, binenv would also download the checksums.txt file, verify its integrity, and then verify the integrity of the archive against the appropriate checksum in checksums.txt.
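
To make the flow concrete, here is a minimal Go sketch of the two-step check. It assumes checksums.txt uses the common `sha256sum` layout (`<hex digest>  <file name>` per line), which may not hold for every project. The function names are made up for illustration and are not part of binenv, and the download step is left out.

```go
package main

import (
	"bufio"
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// sha256Hex returns the lowercase hex SHA-256 digest of data.
func sha256Hex(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// lookupChecksum scans a sha256sum-style listing and returns the digest
// recorded for name. Lines that do not have exactly two fields are skipped,
// so file names containing spaces are not handled by this sketch.
func lookupChecksum(listing, name string) (string, error) {
	sc := bufio.NewScanner(strings.NewReader(listing))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) != 2 {
			continue
		}
		// sha256sum prefixes the file name with '*' in binary mode.
		if strings.TrimPrefix(fields[1], "*") == name {
			return strings.ToLower(fields[0]), nil
		}
	}
	if err := sc.Err(); err != nil {
		return "", err
	}
	return "", fmt.Errorf("no checksum listed for %q", name)
}

// verifyArchive (hypothetical helper) first checks the checksums file
// against the hash pinned in the distribution entry, then checks the
// archive against the digest listed in that file.
func verifyArchive(archive []byte, archiveName string, checksums []byte, pinnedHash string) error {
	if sha256Hex(checksums) != strings.ToLower(pinnedHash) {
		return errors.New("checksums file does not match the pinned hash")
	}
	want, err := lookupChecksum(string(checksums), archiveName)
	if err != nil {
		return err
	}
	if sha256Hex(archive) != want {
		return fmt.Errorf("archive %q does not match its listed checksum", archiveName)
	}
	return nil
}

func main() {
	// Stand-in data; a real run would use the downloaded bytes.
	archive := []byte("pretend archive bytes")
	listing := []byte(sha256Hex(archive) + "  foo_binary.tar.gz\n")

	err := verifyArchive(archive, "foo_binary.tar.gz", listing, sha256Hex(listing))
	fmt.Println("verified:", err == nil) // prints "verified: true"
}
```

The order matters: the archive is only compared against checksums.txt after checksums.txt itself has matched the pinned `hash`, so a tampered checksums file is rejected before it is trusted.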

An alternative would be to include all the hashes in distributions.yaml itself, but I think it'd be way too verbose.

I'm happy to contribute a PR once an agreement is reached on the details.

ppetr • Jul 12 '22 21:07

Hello @ppetr

That would be great.

However, I've left this aside for now because I fear there will be a lot of nitty-gritty details (checksums for compressed binaries, checksums inside tarballs, non-standard checksum file formats, ...).

If you want to tackle this, please go ahead. The above proposal seems fine to me. Keep us informed!

leucos • Jul 13 '22 07:07

Great! Let's keep it simple and just verify the checksums of tarballs, as they're already provided. Later we can think about expanding it, if needed.

I'll then start working on a prototype and I'll keep you updated 🙂.

ppetr • Jul 13 '22 18:07

After some experimenting, I came up with the following ideas, which can be implemented relatively independently.

Provide a checksum of a checksum file for each released version. This is the simplest solution and probably easiest to work with for authors/maintainers, but it's a bit more verbose in the distribution file.

fzf:
  description: fzf is a general-purpose command-line fuzzy finder.
  url: https://github.com/junegunn/fzf/
  list:
    type: github-releases
    url: https://api.github.com/repos/junegunn/fzf/releases
  fetch:
    url: https://github.com/junegunn/fzf/releases/download/{{ .Version }}/fzf-{{ .Version }}-{{ .OS }}_{{ .Arch }}.tar.gz
  integrity:
    url: https://github.com/junegunn/fzf/releases/download/{{ .Version }}/fzf_{{ .Version }}_checksums.txt
    checksums:
      - url: https://github.com/junegunn/fzf/releases/download/0.30.0/fzf_0.30.0_checksums.txt
        type: sha256
        checksum: 43cc37783e0bf4ed775109379b3e2073ea2bb29c9e4811d07907c868435e1b7e
      # Other versions follow.
  install:
    type: tgz
    binaries:
      - fzf
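
As a rough illustration of how such an entry could be consumed, the Go sketch below expands the `integrity.url` template for a version and looks for a matching pinned entry. It assumes the placeholders are Go `text/template` syntax, as they appear in the `fetch.url` above. The type and function names are hypothetical and not taken from binenv.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// releaseVars holds the placeholders used in the URL templates.
type releaseVars struct {
	Version, OS, Arch string
}

// pinnedChecksum mirrors one entry of the proposed `checksums` list.
type pinnedChecksum struct {
	URL, Type, Checksum string
}

// renderURL expands a text/template URL with the given variables.
func renderURL(tmpl string, vars releaseVars) (string, error) {
	t, err := template.New("url").Parse(tmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, vars); err != nil {
		return "", err
	}
	return sb.String(), nil
}

// findPinned returns the pinned entry whose URL equals the rendered URL.
// The boolean is false when the version has no pinned checksum.
func findPinned(pins []pinnedChecksum, url string) (pinnedChecksum, bool) {
	for _, p := range pins {
		if p.URL == url {
			return p, true
		}
	}
	return pinnedChecksum{}, false
}

func main() {
	const integrityURL = "https://github.com/junegunn/fzf/releases/download/{{ .Version }}/fzf_{{ .Version }}_checksums.txt"
	pins := []pinnedChecksum{{
		URL:      "https://github.com/junegunn/fzf/releases/download/0.30.0/fzf_0.30.0_checksums.txt",
		Type:     "sha256",
		Checksum: "43cc37783e0bf4ed775109379b3e2073ea2bb29c9e4811d07907c868435e1b7e",
	}}

	url, err := renderURL(integrityURL, releaseVars{Version: "0.30.0", OS: "linux", Arch: "amd64"})
	if err != nil {
		panic(err)
	}
	pin, ok := findPinned(pins, url)
	fmt.Println(url)
	fmt.Println("pinned:", ok, pin.Type)
}
```

What to do when no pinned entry exists for a version (refuse to install, or install with a warning) is left open here.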

Use OpenPGP to sign checksum files. This is often done by authors who already use PGP.

A random example: https://github.com/orgrim/pg_back/releases/tag/v2.1.0. The release includes the checksums.txt and its signature checksums.txt.asc. Then it'd be enough to provide the public key(s) of the author(s) once and use it to verify any of their releases:

  integrity:
    url: https://github.com/junegunn/fzf/releases/download/{{ .Version }}/fzf_{{ .Version }}_checksums.txt.sig
    public_key:
      # This would require https://gopenpgp.org/, probably a more heavy-weight library.
      openpgp:
        - |
          -----BEGIN PGP PUBLIC KEY BLOCK-----
          ...

ppetr • Jul 17 '22 10:07

So my questions are:

  1. Are these options (or one of them) reasonable to implement?

  2. For OpenPGP, is it acceptable to add this non-trivial dependency? We could also resort to calling gpg externally, but that feels a bit against the spirit of binenv, which is otherwise very self-contained.

    Another interesting alternative could be saltpack, but its drawback is that it's very new, so its adoption rate would probably be much smaller.

ppetr • Jul 17 '22 12:07

Well, I think I did not understand your initial proposal.

I thought you wanted to grab the checksums (when they exist) from the released artifacts and compare them to what binenv install has downloaded.

But it seems you'd like to add checksums for all versions in distributions.yaml.

I do not think we should be the custodians of fingerprints; this is too much responsibility (and also quite a chore to maintain; try running make e2e in the repo and you'll feel the pain).

So I am not convinced we're heading the right way here.

leucos • Jul 19 '22 01:07

I see your point.

From a reliability perspective, checking against the checksum files in releases might help a bit, but I guess modern HTTPS is already very good at ensuring reliable transmission. And since the files are always compressed, their internal integrity is verified by mechanisms such as the CRCs built into the compression formats.

My concern is rather security. Imagine, say, that the GitHub account behind a very popular binary becomes compromised. An attacker can replace or create a release with tampered binaries as well as matching hashes. Thousands of computers would then become infected with malware.

I agree that maintaining checksums of individual files is just not sustainable.

Then how about authors' PGP public keys and/or fingerprints? This means adding just one string, once, for every eligible project, and it won't change over time (or only extremely rarely). This information could even be scraped automatically, for example from project README.md files (if present), with the important property that automation would never change a value once it is present. Then:

  • If the author signed releases with the corresponding PGP key, binenv could verify them automatically for all releases without further intervention.
  • If an attacker compromises a GitHub account, they won't be able to sign new/changed releases. And even if they change the PGP fingerprint in the README.md file, binenv automation won't accept the change without manual inspection.

That way, a reasonable level of security can be reached, hopefully with little intrusion.
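
The "record once, never overwrite automatically" rule could look roughly like the Go sketch below. It is only an illustration under assumptions: the `pinStore` type, the `Observe` method and the in-memory map are hypothetical, and the fingerprints in `main` are placeholders, not real keys.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// ErrNeedsReview signals that a scraped fingerprint differs from the
// pinned one, so a human has to decide what to do.
var ErrNeedsReview = errors.New("fingerprint differs from pinned value; manual review required")

// pinStore maps a project name to its pinned key fingerprint.
type pinStore map[string]string

// normalize strips spaces and upper-cases a fingerprint so that
// "ab cd" and "ABCD" compare equal.
func normalize(fp string) string {
	return strings.ToUpper(strings.ReplaceAll(fp, " ", ""))
}

// Observe records fp the first time a project is seen. Afterwards it
// never overwrites the pinned value: a different fingerprint yields
// ErrNeedsReview and leaves the store unchanged.
func (s pinStore) Observe(project, fp string) error {
	fp = normalize(fp)
	pinned, ok := s[project]
	if !ok {
		s[project] = fp
		return nil
	}
	if pinned != fp {
		return fmt.Errorf("%s: %w", project, ErrNeedsReview)
	}
	return nil
}

func main() {
	store := pinStore{}
	// Placeholder fingerprints for illustration only.
	fmt.Println(store.Observe("example-project", "AAAA 1111")) // first sight: pinned
	fmt.Println(store.Observe("example-project", "aaaa1111"))  // same key: accepted
	fmt.Println(store.Observe("example-project", "BBBB2222"))  // changed: needs review
}
```

With this rule, an attacker who edits the README.md of a compromised account only triggers a review request; the pinned fingerprint used for verification stays as it was.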

ppetr • Jul 19 '22 20:07

Interesting. Do you have examples of such signed releases?

leucos • Jul 22 '22 07:07

Let me give a couple of examples:

  • https://github.com/wireapp/wire-desktop/releases/tag/linux%2F3.28.2946
  • https://github.com/syncthing/syncthing/releases/tag/v1.20.4-rc.1
  • https://github.com/dundee/gdu/releases/tag/v5.14.0

The checksums are either given as a single file (usually .txt.asc) which contains both the original checksums as well as the signature, or as a pair of files (usually .txt and .txt.asc), where the latter contains a detached signature of the former. More details can be found here: https://www.gnupg.org/gph/en/manual/x135.html
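
Telling the two forms apart can be done from the first armor header line. The Go sketch below assumes ASCII-armored files: a clearsigned file starts with `-----BEGIN PGP SIGNED MESSAGE-----`, while a detached armored signature starts with `-----BEGIN PGP SIGNATURE-----`. The names are hypothetical, and binary (non-armored) signatures are simply reported as unknown.

```go
package main

import (
	"fmt"
	"strings"
)

// sigKind describes how a downloaded checksum-related file is signed.
type sigKind string

const (
	kindClearsigned sigKind = "clearsigned" // checksums and signature in one file
	kindDetached    sigKind = "detached"    // signature only; checksums live elsewhere
	kindUnknown     sigKind = "unknown"     // plain text, binary signature, or other
)

// classify inspects the first non-blank line of content and maps the
// armor header found there to a sigKind.
func classify(content string) sigKind {
	for _, line := range strings.Split(content, "\n") {
		switch strings.TrimSpace(line) {
		case "":
			continue // skip leading blank lines
		case "-----BEGIN PGP SIGNED MESSAGE-----":
			return kindClearsigned
		case "-----BEGIN PGP SIGNATURE-----":
			return kindDetached
		default:
			return kindUnknown
		}
	}
	return kindUnknown
}

func main() {
	fmt.Println(classify("-----BEGIN PGP SIGNED MESSAGE-----\nHash: SHA256\n\n...")) // prints "clearsigned"
	fmt.Println(classify("-----BEGIN PGP SIGNATURE-----\n\n..."))                    // prints "detached"
}
```

This only decides which files need to be fetched; it does not verify anything by itself.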

I also found out that the fingerprint of the signer's key can be extracted from a signature (https://security.stackexchange.com/q/62916/12485). This means it would be possible to collect these fingerprints from all projects that publish such a signature, without requiring the authors to publish the key somewhere, which makes the process even more seamless.

ppetr • Jul 24 '22 12:07