Public Key Infrastructure for Rust Project

Open walterhpearce opened this issue 1 year ago • 10 comments

This RFC is the first of a series for a PKI model for Rust. It includes the design and implementation for a PKI CA and a resilient Quorum model for the project to implement, and next steps for signing across the project. Crate and release signing will follow in a subsequent separate RFC.

Rendered

walterhpearce avatar Feb 27 '24 21:02 walterhpearce

(Preface: I apologize for the top-level comment; if this isn't the right venue for this, just let me know and I'll drop it!)

I think the overall design here is sensible, for an independent PKI.

However, I want to issue a general caution about self-deployed PKIs and operational burdens: both PKI standup and long-term maintenance are operationally complex, and fit into the middle of the iron triangle of "hard to do right", "manually intensive", and "not performed often enough to establish muscle memory." The operational overhead of recovery and rotation in PKIs also presents a "normalization of deviance" risk, in the sense that operations that are challenging to perform (such as rotating out weak keys or non-conforming certificates) get deferred due to stability/breakage risks for downstream users. This latter risk is heightened by technical constraints in the X.509 PKI ecosystem (e.g. poor real-world support for policy mapping, name constraints, mixed key/signature types in chains).

If I read the RFC correctly, it sounds like there are really three underlying motivations for a new PKI:

  1. Signing "official" Rust artifacts, such as releases of Rust itself, including eventually for the purpose of trust by OS codesigning mechanisms;
  2. Signing crate distributions (not end-user signing, but signing by the index itself);
  3. Infrastructure operations, such as internal services or unforeseen additional delegations.

If this understanding is correct, I'd like to suggest the following alternatives:

  1. To sign Rust releases, consider artifact transparency with either a fixed identity via Sigstore or a single KMS-managed key, which could then be subject to key transparency. My bias is towards encouraging Sigstore since it sidesteps the need to do any key management whatsoever, but you could also do key distribution through a .well-known URI on a dedicated host with transit security through a Web PKI CA issued certificate.

  2. As a potentially controversial opinion: I think index-side package signing has relatively little security value in the presence of strong transport security (i.e. TLS) and cryptographic digests, unless there's a firm security boundary between the signing/uploading component and the rest of the codebase. This might be the case, but it can't be taken for granted -- without that boundary, it's difficult to quantify the damage an attacker with some access to crates' infrastructure can do.

    (End-user signing is incredibly valuable, and much harder to get right (cf. PyPI's challenges getting users to sign with reasonably secure PGP keys, much less sign at all).)

    Rather than prioritize index signing, I'd like to suggest that Rust consider an artifact transparency scheme similar to Go's: Go's checksum database ("sumdb") has a proven track record in the Go ecosystem, and provides similar properties (notably, committing the index to an immutable mapping between package names and their contents).

  3. This is a big space, so I don't want to make assumptions about the total constellation of things that the Rust project may end up needing key materials or certificates for 🙂. However, in the most general case: services can and should (continue to) use the Web PKI for TLS certificates, including for internal services.

Again, I apologize if this is a premature response or an inappropriate venue for a "general" opinion like this one. I'm happy to discuss synchronously as well, based on experiences implementing X.509 and working on the same problem space within PyPI and other ecosystems.

woodruffw avatar Mar 07 '24 20:03 woodruffw

I think CT winds up much less fragile than CAs do. If git used SHA-256, you would gain pretty powerful forensic tools from repository history plus git hashes in the package.

I've noted above that existing web CT infrastructure could provide the ultimate CT roots, although rust would still require its own infrastructure that maintains the extension merkle trees.

burdges avatar Mar 07 '24 23:03 burdges

I think index-side package signing has relatively little security value in the presence of strong transport security (i.e. TLS) and cryptographic digests

The reason we want package signing is to allow mirrors of crates.io where there is bad or no internet connectivity with crates.io like China or airgapped systems. In those cases it is impossible to use TLS for security.

bjorn3 avatar Mar 08 '24 07:03 bjorn3

The reason we want package signing is to allow mirrors of crates.io where there is bad or no internet connectivity with crates.io like China or airgapped systems. In those cases it is impossible to use TLS for security.

In addition to network-restricted scenarios, package signing could also enable a decentralized crate distribution system in the future, thereby reducing the bandwidth costs borne by a single system.

wangkirin avatar Mar 08 '24 09:03 wangkirin

I think index-side package signing has relatively little security value in the presence of strong transport security (i.e. TLS) and cryptographic digests

The reason we want package signing is to allow mirrors of crates.io where there is bad or no internet connectivity with crates.io like China or airgapped systems. In those cases it is impossible to use TLS for security.

These are reasonable scenarios, but to point out: in an airgapped scenario, the verifier necessarily lacks access to the revocation parts of the PKI. In a scenario with a country that interferes with secure transport, you have limited ability to prevent the country/network operator from stripping away signatures during transport, and limited ability to securely distribute the root of trust in the first place. This compounds in complexity when mirrors need to host their own distributions, which may not be able to receive upstream index signatures 🙂

None of these are insurmountable problems, but they're ones that we've similarly thought about in terms of PKI/signing designs for PyPI as changing the cost-benefit envelope for index signatures.

Finally, I'll note that artifact transparency in the form of a sumdb is suitable for both of these cases, and has similar security properties: a sumdb can be mirrored or placed in an airgapped environment, with the added benefit (over signatures absent reliable revocation) of being auditable by all parties that depend on it.
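As a rough illustration of the "immutable mapping" property described above, the sketch below builds a tiny append-only checksum log and computes a Merkle root over it. This is not Go's sumdb implementation or wire format: the `ChecksumLog` class, the record layout, and the crate names are assumptions invented for this example. Only the hashing convention (a 0x00 prefix for leaves, 0x01 for interior nodes, split at the largest power of two below the leaf count) follows RFC 6962-style Merkle trees.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    # Domain-separate leaves from interior nodes (RFC 6962 style).
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Merkle tree hash over raw leaf records."""
    if not leaves:
        return hashlib.sha256(b"").digest()
    if len(leaves) == 1:
        return leaf_hash(leaves[0])
    # Split at the largest power of two strictly smaller than len(leaves).
    k = 1
    while k * 2 < len(leaves):
        k *= 2
    return node_hash(merkle_root(leaves[:k]), merkle_root(leaves[k:]))


class ChecksumLog:
    """Hypothetical append-only log mapping (name, version) to a content digest."""

    def __init__(self) -> None:
        self._records: list[bytes] = []
        self._seen: dict[tuple[str, str], str] = {}

    def append(self, name: str, version: str, content: bytes) -> None:
        digest = hashlib.sha256(content).hexdigest()
        key = (name, version)
        if key in self._seen:
            # The same name and version may never map to different contents.
            if self._seen[key] != digest:
                raise ValueError(f"conflicting contents for {name} {version}")
            return  # identical re-submission is a no-op
        self._seen[key] = digest
        # Assumed record layout, chosen only for this sketch.
        self._records.append(f"{name} {version} sha256:{digest}\n".encode())

    def root(self) -> bytes:
        return merkle_root(self._records)
```

A mirror or airgapped copy can carry the records alongside the root. Anyone holding the same records can recompute the root and compare it with one obtained from another party, which is what makes the log auditable without an online revocation service.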

woodruffw avatar Mar 08 '24 14:03 woodruffw

I'm sorry to say that I think this is a bad idea.

Background

Perhaps I should introduce myself. Most relevantly, I was Chief Cryptographer at the HSM manufacturer nCipher; I've a PhD in Computer Security from Cambridge University; and I'm a longstanding Member (and former Leader) of the Debian Project.

Debian's experience is very relevant here. Debian has for decades operated one of the most successful and resilient code signing systems in existence - the apt repository and package signing system. (To be clear: I did not design or implement that system, although I did have a hand in some of Debian's internal package signing processes.)

Lack of a solution-neutral problem statement

The RFC is quite vague about why. It seems to take for granted that we should do some more signing of things, to improve software supply chain integrity presumably, and therefore we should have a "PKI".

The conclusion does not follow from the premise. Code and data signing can be done very successfully without an X.509 PKI. We should start again from scratch with clear objectives.

Single trust root is wrong

Traditional "PKI" as proposed here tends to have a single root of trust. As is often the case, that's inappropriate for the Rust Project.

There is no necessary linkage between (for example) code signing keys used to verify downloads by Rustup of new versions of Rust (and self-updates), on the one hand, and (for example) the crates.io index, on the other.

Instead of an X.509-style PKI, with a single cryptographic trust root, the necessary public keys for each role should be embedded in the appropriate software. So, for example, Rustup should contain the public keys with which it verifies the things it downloads. Cargo should contain the crates.io repository public keys.

Linking these use cases (and others) into a single hierarchical structure is a mistake. (Sharing tooling and workflows is likely to be helpful.)

Revocation should be done by key lifetimes, rollover and update

In practice, revocation turns out to be quite rare. Debian has had to do an emergency archive signing key rollover about once.

Revocation checks should be done without online revocation services (eg, OCSP). Instead, if it is necessary to revoke keys, this can be done by promulgating updated relying software: after all, the relying software contains the public keys.

Avoid X.509

The X.509 protocols are, frankly, terrible. Many of the libraries for dealing with them are poor too. X.509 has been a constant source of misbehaviours in critical security systems. It is often difficult to persuade X.509 software to apply the correct criteria; most X.509 software is designed to serve the (very poor) web PKI.

Debian and git have had great success with OpenPGP-based signing systems. Rust should choose OpenPGP. Note that we should not be using the OpenPGP "web of trust" trust model; OpenPGP implementations do support simpler trust models.

The once-leading OpenPGP implementation, GnuPG, is rather troublesome, but in the Rust world we could use Sequoia.

Don't use threshold schemes, do thresholding at the protocol level

Threshold cryptography is complicated and very limiting. For example, what will we do when we want to move to a postquantum signature scheme? We want a free choice of algorithms.

Doing thresholding at the cryptographic algorithm layer is great for getting papers into crypto conferences (lots of really advanced mathematics, yay) but is only good engineering if you don't control the relying software, so you need to publish just one public key to a naive relying party. That's not our situation: we control and distribute the relying software.

So there is no need for threshold cryptography here. Instead, to implement a k/n control, we can just publish (embed with the relying software distribution) n keys along with an instruction that k signatures are required. That keeps our mathematics as boring as possible (and affords a wider choice of algorithms and HSMs).
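A minimal sketch of that k-of-n check at the protocol level, under stated assumptions: the function name `threshold_ok` and the `(public_key, signature)` pair layout are invented here, and the real signature scheme is deliberately left as a pluggable `verify` callback so the algorithm choice stays free. The `toy_sign`/`toy_verify` pair exists only so the sketch runs without a crypto library; it is not a signature scheme and must not be used as one.

```python
import hashlib
from typing import Callable, Iterable

# (public_key, message, signature) -> True if the signature is valid.
Verify = Callable[[bytes, bytes, bytes], bool]


def threshold_ok(
    message: bytes,
    signatures: Iterable[tuple[bytes, bytes]],
    trusted_keys: frozenset[bytes],
    k: int,
    verify: Verify,
) -> bool:
    """Accept `message` only if at least k distinct trusted keys signed it."""
    if not 1 <= k <= len(trusted_keys):
        raise ValueError("k must be between 1 and the number of trusted keys")
    valid_signers: set[bytes] = set()
    for public_key, signature in signatures:
        if public_key not in trusted_keys:
            continue  # signatures from unknown keys never count
        if public_key in valid_signers:
            continue  # one key cannot be counted twice
        if verify(public_key, message, signature):
            valid_signers.add(public_key)
    return len(valid_signers) >= k


# Toy stand-in, NOT real cryptography: the "public key" is also the signing
# secret, so anyone could forge it. Swap in a real verifier in practice.
def toy_sign(key: bytes, message: bytes) -> bytes:
    return hashlib.sha256(key + message).digest()


def toy_verify(public_key: bytes, message: bytes, signature: bytes) -> bool:
    return toy_sign(public_key, message) == signature
```

Because the threshold lives in the relying software rather than in the mathematics, moving to a different algorithm would only mean shipping new keys and a different `verify`, with the counting logic unchanged.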

Real improvements

ISTM that there are indeed real improvements that could be made by making more use of digital signatures:

Right now crates.io relies on the web X.509 PKI for index signing. The web X.509 PKI is very poor. crates.io should be digitally signing the index. That would improve transparency, traceability, and mirrorability.

I don't know precisely how rustup verifies new Rust releases, but IMO that ought also to be done with code signing.

So there is real scope for improvement here, but this RFC is the wrong approach.

ijackson avatar Mar 21 '24 18:03 ijackson

I agree with a lot of the above comment: if you don't need a single root of trust (or threshold signing at the protocol level), your design is better off without it.

I don't, however, agree with the OpenPGP recommendation 🙂 -- OpenPGP implementations share much of the same spotty security history as X.509 implementations do, with weaker algorithmic controls and design decisions that aren't consistent with modern cryptographic best practices (e.g. compression-before-operation, MDC). These don't all necessarily apply to digital signatures, but they are indicative of the compromises OpenPGP makes in practice for legacy support (which, for a greenfield design, should not be a significant priority for Rust).

(Some of this is ameliorated by better implementations, like Sequoia. But some of it is baked into OpenPGP-the-standard or, worse, impossible to change because GnuPG is the 800lb gorilla in the room. The recent fracas with LibrePGP is broadly indicative of the PGP ecosystem's practical inability to modernize and discharge unsafe defaults.)

Some supporting resources from well-regarded sources:

(Note the general age on these: the industry/security consensus against PGP in greenfield is well established. That consensus is also why git and other applications have (slowly) moved towards SSH based signing, minisign, age, etc. as needed.)

TL;DR: I agree that this RFC could use a better problem statement, and that an improved problem statement may reveal an architecture better than a PKI here (certainly for operational reasons that I mentioned in an earlier comment). But IMO it would be a mistake to build any subsequent architecture on any variant of PGP.

woodruffw avatar Mar 21 '24 19:03 woodruffw

Revocation should be done by key lifetimes, rollover and update

For crates.io, requiring a cargo update after key rotation is not an option. To be able to bootstrap rustc, cargo versions that are many years old need to keep the ability to connect to crates.io. Users may also want to keep using an older rustc (and thus cargo) version for other reasons, including but not limited to bisecting regressions or using a distro-provided rustc (which is almost always very outdated). This requires either a way to override the key using a config option (ideally only necessary in case of a compromised key) or signing the new key with the old key.
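The two options could be combined roughly as follows. This is only a sketch under assumptions: `resolve_current_key`, the rotation-record layout, and the `override` parameter are hypothetical and do not describe any existing cargo option, and `toy_sign`/`toy_verify` are a non-cryptographic stand-in so the example runs without a signing library.

```python
import hashlib
from typing import Callable, Optional

# (public_key, message, signature) -> True if the signature is valid.
Verify = Callable[[bytes, bytes, bytes], bool]


def resolve_current_key(
    embedded_key: bytes,
    rotations: list[tuple[bytes, bytes]],
    verify: Verify,
    override: Optional[bytes] = None,
) -> bytes:
    """Return the key an old client should trust today.

    `rotations` is an ordered list of (new_key, signature) records, where each
    signature is made by the previous key over the bytes of the new key.
    """
    if override is not None:
        # Escape hatch for a compromised key: the user pins a key explicitly.
        return override
    current = embedded_key
    for new_key, signature in rotations:
        if not verify(current, new_key, signature):
            raise ValueError("rotation record is not signed by the previous key")
        current = new_key
    return current


# Toy stand-in, NOT real cryptography: the "public key" is also the signing
# secret. It only lets the sketch run without a crypto library.
def toy_sign(key: bytes, message: bytes) -> bytes:
    return hashlib.sha256(key + message).digest()


def toy_verify(public_key: bytes, message: bytes, signature: bytes) -> bool:
    return toy_sign(public_key, message) == signature
```

Walking the chain lets a years-old client reach the current key without being updated. The chain offers no protection once an old key is compromised, since the attacker can sign a rotation of their own, which is why the override would still be needed for that case.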

The once-leading OpenPGP implementation, GnuPG, is rather troublesome, but in the Rust world we could use Sequoia.

Sequoia stopped verifying Rust releases on 2023-01-02 (yes, they checked the system time), causing everyone to get a warning that the signature verification failed. Luckily the verification logic was still considered experimental, so it didn't actually break anyone. See https://github.com/rust-lang/rustup/pull/3186. The entire verification logic was removed in https://github.com/rust-lang/rustup/pull/3277. This RFC is supposed to create the foundations for a replacement of the check.

bjorn3 avatar Mar 21 '24 22:03 bjorn3