
Security model / TUF

Open tarcieri opened this issue 9 years ago • 91 comments

The fun thing about packaging systems with central package directories is that the central package directories have this annoying tendency to be compromised. There have been a few such notable compromises in recent history, such as RubyGems and npm. Fortunately, no serious problems resulted from either of these attacks, as both were detected early, but a more sinister attack could go undetected, poisoning the package repository and spreading malware.

One way to stop this is to move the source of authority for the integrity of packages from the package repository to the developers of packages. However, managing keys is hard, and many people simply won't want to do this. Furthermore, you have to worry about how to retrofit the existing packages into this model if your packaging system didn't launch with developer-managed keys from day one (which Cargo didn't).

There's a system that solves all these problems called The Update Framework (TUF), collaboratively developed by both Tor developers and academics:

http://theupdateframework.com/

The Update Framework allows developers to opt in to managing their own keys. High-profile packages can be signed by developers: specifically, TUF supports "threshold signatures", so k of n developers must countersign a package for it to count as released. However, not everyone is forced to manage their own keys: people who don't want to can have their packages signed by the package repository instead.

TUF secures developer keys by having developers who own "unclaimed" packages request to associate some signing keys with them. A system administrator then periodically (once a week or other tolerable time interval) signs these developer keys with an offline key (or keys, TUF uses threshold signatures everywhere). At this point, these packages move from "unclaimed" to "claimed", and become what TUF calls a "delegated target": the developers, not the packaging system, become the source of truth for that particular package.
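The threshold-signature idea described above can be sketched in a few lines. This is an illustrative simplification, not the real TUF implementation: role layout, key ids, and the stubbed-out signature verification are all hypothetical.

```python
# Sketch of TUF-style threshold checking: a role's metadata is trusted
# only if at least `threshold` of its authorized keys have produced a
# valid signature over it. Names and structures here are illustrative.

def threshold_met(role, signatures):
    """role: {"keyids": [...], "threshold": k}
    signatures: iterable of (keyid, valid) pairs, where `valid` is the
    result of cryptographic verification (stubbed out in this sketch)."""
    valid_keyids = {keyid for keyid, valid in signatures
                    if valid and keyid in role["keyids"]}
    return len(valid_keyids) >= role["threshold"]

# Example: a "claimed" package delegated to three developer keys,
# requiring any 2 of the 3 to countersign a release.
role = {"keyids": ["alice", "bob", "carol"], "threshold": 2}
sigs = [("alice", True), ("bob", True), ("mallory", True)]
print(threshold_met(role, sigs))  # → True: mallory is ignored, 2 of 3 signed
```

Note that a signature from a key outside the role's `keyids` (mallory above) contributes nothing toward the threshold, which is what lets a delegation bound exactly who may release a package.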

For more information, I suggest you read their paper "Survivable Key Compromise In Software Update Systems":

http://freehaven.net/~arma/tuf-ccs2010.pdf

I think a system like TUF can easily be retrofitted to Cargo as it exists today. There are a few changes I might recommend before you try to add TUF, but I think you're off to a good start.

However, if you did want to use something like TUF, it does figure into the overall AuthZ model of the system. There are a number of outstanding AuthZ issues / suggestions like #48 and #58. If you do want to integrate a system like TUF where developers manage their own keys, it will definitely influence whatever AuthZ model you adopt, because TUF moves things like authorization and integrity partly to the client.

I worked on adding TUF to RubyGems at one point and liked it, although we never finished. The people behind it worked on adding it to PyPI, and were very helpful with our efforts to add it to RubyGems.

tarcieri avatar Nov 24 '14 01:11 tarcieri

Hi Rust community,

I am one of the developers working on The Update Framework (TUF). If any Cargo developers would like to discuss the TUF project, we can be reached at [email protected]. We would be happy to answer any questions and start a dialogue on our mailing list, here, or over video/voice chat.

In addition to the documentation provided by Tony Arcieri (@tarcieri), you may find our integration proposal for the Python Package Index (PyPI) interesting. The Surviving a Compromise of PyPI proposal for PyPI provides information on how the framework can be integrated with a community repository, and also includes an overview of the framework. The proposal (broken into two parts) can be found here:

PEP 458: http://legacy.python.org/dev/peps/pep-0458/
PEP 480: http://legacy.python.org/dev/peps/pep-0480/

vladimir-v-diaz avatar Dec 04 '14 23:12 vladimir-v-diaz

Thanks for the offer @vladimir-v-diaz! I'll be sure to reach out to you if we need help :)

For now I'm going to be focusing a large amount of effort towards Rust's standard library, so this may be postponed for a while, but I'd love to see something like this implemented!

alexcrichton avatar Dec 05 '14 09:12 alexcrichton

Some proposals in rust-lang/cargo#1281.

Is there any news for this issue? It would be good to have a trusted crates.io-index for Rust 1.0.

l0kod avatar Feb 08 '15 12:02 l0kod

I've had this tab open for months. If someone else spearheads I can work on it but no time to champion.

On Sunday, February 8, 2015, Mickaël Salaün [email protected] wrote:

Some proposals in rust-lang/cargo#1281 https://github.com/rust-lang/cargo/issues/1281.

Is there any update for this issue? It would be good to have a trusted crates.io-index for Rust 1.0.


richo avatar Feb 09 '15 09:02 richo

@l0kod I have done no work on this but would be interested in collaborating. I helped write a partial implementation of TUF for RubyGems, but we never managed to carry it over the finish line.

tarcieri avatar Feb 09 '15 16:02 tarcieri

Is this being worked on? I've recently been interested in verifiable provenance of software artefacts, and would be much more comfortable if Rust and Crates.io had a better story here. I'd be willing to help with drafting an RFC drawing on PEP 458 and PEP 480 mentioned above. This is not an area of expertise for me, so I'll need to consult others, but it sounds like some with more experience are interested in helping.

kamalmarhubi avatar May 02 '15 12:05 kamalmarhubi

@kamalmarhubi to my knowledge this is not actively being worked on unfortunately

alexcrichton avatar May 03 '15 07:05 alexcrichton

@kamalmarhubi if you're interested in working on this, the creators of TUF are at least pretty responsive on their mailing list:

https://groups.google.com/forum/#!forum/theupdateframework

tarcieri avatar May 03 '15 16:05 tarcieri

@kamalmarhubi We'd be happy to collaborate with you on an RFC draft for the Rust community. Did we meet at PyCon 2015?

As @tarcieri suggested, our mailing list would be a good place to start a discussion. @JustinCappos @dachshund and I are available to work with you on this draft.

Note: The Python proposal was recently updated and is available here

vladimir-v-diaz avatar May 03 '15 20:05 vladimir-v-diaz

@tarcieri thanks for the link!

@vladimir-v-diaz I was definitely at PyCon, but I'm terrible at names and faces. To help narrow things down: I was session chair for a couple of sessions, and spent a lot of time in the green room. I am sad I missed the poster on this topic though!

What's the best venue to get in touch about this topic? This bug? TUF mailing list? Somewhere else?

kamalmarhubi avatar May 04 '15 23:05 kamalmarhubi

@kamalmarhubi I'd recommend the TUF mailing list as a starting place. This is something I'm interested in working on, but I'd also like to help finish up applying TUF to RubyGems before I'd have time to start working on a TUF implementation for Rust.

I'm familiar with both TUF and Rust though, as well as Cargo, and just generally am way too overinterested in cryptography and infosec so at the very least I can help consult / review code if you'd like to do the implementation work! :wink:

tarcieri avatar May 04 '15 23:05 tarcieri

@tarcieri sounds good. I'll send an intro email there soon enough. :-)

kamalmarhubi avatar May 05 '15 15:05 kamalmarhubi

Any progress on this? Two days ago, Rust 1.5 was published, providing a cargo install command. With this command, unsigned code will be downloaded, compiled and installed. If the git repo isn't provided over an encrypted connection, it will even download, compile and install unsigned code over an unencrypted (completely insecure) connection. Are we back to the 1990s, when all downloads were unsigned and insecure?

I suggest some immediate changes:

  1. warn for every single piece of software that it is unsigned (and thus horribly insecure to run)
  2. don't allow git checkouts over unencrypted connections at all, except when the code is signed (see 4.)
  3. suggest (later version: require) crates releases to be signed by the crate author
  4. use Web of Trust or PKCS to verify signatures; fall back to TOFU (trust on first use) after warning and confirmation by the user.

As long as these (or equivalent) measures aren't taken, every developer and user of rust-based software out there is at high risk.

genodeftest avatar Dec 12 '15 13:12 genodeftest

@genodeftest perhaps we need to back up and look at the actual threats:

With this command unsigned code will be downloaded, compiled and installed. If the git repo isn't provided on an encrypted connection, it will even download, compile and install unsigned code from an unencrypted (completely insecure) connection.

So your problem is that cargo install --git http://... allows people to install crates over plaintext HTTP (thereby enabling a MitM to swap out the code)? This issue is about data-at-rest authentication of published crates, which isn't really possible with unpublished crates fetched via git. I would agree that if someone does cargo install over http:// printing a warning might be in order, or even disallowing this behavior at all. I think that's the topic for a separate issue, though.

All that said, as far as I know, cargo publish and cargo install are both delivered over https://. So the threat ceases to be a MitM and becomes an attacker who can compromise crates.io. Such compromises do happen frequently: the main thing that spurred my interest in TUF was the RubyGems compromise, and similar attacks have happened to many major languages' package repositories at some point.

But I think you're slightly overstating the severity of the issue, and making a bunch of recommendations that are half-measures that wouldn't significantly improve the security of the system but would certainly harm the cargo user experience.

To go through these:

  1. Web-of-Trust (the OpenPGP model): developers create signing keys, and perhaps meet up at conferences and have a keysigning party. Everyone comes home with the keys of a few developers they trust, and can use that to bootstrap the trust model of the system. This is the ideal of the early cypherpunk days; unfortunately it has this painful problem of wasting everyone's time without ever really working as effectively as people hoped it would. As a small anecdote, most of the people I trust in the security community tend to view WoT and keysigning parties rather skeptically. I think there may be a deep-rooted flaw in the approach: I'm not really sure which friends or fellow developers I actually trust to delegate trust to, and it only takes one bad trust delegation to poison the whole web, unless for every bad delegation someone else exists to identify it. When this happens, though, as a user I'm just told the WoT is conflicting: what do I do now? It's not clear what this means: is something malicious happening? Did someone make a mistake? What are my action items as a user? You can contact the people in question, but will they even remember why they signed a particular key? The WoT makes every member into a certificate authority, but it turns out being a certificate authority is hard and involves, among other things, due diligence in checking identities and meticulous record-keeping.

  2. Public Key Infrastructure (the X.509 model): (you said PKCS, but this is what I assume you meant): under this model, a central certificate authority would sign developer keys, hopefully including some name constraints around what packages they're allowed to sign. RubyGems implemented PKI-based gem signing, but punted on the hard problem: who runs the CA? Further difficult problems: what do we do with all the packages that aren't signed?

TUF implements what is effectively a non-X.509 PKI specifically designed for the purposes of authorizing developers to sign certain packages in the system, and without the added undue complexity needed to handle non-codesigning use cases.

In a TUF PKI, someone like Mozilla would run a developer CA. Developers would provide some form of authentication along with a public key to be signed. The operators of the CA can then sign the developer keys offline (using e.g. an HSM, Yubikey, etc) and publish the signed keys via crates.io.

  3. Trust on First Use (TOFU): prompt the user to verify each and every public key fingerprint, or blindly accept them for the user, and store these fingerprints away for eternity hoping they don't change. When they do change, pop up a scary warning message, and hope that incentivizes the user to do enough investigation to find out why the key changed. TOFU ultimately ends up conditioning users to ignore these messages, because key rotation happens quite frequently and the ratio of developers to crates will likely continue to be 1:1.

I find that #1 is impractical and doesn't typically end up working out the way people would like it to. #3 doesn't add much effective security, and has a poor user experience, constantly prompting the user whenever there's any breakage in key continuity, and providing no additional context as to what happened.

So, my vote is for a centrally managed PKI, and more specifically for TUF. As one last note:

TUF is a hybrid system which allows "unclaimed" packages to be centrally signed by the packaging system, and "claimed" packages to be signed by one or more developers. This means under TUF, all packages are signed in some form. Even packages published before the addition of the signing system can be retroactively signed.
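The hybrid model above can be reduced to a small resolution rule: look for a developer ("claimed") delegation first, and fall back to the repository-signed ("unclaimed") role otherwise, so every package is signed by someone. The role layout and package names below are purely illustrative.

```python
# Sketch of TUF's hybrid claimed/unclaimed model: packages with a
# developer delegation answer to developer keys; all other packages
# are still covered by the repository's own online signing key.

claimed = {"tokio": "developer-key:alice"}   # developer-managed delegations
UNCLAIMED_SIGNER = "crates.io-online-key"    # repository-managed role

def signer_for(package):
    # Claimed packages use their delegated developer key; everything
    # else falls back to the repository role, so nothing goes unsigned.
    return claimed.get(package, UNCLAIMED_SIGNER)

print(signer_for("tokio"))     # → developer-key:alice
print(signer_for("rand"))      # → crates.io-online-key
```

The practical consequence is the one stated above: packages published before the signing system existed are immediately covered by the unclaimed role, and can migrate to a claimed delegation later without any client-side flag day.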

I think unless you find some way to solve this problem, a package signing system has little value.

@genodeftest I sense some urgency on your part, but this is a problem I'd really rather see done right as opposed to rushed out the door just to have "something". Unless it solves the problem of signing every crate, and doing so in a way that does not require every developer to get a cryptographic key to publish crates (impossible already as crates have already been published without associated cryptographic keys), and it does not require the user make lots of decisions about which keys to trust or how to bootstrap the system trust model, I don't think it's helpful.

So those are my two requirements for a good package signing system:

  1. Doesn't require substantial changes to the existing crates.io workflows: should largely be transparent
  2. Signs every single package in the system, even retroactively, in some form or another

tarcieri avatar Dec 12 '15 19:12 tarcieri

  1. About cargo install --git http://something: No, you don't need to specify the --git option. Any git repo in the crates.io index is a potential threat if its git URL is not encrypted. The attacker may not need to break the index; they can MitM a git repo loaded by cargo install via http. Did I get something wrong here?

  2. Re your points 1) and 2): Both OpenPGP and X.509 protect from a random attacker with limited capacities. I know that neither protects from the NSA & co., due to their centralism (who runs the CA?) or their diversity (OpenPGP). But they'd help against basic attacks. TUF looks like a good solution to that.

I forgot one point: even if a git repo is fetched via HTTPS, how can we be sure that nobody put other commits into a git repo? Seems like TUF has some solution to that too. But it isn't implemented yet.

@genodeftest I sense some urgency on your part, but this is a problem I'd really rather see done right as opposed to rushed out the door just to have "something". Unless it solves the problem of signing every crate, and doing so in a way that does not require every developer to get a cryptographic key to publish crates (impossible already as crates have already been published without associated cryptographic keys), and it does not require the user make lots of decisions about which keys to trust or how to bootstrap the system trust model, I don't think it's helpful.

You could sign those git commits afterwards. That's how many people do releases: Put a git tag up, create an archive and sign the archive. If the crates index could do the signing, it must make sure that only the intended author can submit this archive/checksum/whatever.

Sorry, I was kind of overreacting, possibly because I didn't yet understand the features already implemented. I think this issue is urgent though.

So those are my two requirements for a good package signing system:

  1. Doesn't require substantial changes to the existing crates.io workflows: should largely be transparent

+1

  2. Signs every single package in the system, even retroactively, in some form or another

But do some authentication against the author to do that.

genodeftest avatar Dec 13 '15 00:12 genodeftest

Any git repo in crates.io index is a potential threat if its git URL is not encrypted

I think this can be mitigated by blocking or warning on plaintext git repos. Perhaps they can be blocked by default, with an option to allow crazy people to use plaintext git. Anyway, I think that should be handled as a separate issue than this one.

I forgot one point: even if a git repo is fetched via HTTPS, how can we be sure that nobody put other commits into a git repo? Seems like TUF has some solution to that too. But it isn't implemented yet.

I think in general handling signatures with git repos will be difficult. A commit hash for a dependency can be included in the signed metadata for a given crate, but that only helps the case where a signed crate references another crate over git.

For the case of cargo install --git though, I think things get a lot trickier. Whose key do you use to authenticate that URL?

If "TUF for Rust" used GPG to implement signatures and had a keyring of anyone who's ever published a crate to crates.io, you could validate signed commits or tags against that GPG keyring, but without knowing specifically which key to trust the best thing you can do is tell the user you found some key in the keyring, and then ask the user to make a decision about whether it's the correct one. I think these systems which rely on user choice for security don't really add a lot of value.

But do some authentication against the author to do that.

Exactly how much identity verification will happen is really up to whoever runs the developer CA (just gonna say Mozilla from here on out). I expect they will want the CA to be completely automated.

From what I can tell, OAuth2 with GitHub is the only source of authentication / authorization that crates.io presently uses. It'd be nice if authorizing a given public key for a particular crates.io account had some degree of authentication beyond that. crates.io seems to pull in email addresses from GitHub after you first link an account, so it could at least be a combination of being authorized with OAuth2 and clicking a particular link in your email.

I expect Mozilla lacks the time and resources to do any sort of non-automated identity verification beyond that.

tarcieri avatar Dec 13 '15 00:12 tarcieri

Crates on crates.io referencing other crates in git repos shouldn't be a problem anyway as crates.io requires that all dependencies of a crate must be on crates.io as well.

If there was a way for me to assign a key to my crates.io account and sign all my crates with my key when publishing them, I'd definitely be in favor of that.

retep998 avatar Dec 13 '15 04:12 retep998

@retep998 I haven't actually tried this, but can you publish a crate that has a git dependency?

tarcieri avatar Dec 13 '15 06:12 tarcieri

@tarcieri Nope, it will reject it.

retep998 avatar Dec 13 '15 06:12 retep998

@retep998 if that's the case, it simplifies the security model to static files only, which is a great property

tarcieri avatar Dec 13 '15 07:12 tarcieri

Another (very conservative) approach would be to deprecate version wildcards. This would also improve stability.

ticki avatar Jan 17 '16 20:01 ticki

@Ticki We're already deprecating version wildcards on crates.io!

retep998 avatar Jan 17 '16 20:01 retep998

I'm interested in this. It's important to reference specific git commits, as that's really more secure than signatures, although maybe signatures offer greater convenience. We could check that the version information agrees with the git tag on that commit, of course. In fact, there is a case for keeping a permanent log of all git commit ids for all dependencies with which the crate was distributed.

burdges avatar Jan 19 '16 01:01 burdges

I'm interested in this. It's important to reference specific git commits as that's really more secure than signatures, although maybe signatures offer greater convenience.

You don't need to do all of that. You could simply allow one to specify a SHA-256 (or whatever) digest in their crates.io dependency. Then cargo would verify that the downloaded package has that SHA-256 digest.
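The digest-pinning idea above amounts to one hash comparison at download time. A minimal sketch, noting that the manifest field and verification hook are hypothetical (Cargo has no such option in this discussion):

```python
# Sketch of digest pinning: a developer records the expected SHA-256 of
# a dependency's archive, and the client refuses any download that does
# not match it bit-for-bit. The workflow shown is illustrative only.
import hashlib

def verify_package(data: bytes, expected_sha256: str) -> bool:
    """Return True only if the downloaded bytes hash to the pinned digest."""
    return hashlib.sha256(data).hexdigest() == expected_sha256

pkg = b"fake crate tarball contents"
pinned = hashlib.sha256(pkg).hexdigest()   # value the developer would pin

print(verify_package(pkg, pinned))              # → True
print(verify_package(pkg + b"tamper", pinned))  # → False: any change rejected
```

Unlike a signature scheme, this pins one exact artifact and carries no notion of who is authorized to publish the next version, which is part of why the thread keeps coming back to TUF rather than bare digests.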

If one would prefer to use git identities for this, then one can avoid using crates.io at all, and instead use git submodules or git subtree. Note that git's use of SHA-1 instead of a secure digest function complicates this, though.

briansmith avatar Jan 19 '16 01:01 briansmith

It's important to reference specific git commits as that's really more secure than signatures,

crates.io uses both static artifacts and git, so any security features added to crates.io/cargo really need to support both.

FWIW, git uses the SHA1 hash function for integrity, and the vultures are circling. I don't think it will be too long before we see practical collision attacks on SHA1. Git has a few built-in defenses for this, but we haven't really seen them tested very well by attackers. Meanwhile we continue to see collision attacks against MD5 (SLOTH being the most recent).

I would very much recommend using a modern signature algorithm (Ed25519 was recently standardized by the IRTF CFRG) and a modern hash function (Ed25519 uses SHA2-512)

tarcieri avatar Jan 19 '16 01:01 tarcieri

SHA1 produces 160 bits; in other words, it is more likely that the whole Rust core team gets killed by wolves on New Year's Eve than that a hash collision in SHA1 appears.

ticki avatar Jan 19 '16 06:01 ticki

The number of bits alone doesn't determine the security of a hash function; the algorithm used to calculate them does. From every source I can find, the security community agrees that SHA1 is broken. Here are just two examples:

https://www.schneier.com/blog/archives/2005/02/sha1_broken.html
http://arstechnica.com/security/2015/10/sha1-crypto-algorithm-securing-internet-could-break-by-years-end/

faern avatar Jan 19 '16 07:01 faern

SHA1 is broken if you try to exploit it, but getting the same hash is very unlikely.

ticki avatar Jan 19 '16 07:01 ticki

Very true! But I thought this thread was about protecting the build process from exploitation?

faern avatar Jan 19 '16 08:01 faern

@Ticki I was specifically talking about "practical collision attacks on SHA1". These would enable an attacker to do things like create two commits: one benign and one malicious which have the same SHA1 hash. They could get a benign commit into a project, then trick someone into using the malicious one since they share the same commit hash. Exact details of how to do this TBD: git has some built-in defenses. But I expect we'll see attacks of this nature start popping up against git when a practical SHA1 collision attack is published.

tarcieri avatar Jan 19 '16 18:01 tarcieri