rustls icon indicating copy to clipboard operation
rustls copied to clipboard

Add function to verify CertifiedKey consistency

Open lvkv opened this issue 1 year ago • 34 comments

Using the subject_public_key_info() function we added in https://github.com/rustls/webpki/pull/253, this PR:

  • Adds a function to trait SigningKey that returns its SubjectPublicKeyInfo, but only if the implementer opts in
  • Adds a function to CertifiedKey that verifies the consistency of its public and private keys by comparing the SPKI values of its end-entity cert and its key

The motivation behind this is to make it possible to verify consistency for public and private keys (https://github.com/rustls/rustls/issues/1918).

lvkv avatar May 17 '24 01:05 lvkv

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 94.29%. Comparing base (08bc10d) to head (1326247).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1954   +/-   ##
=======================================
  Coverage   94.28%   94.29%           
=======================================
  Files          97       97           
  Lines       21818    21841   +23     
=======================================
+ Hits        20572    20595   +23     
  Misses       1246     1246           

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar May 17 '24 02:05 codecov[bot]

Realized I was reviewing at the same time as Ctz so I submitted my partial review :-) I agree with his feedback.

cpu avatar May 17 '24 13:05 cpu

Some (hopefully helpful) pointers for the failed CI tasks:

cpu avatar May 17 '24 13:05 cpu

I'm converting this to a "ready for review" PR, as it's less of a draft now. Questions:

  • It looks like we've agreed to return Ok(None) by default for SigningKey::public_key. This looks sensible, though it means that certificate_and_key_are_consistent will (as this PR is written) always return Error::InvalidCertificate(CertificateError::BadEncoding), since 1. it interprets the absence of a SPKI as an invalid cert and 2. nobody has overridden the default behavior of certificate_and_key_are_consistent yet. I assume the solution is to update the SigningKey implementers in additional PR(s) before putting this current change in a release?
  • @djc / @cpu suggested that we updated the return type of SigningKey::public_key from Result<Option<Vec<u8>>, Error> to Result<Option<SubjectPublicKeyInfoDer>, Error> over at https://github.com/rustls/rustls/pull/1954#discussion_r1605019692. Does this count as an external type? I'm still poking cargo check-external-types, which I haven't gotten to work locally yet 🙂

Also: I can tidy up the commits like we did over at https://github.com/rustls/webpki/pull/253. I'll wait for reviews to settle first, though.

lvkv avatar May 21 '24 02:05 lvkv

It looks like we've agreed to return Ok(None) by default for SigningKey::public_key. This looks sensible, though it means that certificate_and_key_are_consistent will (as this PR is written) always return Error::InvalidCertificate(CertificateError::BadEncoding), since 1. it interprets the absence of a SPKI as an invalid cert and 2. nobody has overridden the default behavior of certificate_and_key_are_consistent yet. I assume the solution is to update the SigningKey implementers in additional PR(s) before putting this current change in a release?

Hmm, yeah, it does seem a little off that if SigningKey::public_key's default impl isn't overridden the new API produces a BadEncoding error. I could see that being confusing. I suggested the bad encoding error thinking it was used only when the SPKI was missing from a cert, but didn't think about the case where the SigningKey wasn't able to provide the information.

Perhaps it would be better to formulate certificate_and_key_are_consistent() (or whatever its name ends up being) to be infallible. Instead of returning a Result<bool>, maybe we could return an enum with variants like Consistent, Inconsistent, Unknown. We'd return Unknown if the SPKI can't be produced for either the key or the cert, Inconsistent if we compared SPKIs and they weren't byte-for-byte identical, and Consistent if they were.

My thinking is that even when we implement public_key() for the aws-lc-rs and ring signing keys there is still the possibility of external implementations of the trait not having caught up yet. I think treating that as an error instead of a variant like Unknown might be too heavy-weight given I see this more as a helper API than load-bearing. WDYT?

cpu avatar May 21 '24 14:05 cpu

Perhaps it would be better to formulate certificate_and_key_are_consistent() (or whatever its name ends up being) to be infallible. Instead of returning a Result, maybe we could return an enum with variants like Consistent, Inconsistent, Unknown. We'd return Unknown if the SPKI can't be produced for either the key or the cert, Inconsistent if we compared SPKIs and they weren't byte-for-byte identical, and Consistent if they were.

@cpu, I took a stab at rewriting the keys_match() function with this in mind. The "unknown" in particular feels a bit goofy to me, like Rustls is a magic 8 ball that wants me to "ask again later" (i.e. when we've gone around and implemented public_key on all the SigningKeys) 🙂

As a dev, I'm unsure if I'd ever handle Unknown differently from Inconsistent differently from Error. I basically only ever want Consistent, and anything else indicates misconfiguration—resulting in either a fast failure, alert or both for whatever system I'm developing.

My thinking is that even when we implement public_key() for the aws-lc-rs and ring signing keys there is still the possibility of external implementations of the trait not having caught up yet. I think treating that as an error instead of a variant like Unknown might be too heavy-weight given I see this more as a helper API than load-bearing. WDYT?

I think the design should reflect whether or not this check will happen by default during the creation of a CertifiedKey (going off of https://github.com/rustls/rustls/issues/1918#issuecomment-2097111280). Open to opinions!

lvkv avatar May 22 '24 04:05 lvkv

@cpu, I took a stab at rewriting the keys_match() function with this in mind

Thanks! That looks close to what I was thinking, but I was imagining it wouldn't be a Result<KeyConsistency, Error> return, just KeyConsistency.

The "unknown" in particular feels a bit goofy to me, like Rustls is a magic 8 ball that wants me to "ask again later" (i.e. when we've gone around and implemented public_key on all the SigningKeys) As a dev, I'm unsure if I'd ever handle Unknown differently from Inconsistent differently from Error. I basically only ever want Consistent, and anything else indicates misconfiguration—resulting in either a fast failure, alert or both for whatever system I'm developing.

Yeah :-/ It is awkward throwing a third variant in the mix.

If folks don't think my suggestion is a net improvement I'm not tied to it and open to alternatives. My main concern was avoiding a bad encoding error for the case where SigningKey::public_key() returns Ok(None). If that's fixed another way I'm also satisfied.

Maybe if the principle call-site we care about using that fn is going to turn it into an error anyway the default impl should be Err(General("unimplemented".into())) (bikeshedding the details as appropriate) instead of Ok(None).

cpu avatar May 22 '24 13:05 cpu

From #1918:

Ideally Rustls would extract the SPKI from the EE certificate and then ask the crypto provider to do a pairwise consistency check as part of the construction of a CertifiedKey.

Is this actually what we're doing? Currently this PR actually just does a byte-for-byte comparison of the SPKI data, right?

djc avatar May 22 '24 13:05 djc

Perhaps it would be better to formulate certificate_and_key_are_consistent() (or whatever its name ends up being) to be infallible. Instead of returning a Result<bool>, maybe we could return an enum with variants like Consistent, Inconsistent, Unknown. We'd return Unknown if the SPKI can't be produced for either the key or the cert, Inconsistent if we compared SPKIs and they weren't byte-for-byte identical, and Consistent if they were.

I think the core question is the intended behavior for crypto providers that don't implement this. I think in rustls 0.24 we might want to tighten this up to force crypto provider implementers to support this? If that's the case, I think probably it'd make sense to just yield Result<(), Error> and have the default impl yield Ok(()) for now.

djc avatar May 22 '24 13:05 djc

Is this actually what we're doing?

Is that your preference? Without strong signal it seems better to follow the plan outlined by Ctz unless there's consensus it isn't the right approach. When that comment was posted I didn't understand the advantages (I'm still not sure I do with crystal clarity). I asked in the Discord thread and Ctz said:

yes, i think he's saying that crypto libraries asked for the public half of a key pair could fulfil the request from a public key they read in (alongside the private key; for example RSA keys in PKCS#1 format) rather than one they computed. i think that seems like a harder situation for a user to get in though, and likely a different problem

cpu avatar May 22 '24 13:05 cpu

Is this actually what we're doing?

Is that your preference? Without strong signal it seems better to follow the plan outlined by Ctz unless there's consensus it isn't the right approach. When that comment was posted I didn't understand the advantages (I'm still not sure I do with crystal clarity).

Ahh, sorry for potentially derailing this discussion from an agreed upon plan. To be clear, doing this is definitely better than doing nothing. I just think "doing SPKI bytes comparison" is probably a weaker guarantee than if we could ask the crypto provider to do curve/group/whatever magic math thing to check that the keys are actually compatible.

djc avatar May 22 '24 13:05 djc

My main concern was avoiding a bad encoding error for the case where SigningKey::public_key() returns Ok(None). If that's fixed another way I'm also satisfied.

Agree. @cpu How do you guys feel about the following options to avoid this? Ordered from least to most drastic:

  1. Option 1: Change the default implementation of SigningKey::public_key to something along the lines of err(Error::Unimplemented). This would allow us to disambiguate None from unimplemented when implementing CertifiedKey::keys_match and, in turn, avoid us mistaking "unimplemented" for "bad encoding".

  2. Option 2: Remove the default implementation of SigningKey::public_key altogether and frontload the work of implementing it for types that impl SigningKey. This avoids the "unimplemented" case altogether.

  3. Option 3: All of Option 2, plus removing the Option from the return type of SigningKey::public_key, taking it from Result<Option<SubjectPublicKeyInfoDer>, Error> to Result<SubjectPublicKeyInfoDer, Error>. This would eliminate both the "unimplemented" problem, but also the "unknown" case when one of the SPKIs is None.

Also open to suggestions!

lvkv avatar May 27 '24 22:05 lvkv

@lvkv From my perspective I think Option 1 is probably the most agreeable. Option 2 and 3 will be breaking API changes to the exported SigningKey trait and we're trying to avoid those while the ecosystem is catching up w/ our last release.

cpu avatar May 29 '24 20:05 cpu

Option 1 sounds good, maybe we should file an issue with a label to remind ourselves to remove the default impl for the next semver-breaking release?

djc avatar May 30 '24 08:05 djc

I was thinking originally we could call the new consistency API CertifiedKey::keys_match in several places that accept CertifiedKeys and are already fallible. There are only a few such places, but doing that unprompted requires either a) we don't treat missing SigningKey::public_key as an error, or b) require SigningKey::public_key as a fundamental part of that trait.

(a) means we can add those calls, and land this PR before writing the wedge of encoding conversion code that would be needed for the impls of SigningKey just in this crate.

(b) I think is a tough sell, needing a lot of research. For example, can you get a public key given a windows NCRYPT_KEY_HANDLE? I have no idea, but it seems not, and the suggestion is to look up the certificate and extract the public key from that (which would be quite circular...)

The other option, of course, is that we don't insert those calls, and leave the API for others to call when they know or expect their SigningKey to support getting the public key.

ctz avatar May 30 '24 09:05 ctz

Right, so I think this probably is the right direction still:

I think the core question is the intended behavior for crypto providers that don't implement this. I think in rustls 0.24 we might want to tighten this up to force crypto provider implementers to support this? If that's the case, I think probably it'd make sense to just yield Result<(), Error> and have the default impl yield Ok(()) for now.

djc avatar May 30 '24 09:05 djc

@lvkv Apologies, I think we lost some of the earlier momentum we had going here. Have you become busy (very reasonable!) or are there remaining points of uncertainty we should try and hash out to unblock progress?

cpu avatar Jun 12 '24 18:06 cpu

Thanks for checking in. I'll be much more available after a few days! (Apartment hunting + moving in NYC)

lvkv avatar Jun 12 '24 19:06 lvkv

Coming back to this now! It's not clear to me, based on the discussion above, that there's a consensus yet on what the API for this should look like and why 🙂.

One idea to unstick the conversation: how do we feel about merging what we have right now (after cleanup)? I've also just added a commit to bring code coverage up to 100%. The plan would be:

  • Clean up and land this PR with its current API
  • Begin working on public_key implementations
  • (Optional) Revise the API as needed for the next minor release

Edit: typo

lvkv avatar Jun 15 '24 00:06 lvkv

Also—if there actually is a design consensus and I'm just oblivious to it, do let me know 🙂

lvkv avatar Jun 15 '24 00:06 lvkv

One idea to unstick the conversation: how do we feel about merging what we have right now (after cleanup)?

That sounds OK to me :+1:

if there actually is a design consensus and I'm just oblivious to it, do let me know

Let me try and summarize some things I think we have agreement on (of course this is an invitation to please chime in if I'm misrepresenting anything).

  • It would be a stronger guarantee to make the CryptoProvider handle checking consistency, instead of looking at SPKI. However, doing that in a backwards compatible way is more challenging, and in practice SPKI matching will be fine in the majority of cases. Consensus: continue with SPKI approach.
  • We don't want to require existing SigningKey impls update to provide a new mandatory fn for exposing the public key/SPKI - it would be backwards incompatible, and in some cases (e.g. rustls-cng) it might not be possible to implement. Consensus: we need a default implementation, and it should be possible for an impl to opt-out.
  • We want to check CertifiedKey consistency internally in the fallible places we accept one as input. Because of ^ we have to be prepared for a case where we can't determine consistency. Consensus: when we try to do consistency matching internally we should not hard-fail if the SPKI can't be determined.

With the above in mind it feels like this branch is pretty close to matching the consensus. I think SigningKey::public_key() could be changed to return Option<SubjectPublicKeyInfoDer> instead of Result<Option<SubjectPublicKeyInfoDer>> - it doesn't seem like there's a use-case for surfacing a failure to produce the SPKI separate from just yielding None.

When we call CertifiedKey::keys_match() internally we'd treat KeyConsistency::Inconsistent as an error to surface immediately to the caller, and we'd ignore KeyConsistency::Unknown so that default/missing impls of SigningKey::public_key() are a non-failure.

I think it'd be best to land the internal calls to CertifiedKeys::keys_match() alongside this work so that we don't end up with any dead code, but the updates to implement public_key() for the crate internal certified keys could be follow-up.

@ctz @djc How do you feel about the above?

cpu avatar Jun 17 '24 15:06 cpu

Let me try and summarize some things I think we have agreement on (of course this is an invitation to please chime in if I'm misrepresenting anything).

Thanks for writing this down, and that does match where I think we are.

With the above in mind it feels like this branch is pretty close to matching the consensus. I think SigningKey::public_key() could be changed to return Option<SubjectPublicKeyInfoDer> instead of Result<Option<SubjectPublicKeyInfoDer>> - it doesn't seem like there's a use-case for surfacing a failure to produce the SPKI separate from just yielding None.

Agree on this.

I think it'd be best to land the internal calls to CertifiedKeys::keys_match alongside this work so that we don't end up with any dead code, but the updates toe implement public_key() for the crate internal certified keys could be follow-up.

I could take it or leave it. As this PR currently stands there is no dead code, since the new public API surface calls everything (and there are no uncovered lines because the tests exercise everything).

ctz avatar Jun 17 '24 15:06 ctz

@ctz @djc How do you feel about the above?

I mostly agree, the one thing I'm in doubt about is whether the KeyConsistency type pulls its weight in this setup compared to just using Result<(), Error> or maybe even just bool.

djc avatar Jun 17 '24 16:06 djc

I mostly agree, the one thing I'm in doubt about is whether the KeyConsistency type pulls its weight in this setup compared to just using Result<(), Error> or maybe even just bool.

Using CertifiedKey::keys_match(&self) -> bool seems unappealing because if the fn returns false in one of the internal call-sites we would have to proceed without error not knowing whether it was false because the SPKI info was available and didn't match, or false because it wasn't available.

Using CertifiedKey::keys_match(&self) -> Result<(), Error> seems OK to me, but to avoid the situation above we'd need distinct Error results like Error::KeysInconsistent and Error:SpkiUnavailable (bikeshedding names tbd). We'd only be treating one of those two variants as an error internally, so it feels a little bit odd to express both as Error variants.

I don't feel strongly. @ctz do you want to be a tie breaker?

cpu avatar Jun 17 '24 16:06 cpu

I don't feel strongly. @ctz do you want to be a tie breaker?

Maybe I can get there by enumerating the things I don't think work, lol:

  • keys_match(&self) -> Result<KeyConsistency, Error> (current PR state) begs me as a caller to say keys_match().unwrap() but that misses the Ok(KeyConsistency::Inconsistent) case I wanted to actually check for, oops.
  • keys_match(&self) -> bool - agree that this collapses too many error cases to false

I think that leaves us with keys_match(&self) -> Result<(), Error>:

Generally I would prefer not to add very specific cases to the top of rustls::Error, or add further uses of Error::General. Reusing CertificateError comes with the baggage that we say it only comes from the certificate verification traits, and that each one has a TLS alert that it maps down to.

I suggest we add something like:

pub enum Error {
(...)
    /// Specific failure cases from [`CertifiedKey::keys_match`]
    InconsistentKeys(InconsistentKeys),
(...)
}

/// Specific failure cases from [`CertifiedKey::keys_match`]
#[non_exhaustive]
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub enum InconsistentKeys {
    /// The public key returned by the [`SigningKey`] does not match that in the certificate.
    SigningKeyAndCertificateMismatch,

    /// The [`SigningKey`] cannot produce its corresponding public key.
    SigningKeyCannotProducePublicKey,
}

That is very wordy, so please feel free to wordsmith as desired.

ctz avatar Jun 17 '24 17:06 ctz

I suggest we add something like

That proposal sounds good to me. With reflection I think @djc's instincts are correct here & the consistent variant in the KeyConsistency enum doesn't hold its weight. When you remove that variant all that's left are the two error cases and I think you've made a good case for why they should be tucked behind an Error::InconsistentKeys variant as a distinct error enum.

So to summarize:

  • SigningKey::public_key() should be updated to return Option<SubjectPublicKeyInfoDer>, with the default impl being None
  • KeyConsistency enum gets removed.
  • We add a InconsistentKeys enum to error.rs with variants for "definitely mismatched" and "not sure"
  • Error gets a new Error::InconsistentKeys variant holding a InconsistentKeys.
  • CertifiedKey::keys_match(&self) should be updated to return Result<(), Error>, yielding Ok(()) for the case where things check out, and Err(Error::InconstentKeys(...)) otherwise. The specific InconstentKeys variant to use will depend on whether public_key() returned None, or a value that didn't match the cert's SPKI.
  • In the Rustls-internal callsites where we take a CertifiedKey instance, we call keys_match(), matching on the Err() result to decide whether to eat the error (for the "not sure" case to preserve back-compat) or to pass it forward (for the "definitely mismatched" case).

cpu avatar Jun 17 '24 17:06 cpu

SigningKey::public_key() should be updated to return Option<SubjectPublicKeyInfoDer>, with the default impl being None

KeyConsistency enum gets removed.

We add a InconsistentKeys enum to error.rs with variants for "definitely mismatched" and "not sure"

Error gets a new Error::InconsistentKeys variant holding a InconsistentKeys.

CertifiedKey::keys_match(&self) should be updated to return Result<(), Error>, yielding Ok(()) for the case where things check out, and Err(Error::InconstentKeys(...)) otherwise. The specific InconstentKeys variant to use will depend on whether public_key() returned None, or a value that didn't match the cert's SPKI.

Done. Feel free to nit!

In the Rustls-internal callsites where we take a CertifiedKey instance, we call keys_match()

I'll try adding this in too. In the meantime, are there any super-important call sites in particular you have in mind for this? We can also turn this bullet into a separate PR if you'd like.

lvkv avatar Jun 17 '24 23:06 lvkv

Is there a meaningful/not contrived way to add test coverage for the Display of this new error variant?

lvkv avatar Jun 19 '24 02:06 lvkv

Is there a meaningful/not contrived way to add test coverage for the Display of this new error variant?

Add a line into this test: https://github.com/rustls/rustls/blob/3bb255cd58cda65f3e8a998f71df03429f03941c/rustls/src/error.rs#L724

ctz avatar Jun 19 '24 12:06 ctz

Oh boy, I seem to have done the wrong thing with Git here 😓 Edit: Disaster averted

lvkv avatar Jun 19 '24 16:06 lvkv