lemur icon indicating copy to clipboard operation
lemur copied to clipboard

Probable issue(s) in source syncing

Open explody opened this issue 6 years ago • 1 comments

We weren't sure exactly which parts of this are problems vs intentional so it's mostly questions to start, but we're up front willing to provide PRs if there's agreement that some of this needs changing.

Full disclosure, we're partially responsible for some of this, via prior bug fixes and PRs, before noticing the possible problems.

I'll start here

https://github.com/Netflix/lemur/blob/b83c5eca49756401f150159a6da4b70a7a8d23c7/lemur/sources/service.py#L119-L122

This conditional looks first for a certificate matching the certificate name (presumably the cert CN). But, given that Lemur naturally mangles the name (e.g. "CN-issuer-notBefore-notAfter-[possible hex serial if there's a name conflict]), this check will never match unless the source itself is pre-mangling the names to the Lemur format (we did this, it was a mistake and we undid it), which is incredibly unlikely.

Proposed fix

Either pre-mangle the certificate name so we'd be searching for a name that may exist, or scrap the check-by-name entirely and use more reliable check(s)

After the name check, comes a more concerning issue

https://github.com/Netflix/lemur/blob/b83c5eca49756401f150159a6da4b70a7a8d23c7/lemur/sources/service.py#L124-L129

The get_by_serial queries are done as if serial number were unique, which even between separate Sources/Authorities which it definitely is not (not unique in the DB either, whereas 'name' is). Any given source(s) could start the serial at 1 (openssl does) and as written, exists could end up with certs from completely different sources, just based on overlapping serial numbers.

Under what use case would checking for existing cert legitimately return multiple certs?

Proposed fix

When checking for preexisting certs, they should be checked with a combo of issuer and serial number, which would be a reasonably unique combo. Or, perhaps a fingerprint field for certs so they can be universally uniquely identified.

This leads to these next snips, where the possibility of multiple exists is handled, but as the question above, it's confusing why.

Here, potentially for multiple certs in exists, which may be from other sources, the external_id and authority_id are updated to match the one current cert being processed in the loop.

This would overwrite fields on these certs, without even checking if they're from the same source.

Proposed fix

IMHO, a single cert from a source should not have multiple entries in Lemur (should it?). The "exists" checks should be sufficiently constrained that so there are not multiples - or at minimum, really confirm that the entries are all the same cert if there's a use case I'm missing.

https://github.com/Netflix/lemur/blob/b83c5eca49756401f150159a6da4b70a7a8d23c7/lemur/sources/service.py#L137-L148

Which lastly leads to certificate_update(), in which, if the current source is not in the certificate.sources, it is appended to the list - though the association between these was based almost exclusively on finding the cert by serial number, which again is not unique

This may be the most confusing part - the support of multiple sources for one cert. How would this happen? In what scenario would a single cert be (and need to be) from multiple sources?

No proposal here - this seems intentional, I just don't grok it.

https://github.com/Netflix/lemur/blob/b83c5eca49756401f150159a6da4b70a7a8d23c7/lemur/sources/service.py#L42-L50

explody avatar Oct 12 '18 01:10 explody

Hey all good questions, as with most things the sync process as evolved over time, so I'm not sure there is a real though out strategy here.

What makes things worse is that we generally don't see than many new certificates found by this process internally, as we generally expect all of our users use Lemur directly to manage their certificates. This negligence is probably contributes to many of the issues you are seeing.

We weren't sure exactly which parts of this are problems vs intentional so it's mostly questions to start, but we're up front willing to provide PRs if there's agreement that some of this needs changing.

Full disclosure, we're partially responsible for some of this, via prior bug fixes and PRs, before noticing the possible problems.

I'll start here

https://github.com/Netflix/lemur/blob/b83c5eca49756401f150159a6da4b70a7a8d23c7/lemur/sources/service.py#L119-L122

This conditional looks first for a certificate matching the certificate name (presumably the cert CN). But, given that Lemur naturally mangles the name (e.g. "CN-issuer-notBefore-notAfter-[possible hex serial if there's a name conflict]), this check will never match unless the source itself is pre-mangling the names to the Lemur format (we did this, it was a mistake and we undid it), which is incredibly unlikely.

Proposed fix Either pre-mangle the certificate name so we'd be searching for a name that may exist, or scrap the check-by-name entirely and use more reliable check(s)

This is an artifact of the fact that our first source was AWS. The AWS name for a certificate is essentially the pre-mangling you purposed. I think it's reasonable to pre-mangle upon intake such that we have a reasonable baseline for all sources not just AWS. The only concern I'd have is that it could cause some duplication of certificates that already exist. Perhaps we could place pre-mangling behind a flag and document it as such?

After the name check, comes a more concerning issue

https://github.com/Netflix/lemur/blob/b83c5eca49756401f150159a6da4b70a7a8d23c7/lemur/sources/service.py#L124-L129

The get_by_serial queries are done as if serial number were unique, which even between separate Sources/Authorities which it definitely is not (not unique in the DB either, whereas 'name' is). Any given source(s) could start the serial at 1 (openssl does) and as written, exists could end up with certs from completely different sources, just based on overlapping serial numbers.

Under what use case would checking for existing cert legitimately return multiple certs?

Proposed fix When checking for preexisting certs, they should be checked with a combo of issuer and serial number, which would be a reasonably unique combo. Or, perhaps a fingerprint field for certs so they can be universally uniquely identified.

I think including issuer here makes sense and wouldn't have a problem adding that. There appears to be a standard fingerprinting algorithm that sounds like it would be the more complete fix, but would potentially require an alembic task to fingerprint all existing certificates:

https://knowledge.digicert.com/solution/SO4264.html

Looks like we may need to compute both the md5 and sha1 hash, or perhaps just pick one? (sha1 would be my preference)

This leads to these next snips, where the possibility of multiple exists is handled, but as the question above, it's confusing why.

Here, potentially for multiple certs in exists, which may be from other sources, the external_id and authority_id are updated to match the one current cert being processed in the loop.

This would overwrite fields on these certs, without even checking if they're from the same source.

Proposed fix IMHO, a single cert from a source should not have multiple entries in Lemur (should it?). The "exists" checks should be sufficiently constrained that so there are not multiples - or at minimum, really confirm that the entries are all the same cert if there's a use case I'm missing.

https://github.com/Netflix/lemur/blob/b83c5eca49756401f150159a6da4b70a7a8d23c7/lemur/sources/service.py#L137-L148

Generally speaking I agree that one certificate should be linked to multiple sources if it is truly the same. The one historical use case that we've had is that a certificate can be referred to by multiple names. In the past we've dealt with this by simply uploading the same certificate specifying a different name, even though the certificates would have same fingerprint. If I were to start over I would add an aliases column ensuring that only one of each certificate exists (constraint on fingerprint?) which would solve a lot of the problems mentioned.

Which lastly leads to certificate_update(), in which, if the current source is not in the certificate.sources, it is appended to the list - though the association between these was based almost exclusively on finding the cert by serial number, which again is not unique

This may be the most confusing part - the support of multiple sources for one cert. How would this happen? In what scenario would a single cert be (and need to be) from multiple sources?

No proposal here - this seems intentional, I just don't grok it.

https://github.com/Netflix/lemur/blob/b83c5eca49756401f150159a6da4b70a7a8d23c7/lemur/sources/service.py#L42-L50

Multiple sources must be supported for, again, the AWS use case of the certificate existing in multiple aws accounts. Now, you could make the argument that at that point we shouldn't really be associating sources at all and instead associate destinations.

In summary:

  • We should add a fingerprint column, such that we actually can actually determine what is unique.
    • Add alembic migration to backfill existing fingerprints
    • Ensure that don't just fingerprint the certificate itself but also it's chain. Often the certificates chain makes it unique.
  • Once unique is actually foundational we can remove a lot of cruft that syncing attempts to do with it's psuedo-unique values.
  • Explore adding an aliases field and de-duplicating based on fingerprint
    • Be careful to still support query/accessing by name as a lot of automation relies on querying a certificate by it's name (i.e. ?q=name should equal ?q=name&&alias)
  • If we decided to fingerprint the mangling vs pre mangling should be solved by relying on the fingerprint instead of the name. Some logic might need to be added to add the name as an alias to the existing certificate (different sources could have different names for a given certificate). I'm not sure if we really care that we maintain the mapping between source and name/alias.

kevgliss avatar Oct 15 '18 20:10 kevgliss