Automatically renew host SCEP certificates before expiration
Goal
User story |
---|
As an IT admin, |
I want Fleet to automatically renew the SCEP certificates installed on my hosts |
so that my SCEP certificates never expire and I don't have to turn on MDM again for macOS hosts. |
Changes
Product
- [ ] Other changes:
  - 30 days before a host's SCEP cert expires, Fleet sends an InstallProfile command with an enrollment profile. This causes the SCEP certificate to be renewed.
  - If renewal fails, Fleet logs an error and tries again the next day.
Engineering
Context: the Fleet server acts as a CA and issues certificates to devices during MDM enrollment using the SCEP protocol.
The certificate issued to the device has a validity period defined via the mdm.apple_scep_signer_validity_days server config.
After the certificate expires, the server is not able to authenticate the client anymore. See this if you're interested in the details.
[!WARNING] presenting two possible options, we need to choose one before implementing.
Option A: middleware (click to expand)
- [ ] Add a middleware here, called after httpmdm.CertExtractMdmSignatureMiddleware and httpmdm.CertVerifyMiddleware: https://github.com/fleetdm/fleet/blob/cf7b2e9903477a4522602a95594aabf76a67fada/server/service/handler.go#L992-L995
- [ ] In the middleware, you have access to the certificate from the context: https://github.com/fleetdm/fleet/blob/cf7b2e9903477a4522602a95594aabf76a67fada/server/mdm/nanomdm/http/mdm/mdm_cert.go#L101-L106
- [ ] If the certificate expires within 30 days, send an InstallProfile command with an enrollment profile generated by apple_mdm.GenerateEnrollmentProfileMobileconfig. Consider whether it's worth adding a special method to apple_mdm.Commander for the enrollment profile, or whether just using Commander.InstallProfile is good enough.
- [ ] Database schema migrations: Not required
- [ ] Load testing: Not required
Option B: cron job (click to expand)
- [ ] In a cron job, look at certificates that expire within 30 days using the scep_certificates table.
- [ ] For each cert, calculate its checksum like this: https://github.com/fleetdm/fleet/blob/cf7b2e9903477a4522602a95594aabf76a67fada/server/mdm/nanomdm/service/certauth/certauth.go#L100-L105
- [ ] Look for matching hosts using nano_cert_auth_associations
  - [ ] nano_cert_auth_associations.id is the UUID of the host (hosts.uuid)
  - [ ] nano_cert_auth_associations.sha256 is the checksum of the cert
- [ ] Send an InstallProfile command with an enrollment profile generated by apple_mdm.GenerateEnrollmentProfileMobileconfig. Consider whether it's worth adding a special method to apple_mdm.Commander for the enrollment profile, or whether just using Commander.InstallProfile is good enough.
- [ ] Consider if adding indexes and/or pre-computing the sha256 of each certificate might be desirable
- [ ] Database schema migrations: To be defined
- [ ] Load testing: Not required
ℹ️ Please read this issue carefully and understand it. Pay special attention to UI wireframes, especially "dev notes".
Context
- Requestor(s): _________________________
QA
Risk assessment
- Requires load testing: No
- Risk level: Low / High: Low
Manual testing steps
- Configure a value < 30 days for mdm.apple_scep_signer_validity_days when you start your server.
- Turn on MDM features for a new macOS host.
- Trigger the cleanups_then_aggregation job, which should enqueue a cert renewal.
- Verify that the cert is renewed. You can do this by searching for the "Fleet Identity" certificate in Keychain.
- As long as mdm.apple_scep_signer_validity_days is < 30, we'll renew the cert on each cron run. To stop this process, restart the server without the setting set (defaults to 1 year), run the cron again, and verify that the cert issued is for 1 year.
Additional testing
- Do all the steps above, but this time:
- Enable MDM SSO
- Enroll a host via ADE
- After a renewal, check if the enrollment profile still has a query parameter named enrollment_reference
Testing notes
Confirmation
- [ ] Engineer (@____): Added comment to user story confirming successful completion of QA.
- [ ] QA (@____): Added comment to user story confirming successful completion of QA.
> Before the certificate expires, automatically issue an InstallProfile command with an enrollment profile
@roperzh just curious, what happens if the SCEP cert is already expired? Can we not renew the cert w/ an InstallProfile command?
> @roperzh just curious, what happens if the SCEP cert is already expired? Can we not renew the cert w/ an InstallProfile command?
@noahtalerman we can't because we can't send any MDM commands at all! (as we're not able to authenticate the device)
@marko-lisica heads up, I think we want to prioritize this story above #16335 and #11544.
Why? I just realized we have two MDM customers that participated in a beta for macOS MDM features (started in March 2023): customer-zabinski and customer-clara.
customer-zabinski turned on MDM features for some hosts on 2023-03-07
customer-clara turned on MDM features for some hosts on 2023-03-01
I think this means the SCEP certs for these hosts will expire on 2024-03-07 and 2024-03-01 respectively. @roperzh does that sound right to you?
So, I think we're going to want to ship this as part of the first patch release next sprint (Fleet v4.45.1), which falls on 2024-02-26.
> I think this means the SCEP certs for these hosts will expire on 2024-03-07 and 2024-03-01 respectively. @roperzh does that sound right to you?
@noahtalerman the exact date will depend on when the first host turned on MDM features, but yeah, that sounds correct.
@roperzh FYI I moved the original issue description here:
Expected behavior: Fleet automatically attempts to renew SCEP cert 30 days before expiration. If renewal fails, Fleet logs an error and tries again the next day.
Problem
As part of the SCEP protocol, each device owns a certificate that's used for authentication. These certificates expire after one year by default (configurable via the mdm.apple_scep_signer_validity_days setting).
Hosts with expired certificates can't communicate with the MDM server.
For more context, see https://github.com/fleetdm/confidential/issues/4518
Potential solutions
- Before the certificate expires, automatically issue an InstallProfile command with an enrollment profile. This causes the SCEP certificate to be renewed.
Here's the separate story for adding an activity item for SCEP cert renewal: https://github.com/fleetdm/fleet/issues/16671
@georgekarrv heads up, moving this story to "Settled."
We want to ship this as part of the 4.45.1 patch (targeted on 2024-02-26) so that our earliest adopters don't have certs that expire. More context here.
This means we'll give the story a lightspeed label and make an exception to release a feature during a patch.
cc @lukeheath
@noahtalerman
> This means we'll give the story a lightspeed label and make an exception to release a feature during a patch.
We can't release features as part of a patch release because it violates semantic versioning:
"MAJOR version when you make incompatible API changes
MINOR version when you add functionality in a backward compatible manner
PATCH version when you make backward compatible bug fixes"
The distinction comes down to whether this fixes a defect or adds functionality.
If it fixes a defect, we should re-label it as a bug, and we can release it as part of the patch.
If it adds functionality, we need to issue a new minor version. We don't have to wait until the next scheduled minor release, but we would need to introduce an unscheduled minor version bump.
The next scheduled minor version release is v4.45.0 on 02/19, which is well before the first expiration on 03/01, so sticking to our normal schedule seems like the best course.
Oh, I see, this isn't coming into the sprint until v4.45.0, so that doesn't work.
In that case, if needed we'll have to release a mid-sprint minor release. Alternatively, if we host the environments we could look into renewing the SCEPs manually.
Re semver: makes sense. Thanks Luke.
> if needed we'll have to release a mid-sprint minor release. Alternatively, if we host the environments we could look into renewing the SCEPs manually.
I think we're hosting customer-clara. I think customer-zabinski is self-hosted.
I think releasing a mid-sprint minor release would be a better experience for both customers.
We only have to notify these customers because they're the only customers approaching cert expiration.
I scheduled a call for you, me, @pintomi1989, and @Patagonia121 to align/discuss.
Sounds good @noahtalerman and @lukeheath - I think I understand the ask and I already posted in the appropriate channels. I'll confirm with @pintomi1989 tomorrow. We're happy to hop on a call to go through any additional context, if any. Thanks!
Hey team! Please add your planning poker estimate with Zenhub @ghernandez345 @gillespi314 @mna @roperzh
Completed manual testing steps with both manual and automatic enrollment.
@roperzh did we go with option A (middleware) or option B (cron job)?
Renewal automatic, In cloud city, no panic, Fleet's magic, no static.