Escrow Buddy doesn't work after Apple MDM is turned off and then turned back on

Open spalmesano0 opened this issue 3 months ago • 17 comments

Fleet version: 4.72.0

Web browser and operating system: N/A


💥  Actual behavior

customer-mozartia reported an encryption issue on a macOS host:

Today, I worked with a user who experienced an issue with disk encryption on their device. FileVault was enabled, but the verification status remained in Pending.

  • 11 days had passed since the device enrollment.
  • The user restarted the device multiple times, attempting to clear the banner on the My Device page and trigger encryption verification.
  • I reviewed the logs from the device, which showed that Escrow Buddy attempted several times but failed.
  • We re-enrolled the device, and encryption is now working as expected.

After reviewing the logs, it appears that Escrow Buddy was not properly installed.

2025-08-18T12:32:18+01:00 ERR running config receivers error="failed to re-enable Escrow Buddy in the authorization database, err: exit status 127: sh: /Library/Security/SecurityAgentPlugins/Escrow Buddy.bundle/Contents/Resources/AuthDBSetup.sh: No such file or directory\n"

🛠️ To fix

When MDM is turned off and then turned back on, Fleet should adjust FileVault profiles and redeploy them so they work with the new SCEP CA.

If a Fleet instance already has a bad FileVault profile when MDM is turned on, make sure the profile gets updated or recreated.

If needed, we can also delete the profile when MDM is turned off.

Product designer: @marko-lisica

🧑‍💻  Steps to reproduce

  1. Enable disk encryption in Fleet, and make sure keys are escrowed.
  2. Turn off Apple MDM globally, in settings
  3. Turn it back on
  4. See that disk encryption key is not escrowed

spalmesano0 avatar Sep 16 '25 20:09 spalmesano0

QA Notes:

I was able to reproduce with the following steps:

  1. Manually encrypt host
  2. Move to Team with Encryption ON
  3. Confirm banner appears and confirm escrow buddy is present
ls -la "/Library/Security/SecurityAgentPlugins/Escrow Buddy.bundle/Contents/Resources/AuthDBSetup.sh"

-rwxr-xr-x  1 root  wheel  2825 Jun 11  2023 /Library/Security/SecurityAgentPlugins/Escrow Buddy.bundle/Contents/Resources/AuthDBSetup.sh
  4. Delete EscrowBuddy

sudo rm "/Library/Security/SecurityAgentPlugins/Escrow Buddy.bundle/Contents/Resources/AuthDBSetup.sh"

  5. Observe Disk Encryption banner persists after several logoffs or restarts
  6. Check Orbit logs and observe the error: error="failed to re-enable Escrow Buddy in the authorization database, err: exit status 127: sh: /Library/Security/SecurityAgentPlugins/Escrow Buddy.bundle/Contents/Resources/AuthDBSetup.sh: No such file or directory"

Alternatively, you could delete the key from the host_disk_encryption_keys table to get into a similar state.

PezHub avatar Sep 19 '25 04:09 PezHub

@lukeheath Adding P1 as customer-fiorella is having an issue with FileVault PRK escrow on 100+ hosts that looks very similar to this issue. Slack thread for context. cc @ddribeiro

spalmesano0 avatar Oct 14 '25 21:10 spalmesano0

@spalmesano0 Thanks, since this seems to be blocking the encryption workflow a P1 makes sense to me. cc @georgekarrv

lukeheath avatar Oct 15 '25 15:10 lukeheath

@spalmesano0 do we know what OS version this was observed on? And was it observed right after an OS upgrade? We are wondering if this is related to or limited to Tahoe

JordanMontgomery avatar Oct 15 '25 16:10 JordanMontgomery

@JordanMontgomery it's not Tahoe related as far as I can tell: logs are from August 2025 and earlier. I don't think this was after an OS upgrade, but it was after enrollment.

spalmesano0 avatar Oct 15 '25 16:10 spalmesano0

After doing some digging with @JordanMontgomery, we don't think a corrupted Escrow Buddy installation is the error at play here.

Based on the logs the customers provided, Escrow Buddy failed once and then successfully updated and recovered, which is how it's supposed to work. So we switched direction: we will investigate whether the key is decryptable, and go through the server logs and some tables to verify state.

MagnusHJensen avatar Oct 17 '25 14:10 MagnusHJensen

After a couple hours of DB and code spelunking I can confirm the issue being experienced by customer-fiorella is unrelated to Escrow Buddy but is in fact a sort of strange interaction with how our Filevault key escrow is implemented when combined with how our overall MDM enablement/disablement works.

Some background on how our filevault key escrow works: When Disk Encryption is enabled for a team and Apple MDM is enabled, Fleet creates a filevault profile including, among other things, the public key of Fleet's Apple MDM SCEP CA. This public key is used to encrypt the recovery keys so that only the fleet server can decrypt them. Everything works well here as long as the Fleet server maintains the same private key.

When an admin turns off Apple MDM, Fleet deletes the SCEP CA keypair, along with a few other things like the APNS certificate. It doesn't delete any Apple profiles (entries in the mdm_apple_configuration_profiles table). So there is then, potentially, a FileVault profile sitting in the DB with an old public key for which the server has no private key. If Apple MDM is re-enabled, a new keypair is generated. Any teams with disk encryption enabled (or even disabled and re-enabled) after this get a FileVault profile including the appropriate public key. However, teams whose profile was created earlier keep the old FileVault profile, and any host that attempts to escrow a key using it will escrow a key that the Fleet server cannot decrypt.

In customer-fiorella's case, Team 1 and no-team have an old profile created before MDM was disabled and re-enabled in April, while Teams 5 and 11 have a good post-enablement profile.
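
The mismatch described above can be illustrated with a minimal Python sketch. It assumes each FileVault profile can be reduced to the bytes of the certificate embedded in it, and it compares a SHA-256 fingerprint of those bytes against the current SCEP CA certificate. The `Profile` type, the `stale_profile_teams` helper, the placeholder certificate bytes, and the use of `team_id` 0 for "no team" are all hypothetical; this is not how Fleet itself models or checks profiles.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Hypothetical, simplified view of a Fleet-created FileVault profile."""
    team_id: int          # 0 stands in for "no team" (assumption)
    embedded_cert: bytes  # certificate bytes embedded in the mobileconfig


def fingerprint(cert: bytes) -> str:
    """SHA-256 fingerprint of the raw certificate bytes."""
    return hashlib.sha256(cert).hexdigest()


def stale_profile_teams(profiles, current_ca_cert: bytes) -> list[int]:
    """Return team IDs whose profile embeds a cert other than the current CA's."""
    current = fingerprint(current_ca_cert)
    return sorted(
        p.team_id for p in profiles if fingerprint(p.embedded_cert) != current
    )


# Placeholder bytes; real values would be DER- or PEM-encoded certificates.
old_ca = b"placeholder: CA cert before MDM was turned off"
new_ca = b"placeholder: CA cert after MDM was turned back on"

profiles = [
    Profile(team_id=0, embedded_cert=old_ca),
    Profile(team_id=1, embedded_cert=old_ca),
    Profile(team_id=5, embedded_cert=new_ca),
    Profile(team_id=11, embedded_cert=new_ca),
]

print(stale_profile_teams(profiles, new_ca))
```

With the layout described for this customer, the sketch flags team 1 and "no team" and leaves teams 5 and 11 alone.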

To fix this for the customer in a timely manner, we can run a few DB queries to fix the broken profiles (on team id=1 and "no team" according to my DB queries), something along the lines of:

UPDATE mdm_apple_configuration_profiles
SET mobileconfig = (SELECT mobileconfig FROM mdm_apple_configuration_profiles WHERE team_id=11 AND identifier='com.fleetdm.fleet.mdm.filevault'),
    checksum = (SELECT checksum FROM mdm_apple_configuration_profiles WHERE team_id=11 AND identifier='com.fleetdm.fleet.mdm.filevault'),
    uploaded_at = (SELECT uploaded_at FROM mdm_apple_configuration_profiles WHERE team_id=11 AND identifier='com.fleetdm.fleet.mdm.filevault')
-- Scope the update to the broken FileVault profiles only; assumes "no team" is stored as team_id 0.
WHERE identifier='com.fleetdm.fleet.mdm.filevault' AND team_id IN (0, 1);

We need to do some testing but I believe something along these lines will work.
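
As a way to rehearse the repair before touching production, here is a hedged sketch that runs the same idea against an in-memory SQLite database using Python's standard `sqlite3` module. The table is cut down to the columns named in the query, all row values are placeholders, the customer profile identifier is invented, and storing "no team" as `team_id` 0 is an assumption. It reads the good row first and passes its values as parameters, which avoids selecting from the table being updated (something MySQL generally does not allow in a direct subquery), and it scopes the update to the FileVault identifier so other profiles stay untouched.

```python
import sqlite3

FILEVAULT_ID = "com.fleetdm.fleet.mdm.filevault"

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE mdm_apple_configuration_profiles (
        team_id INTEGER, identifier TEXT, mobileconfig TEXT,
        checksum TEXT, uploaded_at TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO mdm_apple_configuration_profiles VALUES (?, ?, ?, ?, ?)",
    [
        (0, FILEVAULT_ID, "old-ca-profile", "old-sum", "uploaded-old"),
        (1, FILEVAULT_ID, "old-ca-profile", "old-sum", "uploaded-old"),
        (11, FILEVAULT_ID, "new-ca-profile", "new-sum", "uploaded-new"),
        # Hypothetical customer profile that must not be modified.
        (1, "com.example.customer", "customer-profile", "cust-sum", "uploaded-cust"),
    ],
)

# Read the known-good FileVault profile (team 11 in this thread).
good = conn.execute(
    "SELECT mobileconfig, checksum, uploaded_at "
    "FROM mdm_apple_configuration_profiles "
    "WHERE team_id = ? AND identifier = ?",
    (11, FILEVAULT_ID),
).fetchone()

# Overwrite only the stale FileVault profiles on "no team" (0) and team 1.
cur = conn.execute(
    "UPDATE mdm_apple_configuration_profiles "
    "SET mobileconfig = ?, checksum = ?, uploaded_at = ? "
    "WHERE identifier = ? AND team_id IN (0, 1)",
    (*good, FILEVAULT_ID),
)
conn.commit()
updated = cur.rowcount

result = conn.execute(
    "SELECT team_id, identifier, mobileconfig "
    "FROM mdm_apple_configuration_profiles ORDER BY team_id, identifier"
).fetchall()
print(updated, result)
```

Checking the affected row count before committing is a cheap guard: anything other than the number of teams being repaired suggests the `WHERE` clause is wrong.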

Longer term, in the product, we need to add code to delete certain Fleet-created profiles when MDM is disabled. We should leave customer profiles alone, but any profiles that depend on the CA should be deleted. Likewise, we should review our CA usage for any other changes we may need to make when the assets (keypair) are deleted.
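
The selection rule for that cleanup can be sketched in a few lines of Python. Only the FileVault identifier comes from this thread; the `CA_DEPENDENT_IDENTIFIERS` set, the `profiles_to_delete_on_mdm_disable` helper, the tuple layout, and the customer identifier are hypothetical, and Fleet's real implementation may look quite different.

```python
# Identifiers of Fleet-created profiles that embed the SCEP CA public key.
# Only the FileVault identifier is taken from this thread; the set is illustrative.
CA_DEPENDENT_IDENTIFIERS = frozenset({"com.fleetdm.fleet.mdm.filevault"})


def profiles_to_delete_on_mdm_disable(profiles):
    """Return the (team_id, identifier) pairs that depend on the CA.

    Customer-uploaded profiles are never selected.
    """
    return [p for p in profiles if p[1] in CA_DEPENDENT_IDENTIFIERS]


existing = [
    (0, "com.fleetdm.fleet.mdm.filevault"),
    (1, "com.fleetdm.fleet.mdm.filevault"),
    (1, "com.example.customer.wifi"),  # hypothetical customer profile
]

to_delete = profiles_to_delete_on_mdm_disable(existing)
print(to_delete)
```

Keeping the list of CA-dependent identifiers explicit makes it easy to extend if the CA usage review turns up other profiles that embed the key.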

JordanMontgomery avatar Oct 17 '25 20:10 JordanMontgomery

@spalmesano0 @lukeheath customer-fiorella has updated us here https://fleetdm.slack.com/archives/C08KYP9L8SY/p1761063641394089?thread_ts=1759530884.384839&cid=C08KYP9L8SY that the fix above has seemingly resolved this issue. As the actual root cause and fix is very different from the issue title, after discussing in standup we would like to take the following course of action:

  1. Remove the P1 label
  2. Tag with :product for input from @marko-lisica
  3. Remove from current MDM sprint until we get some product input

We've certainly already put in the 5-point estimate's worth of effort just to triage, diagnose, reproduce, follow up with the customer, and fix, so I suspect the actual longer-term code fix should be re-estimated by the team now that we understand this was not an Escrow Buddy issue.

thoughts?

JordanMontgomery avatar Oct 21 '25 16:10 JordanMontgomery

Hey @noahtalerman, it turned out this wasn't an Escrow Buddy problem but a side effect of turning Apple MDM off, because profiles aren't removed from the DB. When Apple MDM was turned on again, the old profile and FileVault certificate were wrong, so Escrow Buddy couldn't escrow the key.

I removed the P1 label.

I added repro steps to reflect what actually happens and changed the title.

I think we can file a feature request to improve this behavior when Apple MDM is turned off. Wdyt?

marko-lisica avatar Oct 21 '25 17:10 marko-lisica

@JordanMontgomery sounds good to me and thank you for the work on this!

spalmesano0 avatar Oct 21 '25 20:10 spalmesano0

@marko-lisica thanks! Agreed this doesn't deserve P1 but I think it's still a bug.

I think we should be able to turn Apple MDM off and on as much as we want. And, MDM features should work as expected.

noahtalerman avatar Oct 22 '25 16:10 noahtalerman

@noahtalerman makes sense. I think we can just solve this problem where the disk encryption profile is invalid after MDM is turned back on. Will specify to fix.

marko-lisica avatar Oct 23 '25 11:10 marko-lisica

When MDM is turned off and then turned back on, Fleet should adjust FileVault profiles and redeploy them so they work with the new SCEP CA.

@JordanMontgomery I specified this as a fix. Does the end user need to take any action for the keys to be escrowed (restarting the computer)?

marko-lisica avatar Oct 23 '25 11:10 marko-lisica

@marko-lisica No. If we implement this fix, the user will not need to take any additional action (outside of our usual flows), since they'll have to re-enroll all devices in MDM anyway.

Existing customers that hit this, like customer-fiorella, should be able to direct their users to follow the instructions when prompted by the "My device" page to resolve things, but I suspect there are few or no other customers in this state.

JordanMontgomery avatar Oct 27 '25 17:10 JordanMontgomery

After looking into this, I can see that when Apple MDM is turned off and then back on, a new FileVault profile is sent to the host, so it seems Fleet is creating and redeploying the new profiles so they work with the new SCEP CA. Despite this, I am not able to confirm that the disk encryption feature still works after this flow. I'm not able to retrieve the disk encryption key, and it seems it is no longer being escrowed into Fleet. Below I have summarized the flow I am seeing:

  • The initial flow of enrolling a host and enabling disk encryption works. I can see the disk encryption key is escrowed into Fleet, and I am able to retrieve it via the Fleet UI or API.
  • After turning off and on Apple mdm and re-enrolling the same host into Fleet MDM (without disabling disk encryption first), the device shows all expected profiles (including the new file vault profile) but Fleet continues to show the “enable disk encryption” banner. I have confirmed that file vault is still enabled and managed by Fleet on the host.
Image
  • The original escrowed key disappears from host_disk_encryption_keys, the base64_encrypted column is now empty, and attempts to get the key via the UI or API return “resource not found”.
Image
  • Orbit logs on the host report failed to re-enable Escrow Buddy in the authorization database.
Image
  • Restarting the host and refetching, even after a few hours, doesn't resolve the issue. At this point it seems the disk encryption feature is no longer working (Fleet users can no longer get the disk encryption key via the UI or API).
  • The old key is still stored in the host_disk_encryption_keys_archive table

ghernandez345 avatar Dec 09 '25 12:12 ghernandez345

@MagnusHJensen I'm going to move this back to ready status, as the behaviour I'm seeing is similar to what the original bug describes.

ghernandez345 avatar Dec 09 '25 12:12 ghernandez345

This looks good in my testing. Turned MDM on, verified the profile existed, turned it back off, turned it on again, verified via the DB that the profile was updated, enrolled an encrypted host, logged out, and verified the encryption key got archived.
Image

JordanMontgomery avatar Dec 12 '25 17:12 JordanMontgomery

FileVault's glow dims, Fleet adjusts, redeploying, Secure keys resurge.

fleet-release avatar Dec 19 '25 23:12 fleet-release