DDM profiles are taking a long time to verify after editing
Fleet version: 4.66.0
Web browser and operating system: macOS
💥 Actual behavior
DDM profiles take an unusually long time to go from verifying to verified, often well over an hour.
🧑‍💻 Steps to reproduce
- Create a DDM profile, deploy it to a device, and observe status in Fleet UI
- Once it is verified, update the profile with a new setting and re-deploy (via GitOps)
- Observe that the profile takes a long time to go from verifying to verified
🕯️ More info (optional)
Clicking resend on a mobileconfig configuration profile moves its status from verifying to verified quickly (within seconds or minutes). DDM configuration profiles do not behave this way.
More info in this Slack thread: https://fleetdm.slack.com/archives/C03C41L5YEL/p1744048572372529
🛠️ To fix
Product designer @marko-lisica:
After the declaration profile is edited (via GitOps), the profile should be verified as soon as we get a status response from the host. AFAIK the host automatically sends a status report when a declaration is added or edited.
Slack thread screenshot (so we don't lose information)
- Create a DDM profile, deploy it to a device, and observe status in Fleet UI
- Once it is verified, update the profile with a new setting and re-deploy (or click resend)
- Observe that the profile takes a long time to go from verifying to verified
@noahtalerman @allenhouchins "(or click resend)" this part confuses me. We have a story to disable the Resend button for declaration profiles, since resending a declaration isn't possible. I think we should remove that from the repro steps.
If I understand correctly, the problem is that when the declaration profile is updated (an additional password requirement is added), it doesn't get verified. It should, because the profile is different and the host should send a DDM status report back as soon as something changes.
The request was for verification in under 5 minutes for a normal online device.
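The expected behavior described above can be sketched roughly as follows. This is only an illustration, not Fleet's implementation: `DeclarationStatus`, `reconcile`, and the status strings are hypothetical names, and the sketch assumes the host's status report lists each declaration with its identifier, the server token it applied, and whether it is active and valid.

```python
from dataclasses import dataclass


@dataclass
class DeclarationStatus:
    """One declaration entry from a host status report (hypothetical shape)."""
    identifier: str
    server_token: str
    active: bool
    valid: str  # assumed values: "valid", "invalid", "unknown"


def reconcile(expected, report):
    """Map each expected declaration identifier to a status string.

    expected: dict of identifier -> token of the latest version on the server.
    report:   list of DeclarationStatus entries sent by the host.
    """
    reported = {d.identifier: d for d in report}
    result = {}
    for identifier, token in expected.items():
        status = reported.get(identifier)
        if status is None or status.server_token != token:
            # Host has not reported the edited declaration yet.
            result[identifier] = "verifying"
        elif status.active and status.valid == "valid":
            # Host reports the current version as applied.
            result[identifier] = "verified"
        else:
            result[identifier] = "failed"
    return result
```

Under these assumptions, an edited declaration would move to verified on the first status report that carries the new token, rather than waiting for a later periodic check.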
Had to kick this back out of the sprint to make room for https://github.com/fleetdm/fleet/issues/24475
This state is easy to reproduce: delete and immediately add back the same profile while the device is offline.
@georgekarrv Marked as P2 per this slack conversation: https://fleetdm.slack.com/archives/C03C41L5YEL/p1747088827683919
DDM flows are complicated and hard to maintain. Filed an eng-initiated story to improve the situation: #29132
QA Test results:
I ran through two scenarios in 4.68 and was able to reproduce the bug:
- Delete then re-add the DDM profile while the host is offline
- GitOps workflow mentioned in the bug details
✅ Once I applied the patch, the profiles went back to verified within a few seconds in both scenarios.
Leaving in Awaiting QA until load tests are completed
Discovered an issue with DDM profiles and osquery_perf hosts, so we could not perform load tests. We have a fix planned for next sprint (#29973), and I plan to revisit this then.
In the cloud city's glow, Profiles verify with ease, Swift like the wind's flow.