DDM profiles are taking a long time to verify after editing

Open allenhouchins opened this issue 1 year ago • 3 comments

Fleet version: 4.66.0

Web browser and operating system: macOS


💥  Actual behavior

It is taking an unusually long time for DDM profiles to go from verifying to verified. Often well beyond an hour.

🧑‍💻  Steps to reproduce

  1. Create a DDM profile, deploy it to a device, and observe status in Fleet UI
  2. Once it is verified, update the profile with a new setting and re-deploy (via GitOps)
  3. Observe that the profile takes a long time to go from verifying to verified

🕯️ More info (optional)

Clicking resend on a mobileconfig configuration profile moves it from verifying to verified quickly (seconds to minutes). DDM configuration profiles do not respond this quickly.

More info in this Slack thread: https://fleetdm.slack.com/archives/C03C41L5YEL/p1744048572372529

🛠️ To fix

Product designer @marko-lisica:

After a declaration profile is edited (via GitOps), it should be marked verified as soon as we get a status response from the host. AFAIK the host automatically sends a status report when a declaration is added or edited.
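
A minimal sketch of the expected behavior, not Fleet's actual implementation: when a status report arrives, a declaration in the verifying state moves to verified only if the report refers to the version the server currently expects. All names here (`HostDeclaration`, `apply_status_report`, the `server_token` field, and the simplified `(token, active, valid)` report shape) are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    VERIFYING = "verifying"
    VERIFIED = "verified"
    FAILED = "failed"


@dataclass
class HostDeclaration:
    # Hypothetical record of one declaration on one host.
    identifier: str
    server_token: str  # assumed: identifies the version the server expects
    status: Status = Status.PENDING


def apply_status_report(decl, reported):
    """Update a declaration's status from a host status report.

    `reported` is a simplified, assumed shape:
    {identifier: (server_token, active, valid)}.
    """
    entry = reported.get(decl.identifier)
    if entry is None:
        # The host has not reported on this declaration yet.
        return decl.status
    token, active, valid = entry
    if token != decl.server_token:
        # The report describes an older version of the profile; keep waiting.
        return decl.status
    decl.status = Status.VERIFIED if (active and valid) else Status.FAILED
    return decl.status
```

In this sketch, an edited profile gets a new token, so a stale report for the previous version leaves it in verifying, while the first report carrying the new token verifies it immediately.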

allenhouchins avatar Apr 08 '25 14:04 allenhouchins

Slack thread screenshot (so we don't lose information)

Image

marko-lisica avatar Apr 09 '25 11:04 marko-lisica

  1. Create a DDM profile, deploy it to a device, and observe status in Fleet UI
  2. Once it is verified, update the profile with a new setting and re-deploy (or click resend)
  3. Observe that the profile takes a long time to go from verifying to verified

@noahtalerman @allenhouchins The "(or click resend)" part confuses me. We have a story to disable the Resend button for declaration profiles, since resending isn't possible for them. I think we should remove that from the repro steps.

If I understand correctly, the problem is that when the declaration profile is updated (for example, an additional password requirement is added), it doesn't get verified. It should, because the profile is different and the host should send a DDM status report back as soon as something changes.

marko-lisica avatar Apr 09 '25 11:04 marko-lisica

The request was for verification in under 5 minutes for a normal online device.

georgekarrv avatar Apr 09 '25 17:04 georgekarrv

Had to kick this back out of the sprint to make room for https://github.com/fleetdm/fleet/issues/24475

georgekarrv avatar Apr 18 '25 15:04 georgekarrv

This state is easy to reach: delete a profile and immediately add the same profile back while the device is offline. Image

getvictor avatar May 09 '25 22:05 getvictor

@georgekarrv Marked as P2 per this slack conversation: https://fleetdm.slack.com/archives/C03C41L5YEL/p1747088827683919

getvictor avatar May 13 '25 15:05 getvictor

DDM flows are complicated and hard to maintain. Filed an eng-initiated story to improve the situation: #29132

getvictor avatar May 14 '25 14:05 getvictor

QA Test results:

I ran through two scenarios in 4.68 and was able to reproduce the bug:

  1. Delete then re-add the DDM profile while the host is offline
  2. GitOps workflow mentioned in the bug details

✅ Once I applied the patch, the profiles went back to verified within a few seconds in both scenarios.

Leaving in Awaiting QA until load tests are completed

PezHub avatar May 29 '25 19:05 PezHub

Discovered an issue with DDM profiles and osquery_perf hosts, so we could not perform load tests. We have a fix planned for next sprint (#29973), and I plan to revisit this afterward.

PezHub avatar Jun 13 '25 16:06 PezHub

In the cloud city's glow, Profiles verify with ease, Swift like the wind's flow.

fleet-release avatar Jul 15 '25 15:07 fleet-release