fleet Policy automations: install software

Goal

User story
As an IT admin,
I want to install software automatically when a host fails a policy
so that I can deploy software to many hosts without having to use a third-party automation tool (e.g. Tines).

Context

Requestor(s): @nonpunctual
Product designer: @marko-lisica

Changes

Product

[ ] UI changes: Figma link
[ ] CLI usage changes: Figma link
[ ] REST API changes: #21418
[ ] Reference documentation changes: TODO (Document the new "No team" file and how policies and software for No team are specified in it.)
[ ] Changes to paid features or tiers: Available to Fleet Premium users

Engineering

[ ] Usage documentation changes: How/where Fleet extracts name and version from packages. This way, if the IT admin hits this error they can understand why Fleet can't get the version and know how to fix their package.
[ ] Database schema migrations: TODO
[ ] Load testing: TODO

ℹ️ Please read this issue carefully and understand it. Pay special attention to UI wireframes, especially "dev notes".

QA

@noahtalerman:

If an App Store app is installed and then later uninstalled on an iOS/iPadOS host, check to make sure it doesn't show up on that Host's host details page anymore and the software counts are updated accordingly.

Load test

The osquery-perf agents are able to simulate software installation. They have a 5% fail rate by default. See cmd/osquery-perf/README.md for how to adjust the pre/install/post fail probabilities. Once installed, the software will show up on the host with the next refetch.

Given a ~100 MB install package, try to automatically install software on 100,000 hosts.
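For a rough sense of scale, the load test numbers above can be worked out as follows. This is only back-of-the-envelope arithmetic: it assumes decimal units and that the default 5% fail rate applies once per host, neither of which the issue states.

```python
# Assumptions (not from the issue): 1 TB = 1,000,000 MB, and the default 5%
# osquery-perf fail rate is applied once per simulated host.
PACKAGE_MB = 100      # "~100 MB install package"
HOSTS = 100_000       # target host count for the load test
FAIL_RATE = 0.05      # osquery-perf default fail rate

total_tb = PACKAGE_MB * HOSTS / 1_000_000
expected_failures = round(HOSTS * FAIL_RATE)

print(f"total package transfer: about {total_tb:.0f} TB")
print(f"expected simulated failures: about {expected_failures}")
```

So the test moves on the order of 10 TB of installer downloads and should see roughly 5,000 simulated failures if the defaults are left alone.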

Jun 05 '24 22:06 nonpunctual

Hi @noahtalerman @nonpunctual ,

Not much to add to this one. For context:

We will use labels to group hosts
We would like to upload software without any teams related to it, aka "no team"
We would like to be able to say: "I want the hosts that have these labels to have access to this list of apps"

Since we are the source of truth for labels, a workaround could be to use your API to do software installs based on our internal labels.

Jun 18 '24 08:06 valentinpezon-primo

Contributes to parity with Jamf

Jun 20 '24 20:06 noahtalerman

Thanks @valentinpezon-primo for the info!

Jun 20 '24 22:06 noahtalerman

Converting this issue to story format and moving original description here:

Organizations may have the need to install applications based on:

role
persona
job title
department
organizational unit
LDAP group
etc...

i.e., a grouping of Hosts or end users that does not align to a Team in Fleet.

Scenario:

Customer-preston does not or can't use Teams in Fleet
- They would like applications to be assigned to "No Team"
- (see: https://github.com/fleetdm/fleet/issues/19550)

If we do this, the only options for application install in the case where a customer does not use Teams would be:

install apps for every device in the fleet (i.e., "No Team")
install apps for 0 devices in the fleet (i.e., applications would not be assignable)

Problem

If applications can only be assigned to a Team, multiple Teams, "All Teams" or "No Team", how would a Fleet customer make an application assignment from the list above that is not aligned with a Team?

Potential solutions

Allow applications to be assigned to Hosts that match a Label.

Jul 16 '24 07:07 marko-lisica

For checking a host's version using osquery queries:

Noah: Might need to use some CAST("bundle_identifier" AS INT)
- https://github.com/fleetdm/fleet/issues/15962
- https://github.com/osquery/osquery/pull/8168

Marko: I found these as well:
- https://osquery.readthedocs.io/en/stable/introduction/sql/#:~:text=Collations-,version,-%3A
- Version compare function
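To illustrate why a plain string comparison is not enough for the version check discussed above, here is a minimal Python sketch. It is not osquery SQL and not Fleet's implementation; `parse_version` and `is_older` are hypothetical helpers that assume purely numeric, dot-separated versions.

```python
def parse_version(version: str) -> tuple:
    """Split a dotted version such as "14.2.1" into a tuple of integers.

    Assumes every part is numeric; suffixes like "1.0-beta" are out of scope.
    """
    return tuple(int(part) for part in version.split("."))


def is_older(installed: str, required: str) -> bool:
    """Return True if the installed version is lower than the required one."""
    return parse_version(installed) < parse_version(required)


# Comparing the raw strings gets this wrong: "9.1" sorts after "10.2".
print("9.1" < "10.2")           # False
print(is_older("9.1", "10.2"))  # True
```

Note that this sketch treats "1.2" as older than "1.2.0" because the shorter tuple compares lower; a real check would need to decide how to handle that case.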

Jul 25 '24 15:07 noahtalerman

If you find some time you can record feedback on the UI that we didn't look at (labels badge on software details and advanced options modal). I would like to hear more about why you think we want to split the pending status into pending install and pending "verification".

I believe it would be a better experience if we could manage to refetch host info if online and know right away if the host has software + if we can update software inventory together with host refetch, so counts are matching.

Hey @marko-lisica, I recorded a Loom video w/ feedback and thoughts on the above here (internal).

Jul 25 '24 20:07 noahtalerman

TODO @noahtalerman : Merge in software API changes when we ship this story so that API is up to date, then add install, labels_exclude_any and labels_include_any to POST /api/v1/fleet/software/batch and POST /api/v1/fleet/spec/teams
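As a rough illustration of how labels_include_any and labels_exclude_any might scope an install, here is a small Python sketch. The semantics are an assumption inferred from the parameter names (a host needs at least one "include" label and none of the "exclude" labels); the issue does not spell them out, and `host_in_scope` is a hypothetical helper.

```python
def host_in_scope(host_labels, labels_include_any=(), labels_exclude_any=()):
    """Decide whether a host is targeted, under the assumed semantics.

    - labels_include_any: if non-empty, the host must have at least one.
    - labels_exclude_any: the host must have none of these.
    """
    labels = set(host_labels)
    if labels_include_any and not labels & set(labels_include_any):
        return False
    if labels & set(labels_exclude_any):
        return False
    return True


print(host_in_scope(["Engineering", "arm64"], labels_include_any=["Engineering"]))
print(host_in_scope(["Engineering", "Contractors"], labels_exclude_any=["Contractors"]))
```

With no labels configured, every host is in scope, which matches installing for every host on the team.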

Jul 31 '24 17:07 marko-lisica

TODO @noahtalerman:

Add wireframes/specs for adding new statuses +"Verified" logic for App Store apps
What happens to the status when the app is deleted by the end user (or some script)?

cc @marko-lisica

Jul 31 '24 18:07 noahtalerman

Consensus from/with @jacobshandling during our estimation meeting:

FE: ~21 total = 1-2 (Free) + 8 (Add software) + 5 (Software details) + 2-3 (Options modal) + 2 (Host software)

Jul 31 '24 19:07 RachelElysia

@RachelElysia, @jacobshandling, I updated the BE sub-tasks according to what we discussed. TODO: agree between you on a proper division of #20897 into (at least) two sub-tasks to be developed by both of you.

Jul 31 '24 20:07 sharon-fdm

Maturity review notes:

Brock: Other products allow me to apply multiple layers of filters. So I can start with a group that is Mac Sonoma devices only. Then within that group, I can create additional subsets, like arm64, for example.

Noah: To do that with this feature, you'd have to create an individual label for every combination you want.

Brock: The way this feature is typically used is to have many layers of subsets, and while creating individual labels is maybe possible, it would be painful and complicated to implement.

Noah: It sounds like we'll need to revisit in a future iteration to think about layers of filters.

Noah: Two things we’re missing:

Include all (scoping on top of scoping)
Not in other products but would make us better: What am I scoping this to? When I choose labels what hosts are actually going to get software? Especially in the “include all” scenario

Aug 02 '24 16:08 lukeheath

FYI @marko-lisica and @getvictor, I met w/ @lucasmrod and @gillespi314 during design review and we made several decisions re the software verification loop:

Wait for the next refetch when an app goes to "pending" in this iteration (instead of triggering a refetch).

We don't have this functionality to refetch a set of hosts elsewhere in the app (yet)
The plan is to learn if this makes the install flow too slow. If it does then we can ship an improvement later.
In this iteration, the IT admin can hit "Refetch" on the Host details page (and in the API) to speed up install for a specific host.
Sarah: When we decide to make an improvement, instead of triggering a refetch, we could update the pre-install condition to include a check for the presence of software. This would also speed things up.

Screenshot 2024-08-07 at 9 48 53 AM

Add logic to retry installing the software once if it's in "verifying" state and then we find that software is missing or an older version is installed. If we've retried already, we'll move the software to failed with a different error message.

Screenshot 2024-08-07 at 9 52 48 AM

Screenshot 2024-08-07 at 10 21 05 AM

This way, we don't infinitely try to install the software and we notify the IT admin when something is wrong. For example, the IT admin messed up the install script so that it always exits w/ 0 but the app never installs.
We'll reset the retry count if a GitOps user runs the GitHub action (fleetctl gitops) because the IT admin may have made a change to the install script or edited some other option for the "failed" app. We want the app to go back to "pending."
Later when we get to the "Edit software" story (#20404) we'll also want to reset the retry count so that the IT admin can change the install script and Fleet tries again.
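The retry-once behavior described above can be sketched as a small state machine. This is an illustration under assumptions, not the actual Fleet implementation: the `InstallState` type, the function names, the "verified" status string, and the error text are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class InstallState:
    status: str            # "pending", "verifying", "verified", or "failed"
    retried: bool = False  # True once the single retry has been used
    error: str = ""


def on_refetch(state: InstallState, software_ok: bool) -> InstallState:
    """Advance a "verifying" install after the host's next refetch.

    software_ok is True when the expected version (or newer) was found.
    """
    if state.status != "verifying":
        return state
    if software_ok:
        return InstallState("verified", state.retried)
    if not state.retried:
        # Software missing or older version installed: retry exactly once.
        return InstallState("pending", retried=True)
    return InstallState("failed", True, "software still missing or outdated after retry")


def on_gitops_run(state: InstallState) -> InstallState:
    """Reset the retry count so a "failed" app goes back to "pending"."""
    if state.status == "failed":
        return InstallState("pending")
    return state
```

For example, a host whose install script exits with 0 but never installs the app would go verifying, pending (retry), verifying, and then failed, and would stay failed until the next GitOps run resets it.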

Aug 07 '24 17:08 noahtalerman

Hey @xpkoala heads up that I added this note to the QA section to make sure we're testing it as part of this story.

Also, more generally, do we fill out the QA section in stories? I've noticed the template is usually left alone. I just removed the template in this story.

If an App Store app is installed and then later uninstalled on an iOS/iPadOS host, check to make sure it doesn't show up on that Host's host details page anymore and the software counts are updated accordingly.

Aug 07 '24 22:08 noahtalerman

@noahtalerman What about VPP licenses? If a user deletes a VPP app or wipes the host, should we release the license?

Aug 08 '24 11:08 getvictor

What about VPP licenses? If a user deletes a VPP app or wipes the host, should we release the license?

@getvictor thanks for being loud about this. To stay focused and get #19551 shipped, I think we can follow up and add this next sprint. Here's the request for it:

#20729

Thoughts?

I'm assuming it's not harder to add this later vs. now, and that adding it now would add a significant amount of work. I think I would rather get to some bugs.

Aug 08 '24 17:08 noahtalerman

How/where Fleet extracts name and version from packages. This way, if the IT admin hits this error they can understand why Fleet can't get the version and know how to fix their package.

Hey @sharon-fdm I think we want to include this as part of the guide for this feature.

Please feel free to schedule some time w/ me if you want a 5 minute run down!

Aug 09 '24 23:08 noahtalerman

@noahtalerman

Screenshot 2024-08-14 at 10 15 09 AM

Small thing. On /software/titles/:id we should show the install type somewhere. Otherwise you don't know which software is automatic/manual after you create it. (Unless I'm missing something.)

Aug 14 '24 13:08 lucasmrod

Hey @lucasmrod, good call.

The plan is to show the install type in the Options modal. User gets here by clicking Actions > Show options on the Software title page:

Screenshot 2024-08-14 at 11 27 13 AM

Screenshot is from Figma here.

Aug 14 '24 18:08 noahtalerman

Ah I missed it. Thanks! LGTM!

Aug 14 '24 18:08 lucasmrod

How/where Fleet extracts name and version from packages. This way, if the IT admin hits this error they can understand why Fleet can't get the version and know how to fix their package.

Hey @sharon-fdm I think we want to include this as part of the guide for this feature.

Please feel free to schedule some time w/ me if you want a 5 minute run down!

@noahtalerman @sharon-fdm We have different extraction methods for each type of package. I think we should cover all of them (.pkg, .msi, .exe, .deb).

Aug 16 '24 10:08 marko-lisica

Hey @marko-lisica and @sharon-fdm heads up that I moved this issue from the release board to the drafting board because it's in expedited drafting.

I assigned Marko.

Aug 17 '24 18:08 noahtalerman

For visibility, I'm pulling this out of Slack:

We decided to make a change mid-sprint to simplify this app management feature.

The old plan was for the IT admin to choose "Automatic" install for an app. Fleet would detect, under the hood, whether the software is already installed.

Old wireframes are here: Screenshot 2024-08-17 at 12 42 28 PM

The new plan is to change the trigger for app install: policy failure.

New wireframes (still in progress):

Screenshot 2024-08-17 at 12 45 03 PM

This means that customers/users will now be "in the loop." They have full control over when an app is installed w/ policies (no black box). And this sets us up to keep this control when we add UX improvements ("Automatic" abstraction on top of policies) later.
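A minimal sketch of the new trigger, assuming each policy can optionally be linked to one software title: when a host fails a linked policy, that title is queued for install. The data shapes and the `installs_to_queue` helper are hypothetical and only meant to show the idea.

```python
def installs_to_queue(policy_results, policy_software):
    """Return the software titles to install on one host.

    policy_results:  {policy name: True if the host passed, False if it failed}
    policy_software: {policy name: software title to install on failure}
    Policies without linked software never trigger an install.
    """
    return [
        policy_software[name]
        for name, passed in policy_results.items()
        if not passed and name in policy_software
    ]


results = {"Chrome installed": False, "Disk encrypted": False, "Slack installed": True}
linked = {"Chrome installed": "Google Chrome", "Slack installed": "Slack"}
print(installs_to_queue(results, linked))
```

Here only "Google Chrome" is queued: the Slack policy passes, and the disk encryption policy has no software attached.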

cc @dherder @alexmitchelliii

Aug 17 '24 19:08 noahtalerman

@noahtalerman On the design review today @spokanemac brought up concerns about the viability of this feature for managing a large library of software (more than 100). It's important that we distinguish that this approach is viable for a handful of "hero" software items, but not viable for updating all software across all hosts as part of vulnerability management. If a Fleet admin wanted to update all software across all hosts, they would still need to use something like Munki.

I suggested to @marko-lisica that we show these designs to IT customers to get their feedback on the viability for their use cases, as well as looping in @nonpunctual to get his feedback as he was the original submitter of this story, though it's changed quite a bit since it was first created.

Aug 20 '24 16:08 lukeheath

@marko-lisica I'll continue to dig in this week, but as I've compared this feature to existing features in other MDMs I think it's a good first step. Tying to policies for initial installation makes sense, and is similar to other MDMs' use of "manual" software updates.

@spokanemac It's true that this feature doesn't provide automatic updates, which would make this arduous to use at scale, but the Fleet app library introduces the concept of automatic updates and that will be following on later this quarter.

Aug 20 '24 16:08 lukeheath

@lukeheath I agree with @spokanemac's assessment that installing software based on policy failure becomes increasingly difficult to manage as a customer's software library grows.

I think customers who are asking for software to be installed automatically based on team membership expect it to work like custom settings/profiles do today.

Add host to team -> Fleet automatically installs software associated with that host -> host is moved to a different team -> Fleet removes software associated with the previous team and installs software associated with the new team
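The team-move flow above could be sketched like this. It is only an illustration of the expectation being described, not an existing Fleet behavior; `software_changes` is a hypothetical helper, and keeping titles that both teams share (rather than removing and reinstalling them) is an assumption.

```python
def software_changes(old_team_software, new_team_software):
    """Compute what to remove and install when a host changes teams.

    Assumption: titles assigned to both teams are left in place.
    """
    old, new = set(old_team_software), set(new_team_software)
    return {"remove": sorted(old - new), "install": sorted(new - old)}


print(software_changes(["Slack", "Xcode"], ["Slack", "Figma"]))
```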

Can the desire to have more control over when software gets installed be solved with the existing pre-install condition feature? I'd like to learn more about what use cases using a policy as the trigger for installation solves.

customer-easterwood says:

• Would there be a possibility of auto installing all apps assigned to a specific team?
• Similar to how we have it configured in JumpCloud, we basically assign a set of apps to a "device group" and any devices associated with that group would have the apps installed automatically, would be nice to have.

I think the CS team should have more customer conversations to understand how they'd want this feature to work.

cc: @nonpunctual

Aug 21 '24 18:08 ddribeiro

So @lukeheath & I discussed this & this feature is supposed to be followed closely by https://github.com/fleetdm/fleet/issues/18865 - App Library

@spokanemac @ddribeiro I think this model gets close to Jamf + Munki / Jamf App Installers, i.e.,

there is a way to manually create a "policy" to deploy 1 application
there is automation native to Fleet based on Fleet Policies to keep deployed applications up-to-date

Aug 21 '24 19:08 nonpunctual

Yep! Fleet app library with automatic install and update is coming into the next sprint and is a Q3 deliverable.

Aug 23 '24 18:08 lukeheath

On design review last Friday we made the following decisions:

We are creating a new “no team” yaml file for GitOps.

This is where we will configure “no team” policies and software.
You can also include controls here optionally, or in default.yml, but not both.

We are going to remove the software property from default.yml, which is a breaking change.

We need to contact CS and make sure any customers that have adopted this feature are aware and can adjust (or we do it for them).
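The two GitOps rules above (no software in default.yml, and controls in only one of the two files) could be checked along these lines. This is a sketch over already-parsed YAML represented as plain dictionaries; `validate_gitops` and the error wording are hypothetical, and the issue does not name the new "no team" file.

```python
def validate_gitops(default_cfg: dict, no_team_cfg: dict) -> list:
    """Return a list of problems found in the two parsed config files.

    default_cfg: parsed contents of default.yml
    no_team_cfg: parsed contents of the new "no team" file
    """
    errors = []
    if "software" in default_cfg:
        # Breaking change: software now belongs in the "no team" file.
        errors.append("'software' is no longer allowed in default.yml")
    if "controls" in default_cfg and "controls" in no_team_cfg:
        errors.append("'controls' can be set in default.yml or the no team file, not both")
    return errors


print(validate_gitops({"software": {}, "controls": {}}, {"controls": {}, "policies": []}))
print(validate_gitops({"controls": {}}, {"policies": [], "software": {}}))
```

The first call reports both problems; the second configuration is valid and returns an empty list.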

Aug 26 '24 19:08 lukeheath

cc @noahtalerman ^

Aug 27 '24 13:08 marko-lisica

I confirmed the "Application deployment" item in pricing-features-table.yml is listed as isExperimental. Do we need to update any reference docs to make sure it's clear this is an experimental feature?

Aug 27 '24 16:08 lukeheath
