fleet Deploy security agents to macOS, Windows, and Linux hosts

Goal

User story
As an IT admin using the Software page, the Fleet API, or GitOps,
I want to add my security agents
so that I can deploy them to my macOS, Windows, and Linux hosts.

✅ Cross-platform app deployment

See a video walkthrough of the user journey in this Loom video.

Context

Product designer: @marko-lisica

Changes

Product

[ ] UI changes: Figma link
[ ] CLI usage changes: Changes specified in Figma
[ ] REST API changes: #17865
[ ] Permissions changes:
- Maintainers and admins (team and global) can view, add, download, and delete software installers. GitOps users can manage software via Fleet YAML. (Team roles can perform these actions on software uploaded to their team(s) and can add software to teams they are assigned to.)
- Maintainers and admins (team and global) can install software on a specific host. (Team roles can install software on hosts assigned to their team(s).)
- Observers (team and global) can view software installers. (Team observers can view only software installers uploaded to their team(s).)
[ ] Outdated documentation changes: Write an article/guide about installing software with Fleet.
[ ] Other requirements:
- Available in Fleet Premium only
- Software can only be installed on a host that has a fleetd agent with scripts enabled
- Doesn't require turning on MDM features in Fleet
- There won't be an option to make an app/package available for self-service. This will be an iterative improvement coming later.
- Fleet won't uninstall apps/packages if they're deleted in Fleet.
- Software installation should be part of the same queue as MDM commands and scripts

Engineering

DB

[ ] Database schema migrations: TODO
[ ] New concept: a "software installer", maps to software_titles (and presumably, via the version, to software)
[ ] Storage: software installer can be up to 500MB, where/how do we store this?
- Filesystem is default storage. No new config to opt in to filesystem storage
- S3 storage optional. Same config as filecarves (docs here)

Backend (general)

[ ] Extract software name and version from installer package (.app, .pkg, .msi, etc.)
[ ] Probably the biggest challenge: installing software can require a profile to be deployed first as a precondition, and a script to run next as a postcondition, with a retry strategy.
- i.e. not just a strong ordering of actions on a host, but conditional steps (previous step must succeed before moving to next step, otherwise some form of retries), basically an orchestration.
[ ] This validation requires extracting the name/version from the package, which means it would happen after the (potentially long and heavy) upload. Any ideas to avoid this? (Would frontend logic for extraction even be possible?)
[ ] We need to track the state not just of individual steps (the "before" profile, the installer itself, the "after" script) but of the sequence as a whole, in order to show the status of the software on the host (and filter on failed orchestrations in "List hosts", etc.).

CLI

[ ] Via fleetctl, we get the URL of the software installer, which we must send to Fleet. Sending the URL to Fleet and having the Fleet server download it results in fewer bytes transferred (we save the big upload), but could it hit firewall restrictions? Otherwise, fleetctl downloads the installer and uploads it to Fleet.

API

[ ] New device-authenticated API endpoint to list software installed on the device, along with status of the whole orchestration.
[ ] New parameter to filter "List Hosts" by status of the "managed software orchestration"
[ ] New user-authenticated API endpoint to list software installed on a host, along with status of the whole orchestration.
[ ] New user-authenticated API endpoints to upload, download, delete, list software titles (paginated), get a specific software title, and get a specific software version

ℹ️ Please read this issue carefully and understand it. Pay special attention to UI wireframes, especially "dev notes".

Context

Requestor(s): _________________________

QA

Risk assessment

Requires load testing: TODO
Risk level: Low / High TODO
Risk description: TODO

Manual testing steps

Step 1
Step 2
Step 3

Testing notes

Confirmation

[ ] Engineer (@____): Added comment to user story confirming successful completion of QA.
[ ] QA (@____): Added comment to user story confirming successful completion of QA.

Nov 03 '23 13:11 noahtalerman

Hey @marko-lisica I think we need a confirmation modal to describe to the user what happens when they delete a software item:

What happens? The package isn't removed/uninstalled from any hosts. It just won't be installed on hosts that enroll to Fleet from now on.

Nov 25 '23 19:11 noahtalerman

At a quick glance, this looks like a great start!

Nov 28 '23 00:11 mikermcneil

Hey @marko-lisica! Leaving feedback here:

At 1:15 in your 1st Loom recording, you talk about the API sending small chunks to the Fleet server.
- What does this mean for the user? Packages will actually take 5 seconds to upload? 10 seconds?
- Are you proposing we rely on the generic "Couldn't upload. Please try again" if there's a timeout for some reason? (b/c slow connection)
Nice scripts! 🔥 Looks like there's an improvement we can make to the output rendering (looks broken):
- Can you please file a separate feature request for this improvement and add it to the feature fest board?
Around 4:40 in your 1st Loom, you mentioned that we can't get meta data from the .dmg.
- It looks like we might be able to? There's a # Install the .app step in your install-dmg.sh script.
- If we have the .app in this step maybe we can get the metadata we need?
At the beginning of your 2nd Loom, you mention that it's easy for the IT admin to get the .app from a .dmg manually.
- It looks like it might be easy for us to do this for the IT admin? If I'm understanding correctly, your install-dmg.sh script does this for the IT admin already (I could be missing something).
- If that's right, then maybe we just support .dmg and not .app so it's easier for IT admin?
UI and API feedback: https://www.loom.com/share/fb98461143a841ee8937531d822ecee1?sid=afe431c8-d143-4cc8-ba9f-380d80ebd6b6

Dec 07 '23 23:12 noahtalerman

At 1:15 in your 1st Loom recording, you talk about the API sending small chunks to the Fleet server. What does this mean for the user? Packages will actually take 5 seconds to upload? 10 seconds?

Here's how I understood it. If you have a large 500 MB file and a slow internet connection, it may take 20 minutes to upload that file. Even if the timeout is set to 3 minutes, it won't interrupt the upload. Multipart form-data sends the file in small chunks, so the upload is only interrupted if a single chunk takes more than 3 minutes. I think we spent a lot of time thinking about this, but it's unlikely to happen. @roperzh, could you confirm?

Looks like there's an improvement we can make to the output rendering (looks broken). Can you please file a separate feature request for this improvement and add it to the feature fest board?

Done ✅ - #15515

Around 4:40 in your 1st Loom, you mentioned that we can't get meta data from the .dmg. It looks like we might be able to? There's a # Install the .app step in your install-dmg.sh script. If we have the .app in this step maybe we can get the metadata we need?

The user will upload a .dmg, so we can't get metadata after upload and do matching by name. If there's a way to extract the .dmg and get the .app from it on the server, maybe that's how it could work.

At the beginning of your 2nd Loom, you mention that it's easy for the IT admin to get the .app from a .dmg manually. It looks like it might be easy for us to do this for the IT admin? If I'm understanding correctly, your install-dmg.sh script does this for the IT admin already (I could be missing something). If that's right, then maybe we just support .dmg and not .app so it's easier for IT admin?

As I mentioned in the previous answer, all of this happens on macOS, and we need metadata after upload to do matching. The IT admin could take the .app from the .dmg and upload it to Fleet, which is very easy. The problem is with the GitOps workflow: they won't be able to use URLs from vendors if those point to .dmg files. It would be better if we could support .dmg (to have an easier GitOps workflow), but it depends on whether we can do this.

@roperzh Wdyt?

Dec 08 '23 15:12 marko-lisica

If there's way to extract .dmg and get .app from it on the server, maybe that's how it could work.

@marko-lisica let's remember to bring this up w/ engineering folks on our next call.

Dec 12 '23 00:12 noahtalerman

Hey @marko-lisica left some UI feedback for you in Loom here: https://www.loom.com/share/f52089d259d9403a9c7055f3cc55c1a4?sid=1536e398-e616-44ff-934c-b4d77591ce7d

Dropped API feedback in the PR: https://github.com/fleetdm/fleet/pull/15242/files

Dec 12 '23 00:12 noahtalerman

Feedback from Mike:

Skip the "X" and the "Cancel" button because we'll need to add some API for cancel.
No escape key or click outside affordance.
Use native browser onbeforeunload() during upload. Remove this always: after success and failure
2 minutes instead of 4 minutes cut off. Important that it doesn't create a ghost software entry. Nothing gets stored.
Make max upload size smaller. Maybe 200 MB?
- Plan is to later add some config option to alternatively store the packages in a location other than the database (ex. S3). Plan to go to market w/ storage in DB. Let us know if we're wrong.
Does doing the double check mark later add more work?
If we're adding the ability to install software at any time, what do IT admins expect to happen? What's the end user experience if they're using the app?

Dec 13 '23 15:12 noahtalerman

Hey @marko-lisica let's kick this one to next sprint so we can focus on "upcoming activities" and the smaller stories this sprint.

We'll get to this one next design sprint.

Dec 22 '23 20:12 noahtalerman

From design review doc:

DISCUSS: https://github.com/fleetdm/fleet/pull/15242#discussion_r1424133848 What about the download endpoint? It would be GET /software?alt=media (we decided to hide /software; would it conflict if we decide to get metadata of managed software?)
DISCUSS Marko: Add teams filter for software version details view.
- API adjustments
- The only difference based on the selected team would be hosts_count; vulnerabilities are related only to the version
- Parameter description?
DISCUSS Marko: Add new software title view to My device page.
- Needs a new API endpoint for My device page.
- Marko: Public or contributor API?
Meeting w/ engineering:
- Bri: Vulnerable software filter on host details software table
- George: Filter for managed software on host details
- Moving icons to the left (status icons for managed software)

Jan 09 '24 16:01 noahtalerman

Heads up @marko-lisica and @mikermcneil this request was discussed during feature fest last week and didn't make it into the current design sprint.

Jan 10 '24 14:01 noahtalerman

https://fleetdm.slack.com/archives/C019WG4GH0A/p1707437681949749

Feb 12 '24 20:02 nonpunctual

Hey @marko-lisica in a Loom video here, I chat about new learnings and updates since we were last working on this story.

Looking forward to chatting more tomorrow.

Feb 29 '24 00:02 noahtalerman

Ubuntu = deb & snap packages https://ubuntu.com/about/packages

Feb 29 '24 13:02 nonpunctual

@noahtalerman @marko-lisica Totally correct assessment that these "agent" type installs are harder.

But, some good news about UPDATING security agent packages is that MANY of them update themselves from the tenant once installed & this is usually preferred by client platform & security teams.

The reason is that client platform teams only have to be involved with the initial package roll-out. If auto-updates are then enabled from the tenant, the security team reclaims control of the version (which is usually how that ball bounces...)

Not universally true but definitely the trend I saw with:

Digital Guardian, Tanium, FireEye, Symantec, Systrack, Delinea Privilege Manager, Zscaler & others (blanking on names)

Feb 29 '24 13:02 nonpunctual

client platform teams only have to be involved with the initial package roll-out.

@nonpunctual got it!

Sounds like our initial pass just needs to solve for initial deployment: script/profile gets delivered followed by the package.

Feb 29 '24 14:02 noahtalerman

1 last shower thought comment on this: the one place usually where CPE teams need to worry about updates is in their provisioning workflows, ie, there can be drift between the version of the package that gets installed when a computer is provisioned & the version that the tenant is deploying (usually the most up-to-date.) If those drift too far apart, the package needs to be updated in the provisioning. Hope that makes sense.

Feb 29 '24 15:02 nonpunctual

Hey @marko-lisica we're planning on addressing 2 more user stories within the software management "realm": updating/patching and self-service.

I think the solution for both of these use cases will include an interface (UI/API/CLI) to trigger software installation on a specific host at the requested time.

So, why not build that interface now? And, instead of automatically installing the software on every host on that team, leave it up to the IT admin to trigger the install. They can use failing policies webhook + Tines to automate this.

I think it will also let us move faster addressing this story.

Probably the biggest challenge: installing software can require a profile to be deployed first as a precondition, and a script to run next as a postcondition, with a retry strategy

I think we might be able to sidestep this challenge and take this on in a later iteration. This will help us move faster now and give us dedicated time to solve the inevitable edge cases later.

Instead, we can block (return an error) if the IT admin hits the API to install a security agent on a host that doesn't already have the profile installed. This is a problem with an easy to understand solution: make sure the profile is installed before you try again.

We need to track the state not just of individual steps (the "before" profile, the installer itself, the "after" script) but of the sequence as a whole, in order to show the status of the software on the host (and filter on failed orchestrations in "List hosts", etc.).

We might not need to do more work to track the profile. We already have its status (pending, failed, verifying, verified).

If installing security agents is indeed idempotent (you can install over a successful/failed install) then we might not need to track whether the install or the script individually succeeded. Instead, we can track whether the installation "as a whole" (install + script) was successful.

An unsuccessful install is another problem with an easy to understand solution: try again.

And, if we learn that security agents aren't idempotent, I assume there are scripts to remove/uninstall failed or successful installs. So, there's an extra step to solving the unsuccessful install, run a script.
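
The simplified tracking described above can be sketched as a single function. This is a minimal illustration, assuming that only a "verified" profile counts as installed and that the installer and the post-install script each report a conventional exit code. The function name `overall_install_status` and the result strings `blocked`, `installed`, and `failed` are hypothetical, not Fleet's actual statuses.

```sh
#!/bin/sh
# Hypothetical helper: collapses the profile check, the install, and the
# post-install script into one status for the installation "as a whole".
# Usage: overall_install_status <profile_status> <install_exit_code> <script_exit_code>
overall_install_status() {
  profile_status=$1
  install_exit=$2
  script_exit=$3

  # Assumption: anything other than "verified" (pending, failed, verifying)
  # means the prerequisite profile is not confirmed on the host yet.
  if [ "$profile_status" != "verified" ]; then
    echo "blocked"
    return 0
  fi

  # The install "as a whole" succeeds only if both steps exit with 0.
  if [ "$install_exit" -eq 0 ] && [ "$script_exit" -eq 0 ]; then
    echo "installed"
  else
    echo "failed"
  fi
}
```

With this shape, a `failed` result maps to "try again" and a `blocked` result maps to "make sure the profile is installed first", without tracking each step separately.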

Mar 27 '24 14:03 noahtalerman

Hey @pintomi1989 heads up, we didn't get this one estimated in the last design sprint.

Plan is to bring it into the next design sprint (4.49).

Bringing this to feature fest.

Mar 28 '24 18:03 noahtalerman

Hey @marko-lisica, I left some feedback on the UI/CLI changes in a Loom here: https://www.loom.com/share/867428f755b64bd9ac620fe197a3a19c?sid=7b2eb119-dfaa-4e04-8876-5934b7545748

Apr 01 '24 21:04 noahtalerman

From product office hours:

Brock:

https://learn.jamf.com/bundle/jamf-cloud-distribution-service-release-notes/page/Release_History.html
https://derflounder.wordpress.com/2024/02/24/using-the-jamf-pro-api-to-download-installer-packages-from-a-jcds2-distribution-point/
https://developer.jamf.com/jamf-pro/reference/get_v1-jcds-files

Basically that Jamf now has a new & improved cloud distribution point & we should look to its features to make something similar. The Der Flounder article is about customizing the new JCDS with solutions like munki & another article I think I posted way back from a Jamf PS guy discussed BYPASSING a Jamf DP for packages.

In other words, I think the Fleet MVP for packages could just be: give the Host a secure, encrypted link to a URL where packages are available. That's it. Not rebuilding munki or any other system, just get a Fleet Distribution point up & running & as long as it has a cert & URL, move on. :)

Apr 02 '24 14:04 noahtalerman

Filesystem is default storage. No new config to opt in to filesystem storage

S3 storage optional. Same config as filecarves (docs here)

Hey @mna, this is the plan for package storage location. I updated the issue description.

More context including the "why" here in this Google doc.

What do you think?

cc @lukeheath @georgekarrv @rfairburn

Apr 11 '24 14:04 noahtalerman

@noahtalerman My primary concern is that we are very clear that the filesystem is the default storage and encourage users to configure S3 in production. Otherwise, they may fill the server's file system memory and crash Fleet.

Apr 12 '24 17:04 lukeheath

@lukeheath, I think we document S3 as required.

Filesystem is there for trials (fleetctl preview) and dev environments.

Apr 12 '24 20:04 noahtalerman

@noahtalerman

Hey @mna, this is the plan for package storage location. I updated the issue description.

More context including the "why" here in this Google doc.

What do you think?

Sounds good to me, looks like we already piggy-back on the S3 config to store fleet-osquery installers (in addition to carves, but the use-case here is more similar to the fleet-osquery installer). We'll want to implement a similar abstraction (a common interface) and have it implemented for both local filesystem and S3 config, so that we don't have to worry about which is used internally.
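
To illustrate the "common interface" idea, here is a rough shell sketch in which callers use `store_installer` and `fetch_installer` without knowing which backend is active. Everything here is an assumption for illustration: the function names and the `STORE_BACKEND`, `STORE_DIR`, and `STORE_BUCKET` variables are hypothetical and are not Fleet config keys, and the S3 branch simply shells out to the AWS CLI's `aws s3 cp`. The real implementation would live in the Fleet server, not in a script.

```sh
#!/bin/sh
# Hypothetical settings (NOT real Fleet configuration keys).
STORE_BACKEND="${STORE_BACKEND:-filesystem}"   # "filesystem" or "s3"
STORE_DIR="${STORE_DIR:-/tmp/software-installers}"
STORE_BUCKET="${STORE_BUCKET:-}"

# store_installer <local_file> <key>
# Saves an installer under <key> in whichever backend is configured.
store_installer() {
  case "$STORE_BACKEND" in
    filesystem)
      mkdir -p "$STORE_DIR" && cp "$1" "$STORE_DIR/$2"
      ;;
    s3)
      aws s3 cp "$1" "s3://$STORE_BUCKET/$2"
      ;;
    *)
      echo "unknown storage backend: $STORE_BACKEND" >&2
      return 1
      ;;
  esac
}

# fetch_installer <key> <local_destination>
# Retrieves the installer stored under <key>.
fetch_installer() {
  case "$STORE_BACKEND" in
    filesystem)
      cp "$STORE_DIR/$1" "$2"
      ;;
    s3)
      aws s3 cp "s3://$STORE_BUCKET/$1" "$2"
      ;;
    *)
      echo "unknown storage backend: $STORE_BACKEND" >&2
      return 1
      ;;
  esac
}
```

The point of the design is that switching from filesystem to S3 changes configuration only, while the calling code stays the same.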

Apr 15 '24 14:04 mna

Something that came up during a design review today: the designs don't make it clear that the install script command will be different depending on OS, and whether this should be pre-filled. @lukeheath is going to discuss with @georgekarrv and make a final decision about what to do for this UI. Screenshot 2024-05-03 at 11 48 21 AM

May 03 '24 17:05 rachaelshaw

Since pkg is only for macOS, exe and msi are only for Windows, and deb is only for Debian Linux, there isn't an issue here currently.

May 03 '24 17:05 georgekarrv

Discussed with Luke: one way to make it more obvious that the field is editable is to change the text input to one that more closely matches the query input below (line numbers, wrapping, etc.), change the word "command" in the underlying text to "script", and autofill it with:

#!/bin/sh
installer -pkg ./FalconSensor-6.44.pkg -target /

The newline and shebang will make it obvious, w/o any other words, that this is a script. You can add more or less, and the line numbers will make it clearer that you can edit it.
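
As a sketch of how the pre-filled script could vary by package type, the function below picks a default from the file extension. The name `default_install_script` is hypothetical, and the non-pkg defaults (`dpkg -i` for deb, `msiexec /i ... /quiet /norestart` for msi) are assumptions about sensible commands, not decisions from this thread. There is no generic silent-install flag for .exe installers, so the sketch returns an error for them.

```sh
#!/bin/sh
# Hypothetical helper: prints a default install script for a package file.
# Usage: default_install_script <installer_filename>
default_install_script() {
  case "$1" in
    *.pkg)
      # macOS: matches the autofill example in this comment.
      printf '#!/bin/sh\ninstaller -pkg ./%s -target /\n' "$1"
      ;;
    *.deb)
      # Debian Linux (assumed default).
      printf '#!/bin/sh\ndpkg -i ./%s\n' "$1"
      ;;
    *.msi)
      # Windows (assumed default; a command line, not a shell script).
      printf 'msiexec /i "%s" /quiet /norestart\n' "$1"
      ;;
    *.exe)
      # Silent-install flags differ per vendor, so no default is offered.
      echo "no generic default for .exe; provide an install script" >&2
      return 1
      ;;
    *)
      echo "unsupported package type: $1" >&2
      return 1
      ;;
  esac
}
```

For example, `default_install_script FalconSensor-6.44.pkg` prints the two-line script shown above.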

May 03 '24 19:05 georgekarrv

We decided to cut the ability to specify install_script, pre_install_query, and post_install_script inline: Screenshot 2024-05-06 at 10 43 45 AM

Why?

The GitOps flow replaces variables that start with $ in YAML with GitHub env variables
We’ll need to escape all variables defined by us and the IT admin
Currently not possible to escape until this is fixed https://github.com/fleetdm/fleet/issues/18467

We also decided to add a path sub-key to install_script, like we’re adding for pre_install_query and post_install_script.

This way, the interface is consistent and we’re setting ourselves up for specifying these inline later.

The CLI wireframes in Figma are updated to reflect this.
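
The `$` replacement problem behind this decision can be shown with a shell analogy. This is not how the GitOps flow is implemented: an unquoted heredoc merely stands in for an environment-substitution pass over YAML, and a quoted heredoc stands in for the escaping that isn't available yet. The inline `install_script:` line and the function names are hypothetical, and `$INSTALLER_PATH` is borrowed from the install script example later in this thread.

```sh
#!/bin/sh
# Analogy only: an unquoted heredoc expands $VARS, much like an
# environment-substitution pass over a YAML file would.
render_expanded() {
  cat <<EOF
install_script: installer -pkg "$INSTALLER_PATH" -target /
EOF
}

# A quoted heredoc delimiter leaves $VARS untouched, which is the
# behavior an escape mechanism would need to provide.
render_literal() {
  cat <<'EOF'
install_script: installer -pkg "$INSTALLER_PATH" -target /
EOF
}
```

If `INSTALLER_PATH` is empty when the substitution runs, `render_expanded` produces a script with an empty path, while `render_literal` keeps the variable for the host to resolve later.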

cc @roperzh

May 06 '24 14:05 noahtalerman

Summary of the discussion and decisions during today's MDM standup

Blockers

The following are no longer blockers. I removed them from the issue description.

[ ] TODO: Determine the three commands. Maybe 4 because different flavors of unix. Understand their interface, how they are called. Document this in the wireframes.

[ ] TODO: Luke will work with team to sort out what to do about the read only vs editable aspects, erring on the side of a 2-way door if possible (not a 1-way door that's hard to migrate out of). In other words, make it so that you can edit less-- you can always make it more customizable, but hard to go the other direction. If necessary/appropriate/soundest, cut the ability to configure even CLI opts.

Sort on Host details (My device) > Software table

@ghernandez345 and @mna we'd like to update the default sort on the Software table on the Host details and My device pages to name ascending.

Name column to be sortable. In addition, we can cut the special sort for the Install status column. Figma is updated:

Screenshot 2024-05-06 at 12 48 52 PM

Why? The Fleet UI convention is to have a default sort that is always visible on the page. This is feedback from Mike that product design forgot to relay to y'all. Sorry about the last minute heads up.

Add software modal

@ghernandez345 we decided to use the ace editor and update the help text for the Install script input field. Figma is updated:

Screenshot 2024-05-06 at 12 52 28 PM

On the Advanced options modal (for uploaded software) let's remove for the Install script input field:

Screenshot 2024-05-06 at 12 53 04 PM

Heads up @roperzh, we want to add the #!/bin/sh shebang to the default pkg script so that it's obvious that this is a shell script.

FYI @lukeheath, @ghernandez345 we decided that Gabe will be taking on these UI changes.

Disabling deploy security agents if scripts are disabled

We decided to allow deploying software. We can follow up in a later pass to add an off switch for deploying software.

@roperzh just checking, if a host has the fleetd agent w/ scripts disabled, will that prevent the ability to deploy software (and run an arbitrary script) on a host?

Providing a custom install script

We decided to go w/ the environment variable approach. When customizing the install script, the IT admin will specify the software's location as an environment variable in their script. Fleet will populate this variable for them.

installer -pkg "$INSTALLER_PATH" -target /
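
A slightly fuller sketch of a custom install script using this approach is below. It assumes Fleet sets `INSTALLER_PATH` before the script runs, as described above. The `install_agent` function name and the guard for an unset variable are illustrative additions, not part of the decision.

```sh
#!/bin/sh
# Sketch of a custom install script body. Fleet is expected to populate
# INSTALLER_PATH; the guard below is an extra (assumed) safety check.
install_agent() {
  if [ -z "${INSTALLER_PATH:-}" ]; then
    echo "INSTALLER_PATH is not set" >&2
    return 1
  fi
  # Same command as the one-line example, with the path quoted so
  # filenames containing spaces still work.
  installer -pkg "$INSTALLER_PATH" -target /
}
```

A real script would end by calling `install_agent`. The call is left out here so the function can be exercised without running macOS's `installer`.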

cc @georgekarrv

May 06 '24 17:05 noahtalerman

@noahtalerman Thanks; I agree UI changes should go to @ghernandez345.

May 06 '24 17:05 lukeheath
