Deploy security agents to macOS, Windows, and Linux hosts
Goal
User story |
---|
As an IT admin using the Software page, the Fleet API, or GitOps, |
I want to add my security agents |
so that I can deploy them to my macOS, Windows, and Linux hosts. |
✅ Cross-platform app deployment
See a video walkthrough of the user journey in this Loom video.
Context
- Product designer: @marko-lisica
Changes
Product
- [ ] UI changes: Figma link
- [ ] CLI usage changes: Changes specified in Figma
- [ ] REST API changes: #17865
- [ ] Permissions changes:
- Maintainers and admins (team and global) can view, add, download, and delete software installers. GitOps users can manage software via Fleet YAML. (Team roles can perform these actions on software uploaded to their team(s) and can add software to teams they are assigned to.)
- Maintainers and admins (team and global) can install software on a specific host. (Team roles can install software on hosts assigned to their team(s).)
- Observers (team and global) can view software installers. (Team observers can view only software installers uploaded to their team(s).)
- [ ] Outdated documentation changes: Write an article/guide about installing software with Fleet.
- [ ] Other requirements:
- Available in Fleet Premium only
- Software can only be installed on a host that has a fleetd agent with scripts enabled
- Doesn't require turning on MDM features in Fleet
- There won't be an option to make an app/package available for self-service. This will be an iterative improvement coming later.
- Fleet won't uninstall apps/packages if they're deleted in Fleet.
- Software installation should be part of the same queue as MDM commands and scripts
Engineering
DB
- [ ] Database schema migrations: TODO
- [ ] New concept: a "software installer", maps to `software_titles` (and presumably, via the version, to `software`)
- [ ] Storage: software installer can be up to 500MB, where/how do we store this?
- Filesystem is default storage. No new config to opt in to filesystem storage
- S3 storage optional. Same config as filecarves (docs here)
Backend (general)
- [ ] Extract software name and version from installer package (.app, .pkg, .msi, etc.)
- [ ] Probably the biggest challenge: installing software can require a profile to be deployed first as a precondition, and a script to run next as a postcondition, with a retry strategy.
- i.e. not just a strong ordering of actions on a host, but conditional steps (previous step must succeed before moving to next step, otherwise some form of retries), basically an orchestration.
- [ ] This validation requires extracting the name/version from the package, which means it would happen after the (potentially long and heavy) upload. Ideas to avoid it? (Would frontend logic for extraction even be possible?)
- [ ] We need to track the state not just of individual steps (the "before" profile, the installer itself, the "after" script) but of the sequence as a whole, in order to show the status of the software on the host (and filter on failed orchestrations in "List hosts", etc.).
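The orchestration described in the list above can be sketched roughly as follows. This is only an illustration in Python, not Fleet's implementation: `StepStatus`, `SequenceResult`, `run_sequence`, the step names, and the retry count are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List, Tuple


class StepStatus(Enum):
    PENDING = "pending"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


@dataclass
class SequenceResult:
    # Status of each individual step, keyed by a (hypothetical) step name.
    steps: Dict[str, StepStatus] = field(default_factory=dict)

    @property
    def overall(self) -> StepStatus:
        # The sequence as a whole fails as soon as any step has failed,
        # succeeds only when every step succeeded, and is pending otherwise.
        statuses = list(self.steps.values())
        if StepStatus.FAILED in statuses:
            return StepStatus.FAILED
        if statuses and all(s is StepStatus.SUCCEEDED for s in statuses):
            return StepStatus.SUCCEEDED
        return StepStatus.PENDING


def run_sequence(
    steps: List[Tuple[str, Callable[[], bool]]],
    max_attempts: int = 3,  # assumed retry budget per step
) -> SequenceResult:
    """Run steps in order, retrying a failing step up to max_attempts times."""
    result = SequenceResult(steps={name: StepStatus.PENDING for name, _ in steps})
    for name, action in steps:
        ok = False
        for _ in range(max_attempts):
            if action():
                ok = True
                break
        result.steps[name] = StepStatus.SUCCEEDED if ok else StepStatus.FAILED
        if not ok:
            # The previous step must succeed before moving on,
            # so every later step stays pending.
            break
    return result
```

With three steps such as `("profile", ...)`, `("installer", ...)`, and `("script", ...)`, `result.overall` gives the single status that a "List hosts" filter could use, while `result.steps` keeps the per-step detail.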
CLI
- [ ] Via `fleetctl`, we get the URL of the software installer, which we must send to Fleet. (Sending the URL to Fleet and having the Fleet server download it transfers fewer bytes, since we save the big upload, but could hit firewall restrictions? Otherwise, `fleetctl` downloads the installer and uploads it to Fleet.)
API
- [ ] New device-authenticated API endpoint to list software installed on the device, along with status of the whole orchestration.
- [ ] New parameter to filter "List Hosts" by status of the "managed software orchestration"
- [ ] New user-authenticated API endpoint to list software installed on a host, along with status of the whole orchestration.
- [ ] New user-authenticated API endpoints to upload, download, delete, list software titles (paginated), get a specific software title, and get a specific software version
ℹ️ Please read this issue carefully and understand it. Pay special attention to UI wireframes, especially "dev notes".
Context
- Requestor(s): _________________________
QA
Risk assessment
- Requires load testing: TODO
- Risk level: Low / High TODO
- Risk description: TODO
Manual testing steps
- Step 1
- Step 2
- Step 3
Testing notes
Confirmation
- [ ] Engineer (@____): Added comment to user story confirming successful completion of QA.
- [ ] QA (@____): Added comment to user story confirming successful completion of QA.
Hey @marko-lisica I think we need a confirmation modal to describe to the user what happens when they delete a software item:
What happens? The package isn't removed/uninstalled from any hosts. It's no longer installed on hosts that enroll to Fleet.
At a quick glance, this looks like a great start!
Hey @marko-lisica! Leaving feedback here:
- At 1:15 in your 1st Loom recording, you talk about the API sending small chunks to the Fleet server.
  - What does this mean for the user? Packages will actually take 5 seconds to upload? 10 seconds?
  - Are you proposing we rely on the generic "Couldn't upload. Please try again" if there's a timeout for some reason? (b/c slow connection)
- Nice scripts! 🔥 Looks like there's an improvement we can make to the output rendering (looks broken):
  - Can you please file a separate feature request for this improvement and add it to the feature fest board?
- Around 4:40 in your 1st Loom, you mentioned that we can't get meta data from the `.dmg`.
  - It looks like we might be able to? There's a `# Install the .app` step in your `install-dmg.sh` script.
  - If we have the `.app` in this step maybe we can get the metadata we need?
- At the beginning of your 2nd Loom, you mention that it's easy for the IT admin to get the `.app` from a `.dmg` manually.
  - It looks like it might be easy for us to do this for the IT admin? If I'm understanding correctly, your `install-dmg.sh` script does this for the IT admin already (I could be missing something).
  - If that's right, then maybe we just support `.dmg` and not `.app` so it's easier for IT admin?
- UI and API feedback: https://www.loom.com/share/fb98461143a841ee8937531d822ecee1?sid=afe431c8-d143-4cc8-ba9f-380d80ebd6b6
> At 1:15 in your 1st Loom recording, you talk about the API sending small chunks to the Fleet server. What does this mean for the user? Packages will actually take 5 seconds to upload? 10 seconds?

So basically, this is how I understood it. If you have a large 500 MB file and a slow internet connection, it may take 20 minutes to upload that file. Even if the timeout is set to 3 minutes, it won't interrupt the upload. Multipart form-data sends the file in small chunks, and the upload is interrupted only if a single chunk takes more than 3 minutes. I think we spent a lot of time thinking about it, but it's not likely to happen. That's how I understood it. @roperzh, could you confirm?
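To put rough numbers on the understanding above (still to be confirmed), here is a small Python sketch. It only does the arithmetic: the 1 MB chunk size is a made-up value for illustration, and both function names are hypothetical.

```python
def chunk_upload_times(file_mb: float, total_minutes: float, chunk_mb: float) -> tuple:
    """Return (throughput in MB/s, seconds needed to send one chunk)."""
    throughput = file_mb / (total_minutes * 60)
    return throughput, chunk_mb / throughput


def chunk_times_out(file_mb: float, total_minutes: float, chunk_mb: float,
                    timeout_seconds: float) -> bool:
    """True if a single chunk would take longer than the per-chunk timeout."""
    _, seconds_per_chunk = chunk_upload_times(file_mb, total_minutes, chunk_mb)
    return seconds_per_chunk > timeout_seconds


# 500 MB uploaded over 20 minutes, assuming hypothetical 1 MB chunks
# and the 3 minute (180 s) timeout mentioned above.
throughput, per_chunk = chunk_upload_times(500, 20, 1)
will_time_out = chunk_times_out(500, 20, 1, 180)
```

Under these assumptions each chunk takes about 2.4 seconds, far below the 3 minute limit, so `will_time_out` is `False` even though the whole upload takes 20 minutes.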
> Looks like there's an improvement we can make to the output rendering (looks broken). Can you please file a separate feature request for this improvement and add it to the feature fest board?

Done ✅ - #15515
> Around 4:40 in your 1st Loom, you mentioned that we can't get meta data from the .dmg. It looks like we might be able to? There's a # Install the .app step in your install-dmg.sh script. If we have the .app in this step maybe we can get the metadata we need?

The user will upload the .dmg, so we can't get metadata after upload and do matching by name. If there's a way to extract the .dmg and get the .app from it on the server, maybe that's how it could work.
> At the beginning of your 2nd Loom, you mention that it's easy for the IT admin to get the .app from a .dmg manually. It looks like it might be easy for us to do this for the IT admin? If I'm understanding correctly, your install-dmg.sh script does this for the IT admin already (I could be missing something). If that's right, then maybe we just support .dmg and not .app so it's easier for IT admin?

As I mentioned in the previous answer, all of this happens on macOS, and we need the metadata after upload to do matching. The IT admin could take the .app from the .dmg and upload it to Fleet, which is very easy. The problem is the GitOps workflow: they won't be able to use URLs from vendors if those URLs point to .dmg files. It would be better if we could support .dmg (for an easier GitOps workflow), but it depends on whether we can do this.
@roperzh Wdyt?
> If there's a way to extract the .dmg and get the .app from it on the server, maybe that's how it could work.

@marko-lisica let's remember to bring this up w/ engineering folks on our next call.
Hey @marko-lisica left some UI feedback for you in Loom here: https://www.loom.com/share/f52089d259d9403a9c7055f3cc55c1a4?sid=1536e398-e616-44ff-934c-b4d77591ce7d
Dropped API feedback in the PR: https://github.com/fleetdm/fleet/pull/15242/files
Feedback from Mike:
- Skip the "X" and the "Cancel" button because we'll need to add some API for cancel.
- No escape key or click outside affordance.
- Use native browser `onbeforeunload()` during upload. Remove this always: after success and failure
- 2 minutes instead of 4 minutes cut off. Important that it doesn't create a ghost software entry. Nothing gets stored.
- Make max upload size smaller. Maybe 200 MB?
- Plan is to later add some config option to alternatively store the packages in a location other than the database (ex. S3). Plan to go to market w/ storage in DB. Let us know if we're wrong.
- Does doing the double check mark later add more work?
- If we're adding the ability to install software at any time, what do IT admins expect to happen? What's the end user experience if they're using the app?
Hey @marko-lisica let's kick this one to next sprint so we can focus on "upcoming activities" and the smaller stories this sprint.
We'll get to this one next design sprint.
From design review doc:
- DISCUSS: https://github.com/fleetdm/fleet/pull/15242#discussion_r1424133848
  - What about the download endpoint, it would be GET /software?alt=media (we decided to hide /software, would it conflict if we decide to get metadata of managed software?)
- DISCUSS Marko: Add teams filter for software version details view.
  - API adjustments
  - The only difference based on the selected team would be hosts_count, vulnerabilities are related only to the version
  - Parameter description?
- DISCUSS Marko: Add new software title view to My device page.
  - Needs a new API endpoint for My device page.
  - Marko: Public or contributor API?
- Meeting w/ engineering:
  - Bri: Vulnerable software filter on host details software table
  - George: Filter for managed software on host details
  - Moving icons to the left (status icons for managed software)
Heads up @marko-lisica and @mikermcneil this request was discussed during feature fest last week and didn't make it into the current design sprint.
https://fleetdm.slack.com/archives/C019WG4GH0A/p1707437681949749
Hey @marko-lisica in a Loom video here, I chat about new learnings and updates since we were last working on this story.
Looking forward to chatting more tomorrow.
Ubuntu = deb & snap packages https://ubuntu.com/about/packages
@noahtalerman @marko-lisica Totally correct assessment that these "agent" type installs are harder.
But, some good news about UPDATING security agent packages is that MANY of them update themselves from the tenant once installed & this is usually preferred by client platform & security teams.
The reason is that client platform teams only have to be involved with the initial package roll-out. If auto-updates are then enabled from the tenant, the security team reclaims control of the version (which is usually how that ball bounces...)
Not universally true but definitely the trend I saw with:
- Digital Guardian
- Tanium
- FireEye
- Symantec
- Systrack
- Delinea Privilege Manager
- Zscaler
- & others (blanking on names)
> client platform teams only have to be involved with the initial package roll-out.

@nonpunctual got it!
Sounds like our initial pass just needs to solve for initial deployment: script/profile gets delivered followed by the package.
1 last shower thought comment on this: the one place usually where CPE teams need to worry about updates is in their provisioning workflows, ie, there can be drift between the version of the package that gets installed when a computer is provisioned & the version that the tenant is deploying (usually the most up-to-date.) If those drift too far apart, the package needs to be updated in the provisioning. Hope that makes sense.
Hey @marko-lisica we're planning on addressing 2 more user stories within the software management "realm": updating/patching and self-service.
I think the solution for both of these use cases will include an interface (UI/API/CLI) to trigger software installation on a specific host at the requested time.
So, why not build that interface now? And, instead of automatically installing the software on every host on that team, leave it up to the IT admin to trigger the install. They can use failing policies webhook + Tines to automate this.
I think it will also let us move faster addressing this story.
> Probably the biggest challenge: installing software can require a profile to be deployed first as a precondition, and a script to run next as a postcondition, with a retry strategy

I think we might be able to sidestep this challenge and take this on in a later iteration. This will help us move faster now and give us dedicated time to solve the inevitable edge cases later.
Instead, we can block (return an error) if the IT admin hits the API to install a security agent on a host that doesn't already have the profile installed. This is a problem with an easy to understand solution: make sure the profile is installed before you try again.
> We need to track the state not just of individual steps (the "before" profile, the installer itself, the "after" script) but of the sequence as a whole, in order to show the status of the software on the host (and filter on failed orchestrations in "List hosts", etc.).

We might not need to do more work to track the profile. We already have its status (pending, failed, verifying, verified).
If installing security agents is indeed idempotent (you can install over a successful/failed install) then we might not need to track whether the install or the script was unsuccessful. Instead, we can track whether the installation "as a whole" (install + script) was successful.
An unsuccessful install is another problem with an easy to understand solution: try again.
And, if we learn that security agents aren't idempotent, I assume there are scripts to remove/uninstall failed or successful installs. So, there's an extra step to solving the unsuccessful install, run a script.
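The simpler approach proposed above could look roughly like the sketch below. It is an assumption-laden illustration, not Fleet's API: `ProfileNotInstalledError`, `request_install`, and `install_succeeded_as_a_whole` are hypothetical names, and treating only `verified` as "installed" is a guess at where the line would be drawn.

```python
class ProfileNotInstalledError(Exception):
    """Raised when the prerequisite profile isn't on the host yet (hypothetical)."""


def request_install(profile_status: str) -> str:
    """Accept an install request only if the prerequisite profile is in place.

    profile_status is one of the existing statuses:
    pending, failed, verifying, verified.
    """
    # Assumption: only "verified" counts as installed.
    if profile_status != "verified":
        # Block with an error instead of orchestrating: the admin makes sure
        # the profile is installed and then tries again.
        raise ProfileNotInstalledError(
            f"profile is {profile_status!r}; install the profile, then try again"
        )
    return "pending"


def install_succeeded_as_a_whole(install_exit_code: int, script_exit_code: int) -> bool:
    # One result for install + script together, rather than one per step.
    return install_exit_code == 0 and script_exit_code == 0
```

If installs turn out to be idempotent, a `False` result here just means "try again"; if not, the extra step is running an uninstall script first.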
Hey @pintomi1989 heads up, we didn't get this one estimated in the last design sprint.
Plan is to bring it into the next design sprint (4.49).
Bringing this to feature fest.
Hey @marko-lisica, I left some feedback on the UI/CLI changes in a Loom here: https://www.loom.com/share/867428f755b64bd9ac620fe197a3a19c?sid=7b2eb119-dfaa-4e04-8876-5934b7545748
From product office hours:
Brock:
- https://learn.jamf.com/bundle/jamf-cloud-distribution-service-release-notes/page/Release_History.html
- https://derflounder.wordpress.com/2024/02/24/using-the-jamf-pro-api-to-download-installer-packages-from-a-jcds2-distribution-point/
- https://developer.jamf.com/jamf-pro/reference/get_v1-jcds-files
Basically that Jamf now has a new & improved cloud distribution point & we should look to its features to make something similar. The Der Flounder article is about customizing the new JCDS with solutions like munki & another article I think I posted way back from a Jamf PS guy discussed BYPASSING a Jamf DP for packages.
In other words, I think the Fleet MVP for packages could just be: give the Host a secure, encrypted link to a URL where packages are available. That's it. Not rebuilding munki or any other system, just get a Fleet Distribution point up & running & as long as it has a cert & URL, move on. :)
> - Filesystem is default storage. No new config to opt in to filesystem storage
> - S3 storage optional. Same config as filecarves (docs here)

Hey @mna, this is the plan for package storage location. I updated the issue description.
More context including the "why" here in this Google doc.
What do you think?
cc @lukeheath @georgekarrv @rfairburn
@noahtalerman My primary concern is that we are very clear that the filesystem is the default storage and encourage users to configure S3 in production. Otherwise, they may fill the server's file system memory and crash Fleet.
@lukeheath, I think we document S3 as required.
Filesystem is there for trials (fleetctl preview) and dev environments.
@noahtalerman
> Hey @mna, this is the plan for package storage location. I updated the issue description.
>
> More context including the "why" here in this Google doc.
>
> What do you think?

Sounds good to me, looks like we already piggy-back on the S3 config to store fleet-osquery installers (in addition to carves, but the use-case here is more similar to the fleet-osquery installer). We'll want to implement a similar abstraction (a common interface) and have it implemented for both local filesystem and S3 config, so that we don't have to worry about which is used internally.
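As a rough sketch of what "a common interface" could mean here, written in Python for illustration only (Fleet's actual code and names will differ): `InstallerStore`, `FilesystemInstallerStore`, `InMemoryInstallerStore`, and `save_installer` are all hypothetical. The in-memory class is just a stand-in for a second backend; an S3-backed class would expose the same three methods.

```python
import os
from typing import Dict, Protocol


class InstallerStore(Protocol):
    """Common interface so callers don't care which backend is configured."""

    def put(self, installer_id: str, data: bytes) -> None: ...
    def get(self, installer_id: str) -> bytes: ...
    def exists(self, installer_id: str) -> bool: ...


class FilesystemInstallerStore:
    """Default backend: one file per installer under a root directory."""

    def __init__(self, root: str) -> None:
        self.root = root
        os.makedirs(root, exist_ok=True)

    def _path(self, installer_id: str) -> str:
        # Assumption: installer_id is a safe file name (e.g. a content hash).
        return os.path.join(self.root, installer_id)

    def put(self, installer_id: str, data: bytes) -> None:
        with open(self._path(installer_id), "wb") as f:
            f.write(data)

    def get(self, installer_id: str) -> bytes:
        with open(self._path(installer_id), "rb") as f:
            return f.read()

    def exists(self, installer_id: str) -> bool:
        return os.path.isfile(self._path(installer_id))


class InMemoryInstallerStore:
    """Stand-in for a second backend with the same methods."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, installer_id: str, data: bytes) -> None:
        self._blobs[installer_id] = data

    def get(self, installer_id: str) -> bytes:
        return self._blobs[installer_id]

    def exists(self, installer_id: str) -> bool:
        return installer_id in self._blobs


def save_installer(store: InstallerStore, installer_id: str, data: bytes) -> int:
    """Caller code depends only on the interface, not on the backend."""
    store.put(installer_id, data)
    return len(store.get(installer_id))
```

The point of the design is the last function: upload, download, and delete handlers would take an `InstallerStore` and never check which storage is configured.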
Something that came up during a design review today: the designs don't make it clear that the install script command will be different depending on OS, and whether this should be pre-filled. @lukeheath is going to discuss with @georgekarrv and make a final decision about what to do for this UI.
Since pkg is only for macOS, exe and msi are only for Windows, and deb is only for Debian Linux, there isn't an issue here currently.
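Because each package type maps to exactly one platform, the OS can be derived from the file extension alone. A minimal Python sketch of that mapping (the function name and the platform labels are hypothetical):

```python
import os

# Package type -> platform, as described in the comment above.
PLATFORM_BY_EXTENSION = {
    ".pkg": "macos",
    ".msi": "windows",
    ".exe": "windows",
    ".deb": "linux",  # Debian-based Linux
}


def platform_for_installer(filename: str) -> str:
    """Return the platform an installer targets, based only on its extension."""
    ext = os.path.splitext(filename)[1].lower()
    if ext not in PLATFORM_BY_EXTENSION:
        raise ValueError(f"unsupported installer type: {filename!r}")
    return PLATFORM_BY_EXTENSION[ext]
```

For example, `platform_for_installer("FalconSensor-6.44.pkg")` returns `"macos"`, which is enough to pick the matching default install script.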
Discussed with Luke: one way to make it more obvious that it's editable is to change the text input to one that more closely matches the query input below (line numbers w/ wrap, etc.), change the word `command` in the underlying text to `script`, and autofill it with:
#!/bin/sh
installer -pkg ./FalconSensor-6.44.pkg -target /
The newline and shebang will make it more obvious, w/o any other words, that this is a script. You can add more or less, and the line numbers will make it clearer that you can edit it.
We decided to cut the ability to specify `install_script`, `pre_install_query`, and `post_install_script` inline:
Why?
- GitOps flow replaces variables that start with `$` in YAML with GitHub env variables
- We’ll need to escape all variables defined by us and the IT admin
- Currently not possible to escape until this is fixed https://github.com/fleetdm/fleet/issues/18467
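To illustrate why inline scripts clash with this substitution, here is a simplified Python stand-in for the `$` replacement step. It is not Fleet's implementation: `expand_env` is hypothetical, and `$TMP_DIR` is a made-up shell variable that an IT admin might use in a script.

```python
import re

# Matches $VAR and ${VAR}.
_VAR = re.compile(r"\$(\w+|\{\w+\})")


def expand_env(text: str, env: dict) -> str:
    """Naively replace $VAR / ${VAR} with values from env; missing names raise."""

    def replace(match):
        name = match.group(1).strip("{}")
        if name not in env:
            raise KeyError(f"environment variable {name!r} is not set")
        return env[name]

    return _VAR.sub(replace, text)


# Intended use: a value that should come from the CI environment.
url_line = expand_env("url: $PACKAGE_HOST/agent.pkg",
                      {"PACKAGE_HOST": "https://example.com"})

# Problem: a shell variable inside an inline script looks exactly the same,
# so it is either replaced at GitOps time or rejected as unset.
inline_script = 'cp agent.pkg "$TMP_DIR"'
```

Calling `expand_env(inline_script, {})` raises `KeyError`, and with `TMP_DIR` set in the CI environment the script would be rewritten before it ever reaches the host. Without a way to escape `$`, neither outcome is what the admin meant.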
We also decided to add a `path` sub-key to `install_script`. Like we’re adding for `pre_install_query` and `post_install_script`.
This way, the interface is consistent and we’re setting ourselves up for specifying these inline later.
The CLI wireframes in Figma are updated to reflect this.
cc @roperzh
Summary of the discussion and decisions during today's MDM standup
Blockers
The following are no longer blockers. I removed them from the issue description.
- [ ] TODO: Determine the three commands. Maybe 4 because different flavors of unix. Understand their interface, how they are called. Document this in the wireframes.
- [ ] TODO: Luke will work with team to sort out what to do about the read only vs editable aspects, erring on the side of a 2-way door if possible (not a 1-way door that's hard to migrate out of). In other words, make it so that you can edit less-- you can always make it more customizable, but hard to go the other direction. If necessary/appropriate/soundest, cut the ability to configure even CLI opts.
Sort on Host details (My device) > Software table
@ghernandez345 and @mna we'd like to update the default sort on the Software table on the Host details and My device pages to name ascending.
Name column to be sortable. In addition, we can cut the special sort for the Install status column. Figma is updated:
Why? The Fleet UI convention is to have a default sort that is always visible on the page. This is feedback from Mike that product design forgot to relay to y'all. Sorry about the last minute heads up.
Add software modal
@ghernandez345 we decided to use the ace editor and update the help text for the Install script input field. Figma is updated:
On the Advanced options modal (for uploaded software) let's remove for the Install script input field:
Heads up @roperzh, we want to add the `#!/bin/sh` shebang to the default pkg script so that it's obvious that this is a shell script.
FYI @lukeheath, @ghernandez345 we decided that Gabe will be taking on these UI changes.
Disabling deploy security agents if scripts are disabled
We decided to allow deploying software. We can follow up in a later pass to add an off switch for deploying software.
@roperzh just checking, if a host has the fleetd agent w/ scripts disabled, will that prevent the ability to deploy software (and run an arbitrary script) on a host?
Providing a custom install script
We decided to go w/ the environment variable approach. When customizing the install script, the IT admin will specify the software's location as an environment variable in their script. Fleet will populate this variable for them.
installer -pkg "$INSTALLER_PATH" -target /
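A minimal sketch of how the variable could be populated before the script runs, in Python for illustration. The comment above only says that Fleet fills in `INSTALLER_PATH`; where and how that happens is an assumption here, and `build_install_env` and `run_install_script` are hypothetical names.

```python
import os
import subprocess
from typing import Dict, Optional


def build_install_env(installer_path: str,
                      base_env: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Copy the environment and add INSTALLER_PATH for the install script."""
    env = dict(os.environ if base_env is None else base_env)
    env["INSTALLER_PATH"] = installer_path
    return env


def run_install_script(script: str, installer_path: str) -> subprocess.CompletedProcess:
    """Run the admin's script with /bin/sh; the script reads "$INSTALLER_PATH"."""
    return subprocess.run(
        ["/bin/sh", "-c", script],
        env=build_install_env(installer_path),
        capture_output=True,
        text=True,
    )
```

The admin's script stays the same for every host and package version; only the value passed as `installer_path` changes, which is why the script can quote `"$INSTALLER_PATH"` instead of hard-coding a file name.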
cc @georgekarrv
@noahtalerman Thanks; I agree UI changes should go to @ghernandez345.