Uninstalling Fleet osquery (orbit) on Windows does not remove the installed files
Fleet version: Orbit 0.0.5
Operating system: VMWare Fusion Pro with Windows 10 Pro.
🧑💻 Expected behavior
Uninstalling Fleet osquery removes the C:\Program Files\Orbit directory.
💥 Actual behavior
After uninstalling Fleet osquery, the C:\Program Files\Orbit directory is still there with all its contents untouched.
Nice catch!
@lucasmrod and @zwass, I added this issue to the 🚀 Release board on ZenHub to prepare the issue for assignment during tomorrow's standup.
Apparently, it might be a WiX thing. Here's some useful information: https://stackoverflow.com/questions/7532863/wix-not-removing-files-on-uninstall
We can come back to this if there is a more urgent need.
Oh, there is a more urgent need. :) Lately, we have seen issues in our environment when VMs were cloned after Orbit had already been installed and started once. The agents on the cloned VMs have the same ID, which is really bad from a Fleet perspective: in the web UI, hosts "disappear", i.e. they are replaced by another host with the same ID.
We run with the default configuration: the hardware id is used as the id. So, first we thought that the hardware id did not change during cloning. However, when we reinstalled the agent, the issue was just gone.
Caveat: reinstalling alone is not sufficient because "C:\Program Files\Orbit" remains, thus the id survives the reinstallation.
It would be great to have this fixed in the next fleet release. Thank you.
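The cloning and reinstall symptoms described above can be sketched in a few lines. This is only an illustration, not Orbit's actual code: it assumes the agent persists its ID in an `identifier` file under the install directory and reuses it whenever the file exists. The function names and the UUID format are hypothetical.

```python
import tempfile
import uuid
from pathlib import Path
from typing import Tuple


def load_or_create_identifier(root: Path) -> str:
    """Return the ID stored under root, creating one only if none exists."""
    id_file = root / "identifier"  # assumed location of the persisted ID
    if id_file.exists():
        return id_file.read_text().strip()
    root.mkdir(parents=True, exist_ok=True)
    new_id = str(uuid.uuid4())  # assumption: the stored ID is UUID-like
    id_file.write_text(new_id)
    return new_id


def simulate_reinstall(root: Path, wipe_data_dir: bool) -> Tuple[str, str]:
    """Return the IDs seen before and after a simulated reinstall."""
    before = load_or_create_identifier(root)
    if wipe_data_dir:
        # Models an uninstaller that also removes runtime-created files.
        (root / "identifier").unlink()
    after = load_or_create_identifier(root)
    return before, after


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "Orbit"
        print(simulate_reinstall(root, wipe_data_dir=False))  # same ID twice
        print(simulate_reinstall(root, wipe_data_dir=True))   # two different IDs
```

Under these assumptions, a cloned disk or a reinstall that leaves the directory in place keeps handing out the old ID, which matches the behavior reported here.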
After a quick look: the uninstaller does remove some of the contents. Specifically, it removes the files/folders listed as part of the installer in orbit/pkg/packaging/windows_templates.go. The files that remain after uninstalling are:
Mode LastWriteTime Length Name
---- ------------- ------ ----
d----- 8/2/2022 3:46 PM osquery.db
d----- 8/2/2022 2:17 PM osquery_log
-a---- 8/2/2022 12:31 PM 36 identifier
-a---- 8/2/2022 3:46 PM 4 osquery.pid
There are several possible solutions:
- We could stop writing those files in Program Files and use %LOCALAPPDATA% instead (at least for the logs?).
- We could explicitly remove some (or all) of the files via RemoveFile.
- We could completely nuke the folder via util:RemoveFolderEx; for this we'd need to query the registry as described here.
- We could try to declare these files as part of the installation package along with the other files, although I don't know if this is entirely possible as they are created at runtime.
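To make the root cause concrete, here is a minimal sketch of an uninstaller that deletes only the entries it declared at install time. It is a simulation, not the WiX logic itself: `installed_by_msi.exe` is a hypothetical stand-in for the packaged files, while the runtime names come from the listing above.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List

DECLARED = ["installed_by_msi.exe"]          # hypothetical packaged file
RUNTIME_FILES = ["identifier", "osquery.pid"]  # created at runtime
RUNTIME_DIRS = ["osquery.db", "osquery_log"]   # created at runtime


def build_demo_tree(root: Path) -> Path:
    """Create a fake install directory with packaged and runtime entries."""
    root.mkdir(parents=True)
    for name in DECLARED + RUNTIME_FILES:
        (root / name).write_text("x")
    for name in RUNTIME_DIRS:
        (root / name).mkdir()
    return root


def simulate_uninstall(root: Path, declared: Iterable[str]) -> List[str]:
    """Delete only declared files; return the names that remain under root."""
    for name in declared:
        target = root / name
        if target.is_file():
            target.unlink()
    return sorted(p.name for p in root.iterdir())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = build_demo_tree(Path(tmp) / "Orbit")
        print(simulate_uninstall(root, DECLARED))
```

Everything the package never declared survives, which is why the options above either move those files elsewhere, remove them explicitly, or delete the whole folder.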
@chiiph Could you give any insight into whether this will be fixed in the 4.19 release?
As in, whether it'll be prioritized for that release? I'll add it to today's prioritization call.
IMO the "simplest"/least-risky solution to try out seems to be (3). (Solution (1) would require some refactoring/migration.)
@chiiph @lucasmrod @xpkoala Is this an agent issue or a platform issue? Please update the labels accordingly.
I'd say #agent, so label should be ok.
Note: I just hit an issue in a Windows VM related to this. I had Orbit with osquery 5.4.0 installed, ran the uninstaller, and then installed Orbit with osquery 5.5.1. Afterwards, Orbit failed to start with:
I0927 12:33:45.851253 7344 init.cpp:399] osquery initialized [version=5.5.1]
I0927 12:33:45.863526 7344 extensions.cpp:453] Could not autoload extensions: Cannot open file for reading: \Program Files\osquery\extensions.load
I0927 12:33:45.864913 7344 dispatcher.cpp:78] Adding new service: WatcherRunner (000002B90D2580A0) to thread: 7504 (000002B90D231CA0) in process 7928
E0927 12:33:45.916792 7504 shutdown.cpp:79] [Ref #1382] osqueryd has unsafe permissions: C:\Program Files\Orbit\bin\osqueryd\windows\stable\osqueryd.exe
I0927 12:33:45.935397 7344 dispatcher.cpp:149] Thread: 7344 requesting a stop
I0927 12:33:45.936741 7344 dispatcher.cpp:156] Service: 000002B90D2580A0 has been interrupted
I0927 12:33:45.936741 7344 dispatcher.cpp:122] Thread: 7344 requesting a join
I0927 12:33:45.939440 7344 dispatcher.cpp:140] Service thread: 000002B90D231CA0 has joined
I0927 12:33:45.939440 7344 dispatcher.cpp:144] Services and threads have been cleared
After removing C:\Program Files\Orbit and trying again, the problem was gone.
@marcosd4h I still appear to have the aforementioned files (plus a few extra) in my c:\Program Files\Orbit directory after uninstalling Fleet osquery. I was expecting this folder to be nuked. Should I expect the folder to be removed, or did we decide to take another avenue for solving this issue?
@xpkoala thanks for testing this! That's correct, the folder c:\Program Files\Orbit and its contents have to be nuked. Can you share the Windows version where you are testing this? Also, can you share the logs of the msiexec uninstallation? You can grab the logs by running msiexec /x fleet-osquery.msi /quiet /passive /lv logtest.txt. This is the command used during the GitHub CI tests here. Something else must be at play, as the Orbit folder is removed in the GitHub CI tests.
Microsoft Windows 10 Pro 10.0.19044 Build 19044
Logs provided via chat.
It seems that this issue reproduces when the --insecure flag is used on the installer. This flag causes osquery to take a long time to shut itself down due to a TLS timeout explained here.
When running msiexec /x from the command line, it can be observed that the osquery DB is locked (see screenshot below).
If the user waits a few seconds and then clicks retry, the issue is gone.
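The manual "wait, then retry" workaround can be approximated in a cleanup script. The sketch below is an assumption about how one might automate it, not what the installer does: the function name, attempt count, and delay are hypothetical.

```python
import shutil
import time
from pathlib import Path


def remove_tree_with_retry(path: Path, attempts: int = 5, delay_s: float = 2.0) -> bool:
    """Try to delete path, waiting between attempts while files are locked."""
    for attempt in range(attempts):
        try:
            shutil.rmtree(path)
            return True
        except FileNotFoundError:
            return True  # nothing left to remove
        except OSError:
            # On Windows a file held open by another process (for example a
            # database still in use) typically surfaces as PermissionError,
            # which is a subclass of OSError. Wait and try again.
            if attempt < attempts - 1:
                time.sleep(delay_s)
    return False
```

A call such as `remove_tree_with_retry(Path(r"C:\Program Files\Orbit"))` would need administrator rights, and the retry only helps if osquery does eventually release its lock.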
I'm exploring options for addressing this issue from the WiX installer, as it might require #8057 to be fixed first.
I've just pushed PR 8871 to handle this from the installer side. #8057 still needs to be investigated.
@xpkoala I've assigned the bug to you for issue QA.
Please share the install logs in case you find any issues, thanks!
Install: msiexec /i fleet-osquery.msi /quiet /passive /lv loginstall.txt
Uninstall: msiexec /x fleet-osquery.msi /quiet /passive /lv loguninstall.txt
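For repeated QA runs, the two commands above could be wrapped in a small helper. This is a rough sketch: the msiexec flags are copied from the commands in this comment, while the function names and the directory check are hypothetical.

```python
from pathlib import Path
from typing import List


def msiexec_command(action: str, msi: str, log: str) -> List[str]:
    """Build the install or uninstall command line used in this thread."""
    flag = {"install": "/i", "uninstall": "/x"}[action]
    return ["msiexec", flag, msi, "/quiet", "/passive", "/lv", log]


def orbit_dir_removed(orbit_dir: Path) -> bool:
    """Return True when the Orbit directory no longer exists."""
    return not orbit_dir.exists()


if __name__ == "__main__":
    print(msiexec_command("install", "fleet-osquery.msi", "loginstall.txt"))
    print(msiexec_command("uninstall", "fleet-osquery.msi", "loguninstall.txt"))
```

On a Windows host, each list could be passed to `subprocess.run(cmd, check=True)`, followed by `orbit_dir_removed(Path(r"C:\Program Files\Orbit"))` to confirm the cleanup.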
@marcosd4h Still seeing remnants at c:\program files\orbit. In case it matters, fleet-desktop was installed with the --insecure flag set.
Files remaining
- osquery.db (folder)
- osquery_log (folder)
- proxy (folder)
- identifier (file)
- secret-orbit-node-key.txt (file)
For what it's worth, all of those are files created at runtime. (So that's probably why the current installer is not deleting them.)
Hey @xpkoala, thanks for looking into this! I cannot reproduce the issue even when using the --insecure flag. There is something I might be missing here.
Anyway, I recently added support for custom actions to the MSI installer. This means that the installer is now able to execute complex logic through PowerShell at any point of the installer lifecycle. So, I've just added logic to stop Orbit and remove the files through a PowerShell custom action. The code is available on this branch here. Can you try this out to see if it fixes the issue in your environment?
Thanks!
@xpkoala I merged #9362 yesterday. Can you try reproducing the bug on your end using the latest on main? Thanks!
Uninstalling neat,
Files gone, no debris left,
Cleaning up the sky.