Host vitals: add software hash (SHA-256) for macOS apps

Open noahtalerman opened this issue 10 months ago • 11 comments

Goal

User story
As a Security Engineer,
I want to see the SHA-256 hash for each application executable on a macOS host
so that I can match against, and in some cases block, that application by hash in Santa.

Key result

Customer request

Original requests

  • #24210

Context

  • Product designer: @noahtalerman

Changes

Product

  • [ ] UI changes: Figma here
  • [ ] CLI (fleetctl) usage changes: No changes
  • [ ] YAML changes: No changes
  • [ ] REST API changes: PR here
  • [ ] Fleet's agent (fleetd) changes: Add a cdhash_sha256 column to the codesign fleetd table. Looks like we would switch the codesign call from --verbose to -vvv when hash is requested, and revise parsing to extract both team identifier and hash.
  • [ ] Activity changes: No changes
  • [ ] Permissions changes: No changes
  • [ ] Changes to paid features or tiers: Fleet Free and Fleet Premium
  • [ ] Transparency changes: No changes
  • [x] First draft of test plan added
  • [ ] Other reference documentation changes: No changes
  • [ ] Once shipped, requester has been notified
  • [ ] Once shipped, dogfooding issue has been filed
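
A minimal sketch of the parsing change described in the fleetd item above, written in Python for illustration rather than fleetd's Go. It assumes the verbose codesign output contains `TeamIdentifier=` and `CandidateCDHashFull sha256=` lines; the function name and the sample text (with placeholder values) are hypothetical, so check the line formats against real output on a macOS host before relying on them.

```python
def parse_codesign_output(output: str) -> dict:
    """Extract team identifier and SHA-256 cdhash from verbose codesign output."""
    info = {"team_identifier": "", "cdhash_sha256": ""}
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith("TeamIdentifier="):
            value = line.split("=", 1)[1]
            # codesign prints "not set" when there is no team identifier
            # (for example, ad-hoc signed apps).
            info["team_identifier"] = "" if value == "not set" else value
        elif line.startswith("CandidateCDHashFull sha256="):
            info["cdhash_sha256"] = line.split("=", 1)[1]
    return info


# Placeholder output; the path, identifier, and hash are made up.
SAMPLE = "\n".join([
    "Executable=/Applications/Example.app/Contents/MacOS/Example",
    "Identifier=com.example.Example",
    "CandidateCDHashFull sha256=" + "ab" * 32,
    "TeamIdentifier=ABCDE12345",
])

print(parse_codesign_output(SAMPLE))
```

Scanning line by line keeps the parser tolerant of extra lines that appear at higher verbosity, which matters if the same call has to serve both the team identifier and the hash.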

Engineering

  • [ ] Add another variation to the macOS software vitals query that, if the column exists in codesign, includes it. Otherwise fall back to the other two variants (one with codesign available, one without).
  • [ ] Confirm that the above software vitals query doesn't get denylisted, even on machines with a bunch of software.
  • [ ] Test plan is finalized
  • [ ] Database schema migrations: Add executable_sha256 column to host_software_installed_paths
  • [ ] Load testing: Yes. Need to confirm that software ingestion load isn't significantly affected by inclusion of hash, including when we're backfilling hashes (similar to the effects when we backfilled team identifier).
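
The fallback order in the first Engineering item can be sketched as below. This is a Python illustration, not Fleet's Go code; the variant names and the two boolean inputs (which would come from discovery queries) are hypothetical.

```python
def pick_software_query_variant(has_codesign_table: bool, has_hash_column: bool) -> str:
    """Return which macOS software vitals query variant to run on a host."""
    if has_codesign_table and has_hash_column:
        # Newer fleetd: codesign table exposes the SHA-256 column.
        return "codesign_with_hash"
    if has_codesign_table:
        # Older fleetd: codesign table exists but only has team identifier.
        return "codesign_team_identifier_only"
    # Vanilla osquery: no codesign table at all.
    return "no_codesign"


for host, flags in {
    "new fleetd": (True, True),
    "old fleetd": (True, False),
    "vanilla osquery": (False, False),
}.items():
    print(host, "->", pick_software_query_variant(*flags))
```

Checking the most capable variant first means hosts only run the hash-bearing query when the column is known to exist, which is what keeps older agents working without hashes.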

ℹ️  Please read this issue carefully and understand it. Pay special attention to UI wireframes, especially "dev notes".

QA

Risk assessment

  • Requires load testing: Yes
  • Risk level: High
  • Risk description: Need to make sure that vitals continue to work (without hashes) on vanilla osquery and older versions of fleetd/fleetd-tables, and need to make sure vitals queries don't get denylisted

Test plan

  • [ ] UI: Head to a macOS host's Host details page and, on the Software tab, select Actions > Show details for a macOS app ("source": "apps" in the GET /hosts/:id/software API). Verify that the hash is presented for each version of the app installed. Confirm that the hash for the associated install path matches what we get from codesign.
  • [ ] API: Hit the GET /hosts/:id/software API and verify that the new hash_sha256 is included under installed_versions.signature_information array for each macOS app ("source": "apps")
  • [ ] Confirm that software vitals work properly for a vanilla osquery host (with no hash)
  • [ ] Confirm that software vitals work properly for non-cask Homebrew packages (with no hash)
  • [ ] Confirm that software vitals work properly for Linux and Windows hosts (with no hash)
  • [ ] Confirm that software vitals work properly for a vanilla osquery host with the fleetd tables extension (WITH hash for apps)
  • [ ] Confirm that software vitals work properly for older fleetd on macOS (with no hash)
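
The API check in the test plan could be scripted along these lines. This is a hedged Python sketch: the response shape is assumed from the field names mentioned in this issue (`source`, `installed_versions`, `signature_information`, `hash_sha256`), and the sample payload is made up, so compare it against a real GET /hosts/:id/software response.

```python
def apps_missing_hash(response: dict) -> list:
    """Return names of macOS apps with an installed version lacking hash_sha256."""
    missing = []
    for sw in response.get("software", []):
        if sw.get("source") != "apps":
            continue  # only macOS apps are expected to carry a hash
        for version in sw.get("installed_versions") or []:
            sigs = version.get("signature_information") or []
            if not sigs or any(not s.get("hash_sha256") for s in sigs):
                missing.append(sw.get("name"))
                break  # report each app at most once
    return missing


# Hypothetical payload for illustration only.
SAMPLE_RESPONSE = {
    "software": [
        {"name": "Example.app", "source": "apps", "installed_versions": [
            {"version": "1.0", "signature_information": [
                {"installed_path": "/Applications/Example.app",
                 "team_identifier": "ABCDE12345",
                 "hash_sha256": "ab" * 32}]}]},
        {"name": "NoHash.app", "source": "apps", "installed_versions": [
            {"version": "2.0", "signature_information": [
                {"installed_path": "/Applications/NoHash.app",
                 "team_identifier": "",
                 "hash_sha256": None}]}]},
        {"name": "wget", "source": "homebrew_packages", "installed_versions": [
            {"version": "1.21"}]},
    ]
}

print(apps_missing_hash(SAMPLE_RESPONSE))
```

An empty list is the pass condition on a host running fleetd with the new column; on vanilla osquery or older fleetd, every app is expected to show up in the list.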

Testing notes

Confirmation

  1. [ ] Engineer: Added comment to user story confirming successful completion of test plan.
  2. [ ] QA: Added comment to user story confirming successful completion of test plan.

noahtalerman avatar Jan 17 '25 15:01 noahtalerman

@noahtalerman any chance we can still get these designs done in this sprint?

zayhanlon avatar Apr 03 '25 16:04 zayhanlon

FYI @zayhanlon removed P2 from this because we want to reserve priority labels for urgent business needs.

In this case, we'll bring this story through expedited drafting, get it estimated, and plan on bringing it into the next sprint.

I think I suggested putting a P2 on it. Whoops from me.

noahtalerman avatar Apr 09 '25 21:04 noahtalerman

Yup lol, ok no worries @noahtalerman

zayhanlon avatar Apr 09 '25 21:04 zayhanlon

Just did some research on this. My guess is that the current framing of this ticket expects the hash value to be pulled from the hash table, which supports several hash types, including sha256. The catch is that the table hashes each file on the fly, which would all but guarantee poor performance on the host while inventorying, and would likely get the query denylisted.

By contrast, the signature table is lighter-weight from my understanding, and with osquery 5.15 (specifically https://github.com/osquery/osquery/pull/8471) we can have that table only pull executable hash for better (potentially acceptable) performance, which I believe is sufficient here. The apps part of that query would look something like this:

SELECT
  name AS name,
  cdhash AS hash,
  COALESCE(NULLIF(bundle_short_version, ''), bundle_version) AS version,
  bundle_identifier AS bundle_identifier,
  '' AS extension_id,
  '' AS browser,
  'apps' AS source,
  '' AS vendor,
  last_opened_time AS last_opened_at,
  apps.path AS installed_path
FROM apps
LEFT JOIN signature ON signature.path = apps.path AND signature.hash_executable = FALSE AND signature.hash_resources = FALSE;

(thx @allenhouchins for your machine being the target of the above live query, and @lucasmrod for the correction on the query)

The catch is that this hash is sha1 rather than sha256, so wouldn't work on this as spec'd. If we wanted the beefier hash we'd need to either modify osquery again or add columns to the codesign table that effectively port the hashing logic from C++ to Go, but with sha256 rather than sha1. So if sha1 is sufficient for the customer (e.g. if they're already using that table internally and just want it rolled up into software inventory) we should use that.
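
For anyone comparing values by eye, a quick sanity check on digest size can help. The Python sketch below only uses standard `hashlib` facts (SHA-1 digests are 40 hex characters, SHA-256 digests are 64); the helper name is hypothetical, and length alone does not prove which algorithm produced a value, since a longer digest can be truncated.

```python
import hashlib


def digest_size_label(hex_digest: str) -> str:
    """Label a hex string by digest length; not proof of the algorithm used."""
    try:
        int(hex_digest, 16)  # reject non-hex input
    except ValueError:
        return "not hex"
    return {40: "sha1-sized", 64: "sha256-sized"}.get(len(hex_digest), "unknown")


data = b"example bytes"
print(digest_size_label(hashlib.sha1(data).hexdigest()))
print(digest_size_label(hashlib.sha256(data).hexdigest()))
```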

More context: https://github.com/fleetdm/confidential/issues/8750#issuecomment-2528257909

iansltx avatar Apr 10 '25 13:04 iansltx

If cached sha1 is acceptable here, performance will be acceptable with that table, and we'll add another discovery query to make sure we're only pulling hash when we can do it efficiently. Given that this has been in osquery for a couple releases, I don't think we have to backport a more expensive version of this.

iansltx avatar Apr 10 '25 14:04 iansltx

Confirmed with the customer that they need sha256 rather than sha1. Updated the story to take this into account. Actually looks like the codesign command supports this, and we already use it in the codesign table, so as long as performance is reasonable this shouldn't be as large of a lift as I thought for changes that will require fleetd work.

Sending out estimates to backend folks for this as a full-stack task since the frontend changes for this are trivial.

iansltx avatar Apr 11 '25 04:04 iansltx

Hey team! Please add your planning poker estimate with Zenhub @jahzielv @ksykulev

iansltx avatar Apr 11 '25 04:04 iansltx

How do we plan to test this at scale, since osquery-perf doesn't actually run queries?

jahzielv avatar Apr 11 '25 11:04 jahzielv

@jahzielv We'll need to tweak osquery-perf to include hashes on software vitals for the apps table, potentially behind a flag so we can simulate fleetd or vanilla. We don't need to test the real queries at scale, but would need to test on a few real machines to make sure swapping codesign usages doesn't get us denylisted.
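
A rough Python sketch of the flag idea above: generate simulated software rows for the apps source, with the hash included only when a flag is set. osquery-perf itself is written in Go; the function name, row keys, and the way the stand-in hash is derived are all assumptions for illustration.

```python
import hashlib


def fake_app_rows(count: int, include_hash: bool) -> list:
    """Build simulated 'apps' rows, optionally with a stand-in SHA-256 value."""
    rows = []
    for i in range(1, count + 1):
        path = f"/Applications/FakeApp{i}.app"
        row = {"name": f"FakeApp{i}.app", "source": "apps", "installed_path": path}
        if include_hash:
            # Deterministic placeholder derived from the path, not a real cdhash.
            row["cdhash_sha256"] = hashlib.sha256(path.encode()).hexdigest()
        rows.append(row)
    return rows


fleetd_like = fake_app_rows(3, include_hash=True)    # simulates fleetd with the new column
vanilla_like = fake_app_rows(3, include_hash=False)  # simulates vanilla osquery
print(fleetd_like[0])
print(vanilla_like[0])
```

Deriving the placeholder from the path keeps repeated check-ins stable for a given simulated host, so load tests exercise ingestion and backfill rather than constant churn.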

iansltx avatar Apr 11 '25 14:04 iansltx

Sounds like we may need to test this query (via live query?) to determine whether the software query will get denylisted. If there are problems, this estimate will likely increase greatly to accommodate finding alternatives.

mostlikelee avatar Apr 11 '25 16:04 mostlikelee

@mostlikelee Yes, once the fleetd-tables changes (which look quick) are live. I'm cautiously optimistic, but if we can't do this efficiently with the codesign utility, the remaining points will be consumed by a research task to figure out what we could use instead.

iansltx avatar Apr 11 '25 16:04 iansltx

pushing this to the 4.69.0 release in order to increase test surface. We want to have a higher confidence that the sha queries do not get denylisted resulting in missing software hashes.

mostlikelee avatar May 09 '25 17:05 mostlikelee

holding until after 1.42 is released

mostlikelee avatar May 12 '25 15:05 mostlikelee

osquery uses a watchdog-style system to dynamically add queries to the denylist. Here are the default limits at the time of posting this comment:

https://github.com/osquery/osquery/blob/master/osquery/core/watcher.cpp#L74-L93

const WatchdogLimitMap kWatchdogLimits = {
    // Maximum MB worker can privately allocate.
    {WatchdogLimitType::MEMORY_LIMIT, {200, 100, 10000}},

    // % of (User + System) CPU time worker can utilize
    // for LATENCY_LIMIT seconds.
    {WatchdogLimitType::UTILIZATION_LIMIT, {10, 5, 100}},

    // Number of seconds the worker should run, else consider the exit fatal.
    {WatchdogLimitType::RESPAWN_LIMIT, {4, 4, 1000}},

    // If the worker respawns too quickly, backoff on creating additional.
    {WatchdogLimitType::RESPAWN_DELAY, {5, 5, 1}},

    // Seconds of tolerable UTILIZATION_LIMIT sustained latency.
    {WatchdogLimitType::LATENCY_LIMIT, {12, 6, 1000}},

    // How often to poll for performance limit violations.
    {WatchdogLimitType::INTERVAL, {3, 3, 3}},
};

I don't know if our clients override these or not, but my guess is we can probably test this with the default values.
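
To reason about whether a query might trip these defaults, here is a simplified Python model of the CPU check. It assumes the three values in each entry correspond to the normal, restrictive, and disabled watchdog levels, and that a violation means utilization stayed above the limit for the full latency window; the real watchdog measures CPU time differently, so treat this as a mental model only. All names are hypothetical.

```python
# Values copied from the limits above: (normal, restrictive, disabled).
WATCHDOG_LIMITS = {
    "memory_mb": (200, 100, 10000),
    "utilization_pct": (10, 5, 100),
    "latency_s": (12, 6, 1000),
    "interval_s": (3, 3, 3),
}
LEVELS = {"normal": 0, "restrictive": 1, "disabled": 2}


def limit(name: str, level: str = "normal") -> int:
    return WATCHDOG_LIMITS[name][LEVELS[level]]


def exceeds_cpu_limit(cpu_samples_pct, level: str = "normal") -> bool:
    """cpu_samples_pct holds one utilization sample per polling interval."""
    sustained = 0
    for sample in cpu_samples_pct:
        if sample > limit("utilization_pct", level):
            sustained += limit("interval_s", level)
            if sustained >= limit("latency_s", level):
                return True
        else:
            sustained = 0  # assumption: dropping under the limit resets the window
    return False


print(exceeds_cpu_limit([50, 50, 50, 50]))
print(exceeds_cpu_limit([50, 50, 5, 50, 50]))
```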

ksykulev avatar May 14 '25 19:05 ksykulev

I did some performance testing and validation in the original PR (https://github.com/fleetdm/fleet/pull/28574#issuecomment-2887288711 and https://github.com/fleetdm/fleet/pull/28574#issuecomment-2887817339). Will open a new PR this week to get the changes in.

ksykulev avatar May 17 '25 17:05 ksykulev

For future changes to codesign and performance testing, here is the script I used to generate 1000 apps in /Applications:

#!/bin/bash

# Number of fake apps to generate
NUM_APPS=1000
TARGET_DIR="/Applications"

if [[ $EUID -ne 0 ]]; then
  echo "Please run as root: sudo $0"
  exit 1
fi

echo "Generating and signing $NUM_APPS fake applications in $TARGET_DIR..."

for i in $(seq 1 $NUM_APPS); do
  APP_NAME="FakeApp${i}.app"
  APP_PATH="${TARGET_DIR}/${APP_NAME}"
  BIN_PATH="${APP_PATH}/Contents/MacOS/FakeApp${i}"

  # App bundle structure
  mkdir -p "${APP_PATH}/Contents/MacOS"
  mkdir -p "${APP_PATH}/Contents/Resources"

  # Dummy binary
  echo -e "#!/bin/bash\necho \"Running FakeApp${i}\"" > "$BIN_PATH"
  chmod +x "$BIN_PATH"

  # Info.plist
  cat > "${APP_PATH}/Contents/Info.plist" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
 "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>CFBundleName</key>
  <string>FakeApp${i}</string>
  <key>CFBundleIdentifier</key>
  <string>com.fake.FakeApp${i}</string>
  <key>CFBundleExecutable</key>
  <string>FakeApp${i}</string>
  <key>CFBundlePackageType</key>
  <string>APPL</string>
  <key>CFBundleVersion</key>
  <string>1.0</string>
</dict>
</plist>
EOF

  # Sign the app with an ad-hoc identity
  codesign --force --sign - --deep --timestamp=none "${APP_PATH}" > /dev/null 2>&1

done

echo "Done. $NUM_APPS signed fake apps created in $TARGET_DIR."

ksykulev avatar May 17 '25 17:05 ksykulev

@ksykulev I am following the test steps but I'm not seeing the hash value for this software:

Image

Am I looking in the wrong place?

Note: I am using fleetd 1.42

jmwatts avatar May 22 '25 15:05 jmwatts

@jmwatts You'll need to use local TUF for this and build a new fleetd.

iansltx avatar May 22 '25 15:05 iansltx

@iansltx should this be tagged with fleetd 1.43.0 or is there a separate ticket for that?

jmwatts avatar May 22 '25 15:05 jmwatts

@jmwatts This should be split into detail query (4.69) and fleetd (1.43) subtasks.

iansltx avatar May 22 '25 16:05 iansltx

@iansltx I used local TUF to build a new fleetd but I'm still not seeing it show up in the Host >> Software >> Show details.

I can see the new column when I run a query on the codesign table:

Image

So it seems like I am able to access that value, but it's not populating in the UI:

Image

jmwatts avatar May 22 '25 19:05 jmwatts

@jmwatts Do we get the response for this in the API? Trying to figure out what all we still need to build here; may be missing a piece here.

iansltx avatar May 22 '25 19:05 iansltx

Image

Yep, it's in the API response

jmwatts avatar May 22 '25 19:05 jmwatts

QA Notes

UI: Head to a macOS host's Host details page and, on the Software tab, select Actions > Show details for a macOS app ("source": "apps" in the GET /hosts/:id/software API). Verify that the hash is presented for each version of the app installed.

  • [x] Confirm that the hash for the associated install path matches what we get from codesign.

  • [x] API: Hit the GET /hosts/:id/software API and verify that the new hash_sha256 is included under installed_versions.signature_information array for each macOS app ("source": "apps")

  • [x] Confirm that software vitals work properly for a vanilla osquery host (with no hash)

  • [x] Confirm that software vitals work properly for non-cask Homebrew packages (with no hash)

  • [x] Confirm that software vitals work properly for Linux and Windows hosts (with no hash)

  • [x] Confirm that software vitals work properly for a vanilla osquery host with the fleetd tables extension (WITH hash for apps)

  • [x] Confirm that software vitals work properly for 1.42.0 fleetd on macOS (with no hash)

NOTE: Vitals refetch is working for the above scenarios, but viewing the software details is broken (see #29513).

jmwatts avatar May 28 '25 16:05 jmwatts

@noahtalerman @eugkuo @lukeheath @zayhanlon Found an issue: we did not account for multiple install paths. Moving this to expedited drafting to redesign the UI to account for this.

mostlikelee avatar May 29 '25 14:05 mostlikelee

@mostlikelee can you outline the issue in the ticket and what design needs are required so that we can address them?

cc @RachelElysia

eugkuo avatar May 29 '25 14:05 eugkuo

@eugkuo Short version: Hash is per installed path, not per version; each installed path can have a different hash. Admins will want to be able to copy this hash so they can paste it into other security tooling.
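
To make the shape concrete, here is a small Python sketch of one version installed at two paths, each with its own hash. The function name, tuple layout, and sample values are hypothetical; only the idea (hash keyed by installed path, grouped under a version) comes from the comment above.

```python
from collections import defaultdict


def group_paths_by_version(installs):
    """installs: iterable of (version, installed_path, hash_sha256) tuples."""
    grouped = defaultdict(list)
    for version, path, sha in installs:
        grouped[version].append({"installed_path": path, "hash_sha256": sha})
    return dict(grouped)


# Made-up rows: the same version at two paths, with different hashes.
INSTALLS = [
    ("1.0", "/Applications/Example.app", "aa" * 32),
    ("1.0", "/Users/someone/Applications/Example.app", "bb" * 32),
    ("2.0", "/Applications/Example 2.app", "cc" * 32),
]

for version, paths in group_paths_by_version(INSTALLS).items():
    print(version, paths)
```

Keeping the hash next to its path, rather than on the version, is what lets the UI offer a copy action per path.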

iansltx avatar May 29 '25 14:05 iansltx

@iansltx @RachelElysia @mostlikelee

Okee. Talked this over with @mostlikelee and have updated the design to reflect multiple paths and shas within a single version.

eugkuo avatar May 29 '25 17:05 eugkuo

@jmwatts since this is going through expedited drafting, can you review the design changes to make sure it makes sense?

mostlikelee avatar May 29 '25 20:05 mostlikelee

@mostlikelee Aesthetically: The new design looks... cluttered to me. I think it's because the values are in-line with the headers. It's also missing an example of software with associated install info. Screenshots for comparison:

Now:

Image

New design:

Image

Now with tabs, includes install details:

Image

Functionally: Will it scroll within the modal if somehow there are tons of installed versions? Or will it just be a modal that gets longer the more versions there are? Other than that, it "makes sense".

jmwatts avatar May 29 '25 20:05 jmwatts