Syft-generated Windows SBOM unable to identify format

Open jijames opened this issue 3 years ago • 7 comments

What happened: A JSON-formatted SBOM generated with syft on Windows produced an "unable to identify format" error when loaded into grype.

./grype sbom:windows-sbom.json 
 ✔ Vulnerability DB        [updated]
1 error occurred:
	* failed to catalog: unable to decode sbom: unable to identify format

What you expected to happen: Grype successfully loads and analyzes the Windows SBOM created by Windows syft.

How to reproduce it (as minimally and precisely as possible):

  1. Produce an SBOM from directory scan on Windows: .\syft.exe dir:"C:\Program Files" -o json > X:\windows-sbom.json
  2. Load the Windows SBOM: .\grype.exe sbom:X:\windows-sbom.json

Anything else we need to know?:

  • Tested with both the Windows (release) and Linux (manual) build of grype. Using the syft windows release build.
  • Syft output from a Linux system loads correctly with the same setup.
  • Example syft output attached. windows-sbom.json.zip

Environment:

  • Output of grype version:

    • Linux: ./grype version
        Application: grype
        Version: 0.27.3
        Syft Version: v0.33.0
        BuildDate: 2021-12-16T15:11:16Z
        GitCommit: 4f964c4ee26ad01a80b8bcffb6bf23c0afb71d09
        GitTreeState: clean
        Platform: linux/amd64
        GoVersion: go1.16.12
        Compiler: gc
        Supported DB Schema: 3

    • Windows grype: .\grype.exe version
        Application: grype
        Version: 0.27.3
        Syft Version: v0.33.0
        BuildDate: 2021-12-16T15:11:16Z
        GitCommit: 4f964c4ee26ad01a80b8bcffb6bf23c0afb71d09
        GitTreeState: clean
        Platform: windows/amd64
        GoVersion: go1.16.12
        Compiler: gc
        Supported DB Schema: 3

    • Windows syft: .\syft.exe version
        Application: syft
        Version: 0.33.0
        BuildDate: 2021-12-16T14:16:33Z
        GitCommit: a27907659d6c43f5bef2f76a9e381f2bf697a64f
        GitTreeState: clean
        Platform: windows/amd64
        GoVersion: go1.16.12
        Compiler: gc

  • OS (e.g: cat /etc/os-release or similar): Linux Mint 20.2 / Ubuntu Focal; Windows 11

jijames avatar Dec 22 '21 16:12 jijames

It looks like validation fails around syftjson validator.go line 20: https://github.com/anchore/syft/blob/b7979dbc7d62c195a662c022c0b8c4ecbc069958/internal/formats/syftjson/validator.go#L20

jijames avatar Dec 22 '21 16:12 jijames

Interesting... I looked at the attached SBOM, and it looks like the JSON data is preceded by a two-byte byte order mark (0xFFFE).

This might be something we need to account for in our format validation/decoding...

luhring avatar Dec 22 '21 18:12 luhring

Tried removing the first two bytes (0xfffe) - did not resolve the issue.

Might be an encoding issue. This is from a Linux scan:

00000000: 7b0a 2022 6172 7469 6661 6374 7322 3a20 {. "artifacts":

This is from the windows scan (first two bytes removed):

00000000: 7b00 0d00 0a00 2000 2200 6100 7200 7400  {..... .".a.r.t.
00000010: 6900 6600 6100 6300 7400 7300 2200 3a00  i.f.a.c.t.s.".:.
file -i linux-sbom.json 
tsurugi-sbom.json: text/plain; charset=utf-8
file -i windows-sbom.json
windows-sbom.json: text/plain; charset=utf-16le
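The difference between the two hex dumps can also be checked programmatically. The following is a minimal Go sketch, not code from grype or syft: `sniffEncoding` is a hypothetical helper that guesses the encoding from the leading bytes, first by byte order mark and then by the ASCII-plus-NUL pattern seen in the Windows dump above. It is a rough heuristic and does not replace `file -i`.

```go
package main

// sniffEncoding guesses a text encoding from the first bytes of a file.
// Hypothetical helper for illustration; it only distinguishes the cases
// discussed in this thread and ignores UTF-32.
func sniffEncoding(b []byte) string {
	switch {
	case len(b) >= 3 && b[0] == 0xEF && b[1] == 0xBB && b[2] == 0xBF:
		return "utf-8-bom" // UTF-8 with a byte order mark
	case len(b) >= 2 && b[0] == 0xFF && b[1] == 0xFE:
		return "utf-16le" // UTF-16 little-endian BOM
	case len(b) >= 2 && b[0] == 0xFE && b[1] == 0xFF:
		return "utf-16be" // UTF-16 big-endian BOM
	case len(b) >= 2 && b[0] != 0 && b[1] == 0:
		return "utf-16le" // no BOM: a non-NUL byte followed by NUL, e.g. 7b 00
	case len(b) >= 2 && b[0] == 0 && b[1] != 0:
		return "utf-16be" // no BOM: NUL followed by a non-NUL byte
	default:
		return "utf-8" // assumption: anything else is treated as UTF-8/ASCII
	}
}
```

Passing the first bytes of the Linux file (7b 0a 20 22) falls through to the default case, while the Windows file matches a UTF-16LE case both with its BOM and with the BOM removed (7b 00 0d 00).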

Convert utf-16le to utf-8:
iconv -f UTF-16LE -t UTF-8 windows-sbom2.json -o windows-sbom3.json

Remove first three bytes (after conversion):
dd if=windows-sbom3.json of=windows-sbom4.json ibs=2 skip=1

Header result:
00000000: 7b20 2261 7274 6966 6163 7473 223a 205b { "artifacts": [

Grype loads successfully:

./grype sbom:~/Desktop/SharedVM/windows-sbom4.json
 ✔ Vulnerability DB        [no update available]
New version of grype is available: 0.28.0
 ✔ Scanned image           [0 vulnerabilities]
No vulnerabilities found

Likely solution

  1. Support utf-16le OR force utf-8 (maybe because I used a pipe?)
  2. Accept a leading two-byte BOM (0xFFFE) OR, after conversion, a leading three-byte BOM (0xEFBBBF); valid JSON starts with 0x7B:
     00000000: efbb bf7b 2022 6172 7469 6661 6374 7322 ...{ "artifacts"
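
Both options could be handled in one normalization step before format detection. This is a minimal sketch using only the Go standard library, not the fix adopted by grype or syft: `normalizeToUTF8` and `decodeUTF16` are hypothetical names, and the sketch assumes UTF-16 input always carries a BOM.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"unicode/utf16"
)

var (
	bomUTF8    = []byte{0xEF, 0xBB, 0xBF}
	bomUTF16LE = []byte{0xFF, 0xFE}
	bomUTF16BE = []byte{0xFE, 0xFF}
)

// normalizeToUTF8 returns the input as BOM-free UTF-8.
// Hypothetical helper: input without a BOM is returned unchanged.
func normalizeToUTF8(b []byte) ([]byte, error) {
	switch {
	case bytes.HasPrefix(b, bomUTF8):
		// Already UTF-8; just drop the three BOM bytes.
		return b[len(bomUTF8):], nil
	case bytes.HasPrefix(b, bomUTF16LE):
		return decodeUTF16(b[len(bomUTF16LE):], binary.LittleEndian)
	case bytes.HasPrefix(b, bomUTF16BE):
		return decodeUTF16(b[len(bomUTF16BE):], binary.BigEndian)
	default:
		return b, nil
	}
}

// decodeUTF16 converts BOM-free UTF-16 bytes in the given byte order to UTF-8.
func decodeUTF16(b []byte, order binary.ByteOrder) ([]byte, error) {
	if len(b)%2 != 0 {
		return nil, errors.New("utf-16 input has an odd number of bytes")
	}
	units := make([]uint16, len(b)/2)
	for i := range units {
		units[i] = order.Uint16(b[2*i:])
	}
	// utf16.Decode handles surrogate pairs; string() re-encodes as UTF-8.
	return []byte(string(utf16.Decode(units))), nil
}
```

A decoder could call such a function on the raw SBOM bytes before checking whether the document starts with 0x7B. Reading the whole file into memory keeps the sketch short; a streaming reader would be preferable for large SBOMs.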

Attached all conversions for reference: window-sboms.zip

jijames avatar Dec 23 '21 02:12 jijames

The issue seems to be caused by using redirect (>) on Windows 11.

Redirect results in a UTF-16LE encoded file (not accepted by grype):
.\syft.exe dir:"C:\Program Files\Go" -o json > windows-sbom.json

The grype instructions recommend redirect.

The --file option results in a US-ASCII encoded file (accepted by grype):
.\syft.exe dir:"C:\Program Files\Go" --file windows-sbom.json -o json

00000000: 7b0a 2022 6172 7469 6661 6374 7322 3a20 {. "artifacts":

Example of file option output attached for reference. windows-file-opt.zip

jijames avatar Dec 23 '21 03:12 jijames

@jijames This is fascinating, thanks for the thorough investigation! So if I'm understanding the request here, it's that Grype be able to read SBOM files encoded as UTF-16LE?

luhring avatar Feb 02 '22 21:02 luhring

The problem can be generalized to "read SBOM from piped windows output." In my case that produced UTF-16LE, but I don't know if that's always the case for Windows 11 EN.

Note that specifying the output file in syft resulted in a usable file. But Syft has other Windows related... challenges not related to grype.

jijames avatar Feb 02 '22 22:02 jijames

If anyone has some experience with Windows and text encoding in Go, this might be a great first issue to work on!

tgerla avatar Aug 10 '23 20:08 tgerla