Inconsistency in Grype's template processing data models - not the same as the json output format

Open josmartin opened this issue 2 years ago • 6 comments

What happened: I was using the template output mode from grype and read on the main documentation page that ...

"Grype's template processing uses the same data models as the json output format, so if you're wondering what data is available as you author a template, you can use the output from grype -o json as a reference."

I had previously exported the full scan result from my container and noted that the path to the artifact being scanned would be artifact.locations[range]/path (where locations is an array of objects).

Thus I wrote a template file that had the following variable definition in it ...

{{(index .Artifact.Locations 0).Path}}

This threw the following error

ERROR unable to show grype-vulnerability-scanning-finished event: unable to show vulnerability report: unable to execute supplied 
template: template: scan.template:3:128: executing "scan.template" at <0>: can't evaluate field Path in type source.Coordinates

After some debugging I discovered that the actual name of the data should have been RealPath (since a Location object is apparently defined as a RealPath and a Layer)
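
The mismatch can be reproduced outside Grype. The following is a minimal sketch, not Grype's actual code: the Artifact type, the render helper, and the sample values are hypothetical stand-ins, and only the Coordinates fields and tags are taken from this thread. It shows that encoding/json emits the tag name (path) while text/template resolves the Go field name (RealPath).

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"text/template"
)

// Coordinates has a Go field name (RealPath) that differs from its JSON key (path).
type Coordinates struct {
	RealPath     string `json:"path"`
	FileSystemID string `json:"layerID"`
}

// Artifact is a hypothetical, simplified stand-in for the real artifact model.
type Artifact struct {
	Name      string        `json:"name"`
	Locations []Coordinates `json:"locations"`
}

// render executes tmplText directly against the Go value, so field
// lookups use Go field names and ignore the json struct tags.
func render(tmplText string, data any) (string, error) {
	t, err := template.New("scan.template").Parse(tmplText)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, data); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	a := Artifact{
		Name:      "openssl",
		Locations: []Coordinates{{RealPath: "/usr/lib/libssl.so", FileSystemID: "sha256:abc"}},
	}

	// The JSON output uses the tag names: "path" and "layerID".
	raw, _ := json.Marshal(a)
	fmt.Println(string(raw))

	// Using the JSON name in a template fails at execution time.
	_, err := render("{{(index .Locations 0).Path}}", a)
	fmt.Println("Path error:", err)

	// Using the Go field name works.
	out, err := render("{{(index .Locations 0).RealPath}}", a)
	fmt.Println("RealPath:", out, err)
}
```

The first template fails with a "can't evaluate field Path" error of the same kind as the one above, while the second renders the path.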

What you expected to happen: I expected the template data model to exactly match the output from grype -o json; however, it appears there are differences.

How to reproduce it (as minimally and precisely as possible): I'll assume you have an SBOM from some container as a reference, called sbom.json. This SBOM must contain at least one vulnerability.

  1. Create a template file called scan.template with content
 Package , Version Installed , Vulnerability ID , Severity, Found In
{{- range .Matches}}
{{.Artifact.Name}} , {{.Artifact.Version}} , {{.Vulnerability.ID}} , {{.Vulnerability.Severity}}, {{(index .Artifact.Locations 0).Path}}
{{- end}}
  2. Run grype sbom:sbom.json -o json and note that the JSON output suggests there is an Artifact.Locations.Path field in the data model
  3. Now run grype sbom:sbom.json -o template -t scan.template and see that it fails
  4. Change Path to RealPath in the template and note that it now works

Anything else we need to know?:

Environment:

  • Output of grype version:
$ /opt/grype/grype version
Application:          grype
Version:              0.51.0
Syft Version:         v0.59.0
BuildDate:            2022-10-17T23:56:55Z
GitCommit:            4cda526992d5003dcbab68c9a7479a653dfde008
GitDescription:       v0.51.0
Platform:             linux/amd64
GoVersion:            go1.18.7
Compiler:             gc
Supported DB Schema:  5
  • OS (e.g: cat /etc/os-release or similar):
$ cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 10 (buster)"
NAME="Debian GNU/Linux"
VERSION_ID="10"
VERSION="10 (buster)"
VERSION_CODENAME=buster
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

josmartin, Oct 21 '22 12:10

@josmartin, you're definitely right: there are some inconsistencies here. As you pointed out, the structure of Coordinates is:

type Coordinates struct {
	RealPath     string `json:"path"`
	FileSystemID string `json:"layerID"`
}

... so in a template you'd need to use RealPath instead of Path, FileSystemID instead of layerID, and so on.

I wonder: if we added a command that only printed out the current data model, or possibly another output format like json-template that used all the field names exactly as they would appear in the templates, would that be a viable solution?
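
As a rough illustration of what "printing the data model" could look like, here is a minimal sketch that uses reflection to pair each JSON key with the Go field name a template would need. The fieldMap helper is hypothetical and not part of Grype. It only handles a flat struct value, not pointers or nested types.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// Coordinates is the struct shown above.
type Coordinates struct {
	RealPath     string `json:"path"`
	FileSystemID string `json:"layerID"`
}

// fieldMap (hypothetical helper) returns JSON key -> Go field name for the
// exported fields of a struct value. It panics if v is not a struct.
func fieldMap(v any) map[string]string {
	t := reflect.TypeOf(v)
	out := make(map[string]string)
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if !f.IsExported() {
			continue
		}
		// Keep only the name part of the tag, dropping options like ",omitempty".
		key := strings.Split(f.Tag.Get("json"), ",")[0]
		if key == "-" {
			continue // encoding/json skips these fields
		}
		if key == "" {
			key = f.Name // encoding/json falls back to the Go field name
		}
		out[key] = f.Name
	}
	return out
}

func main() {
	m := fieldMap(Coordinates{})
	fmt.Println("path    ->", m["path"])
	fmt.Println("layerID ->", m["layerID"])
}
```

A real version would need to recurse into nested structs and slices. This sketch only shows that the mapping can be derived from the struct tags.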

kzantow, Oct 21 '22 15:10

@kzantow Thanks for the confirmation and quick response.

The below is my analysis - please interpret it understanding that I really don't know about the general requirements of Grype (I'm just a software developer who happens to use Grype at the moment) and its design / user facing expectations.

To your question ... as a user of Grype and templating, I think I simply want something that tells me what the data model is.

However from a software development perspective it was awesome that the template data model was the same as that of the json output, since anything I would have learned in perusing / using the json output could be fed back into my understanding. In addition, by guaranteeing that the data model for the template was the same as that for the json output you allowed me to re-use any templates that I made for Grype in any other go code that subsequently interpreted the json output of Grype. This pipeline behaviour is (in my opinion) something good.

So - should you introduce a json-template output format or make the template data model match the json output? The software developer in me argues that if you introduce json-template as well as json it's

  • more stuff to maintain
  • confusing for a user to decide which one (json or json-template) they need
  • impossible for a user to know which one someone else might want, and they might need to provide both if they have 2 subsequent users who need both

But I agree that my problem (asking the question "What is the template data model") would be solved by a json-template output.

I think the decision must come down to the overall expectations of both templating and json output since both are ultimately affected by this choice. Hope that is vaguely helpful!

josmartin, Oct 22 '22 10:10

Hi @josmartin, sorry I missed this for some reason. I talked with part of the team about this and we haven't come up with a "great" solution here. Everything we've thought of so far just isn't great:

  • Updating all JSON tags to match the structure of the Go structs:
    • might break existing users
    • might not be the JSON structure we want
    • the Go structs might change structure over time but we keep the same JSON structure
    • still doesn't match the Go directly (need to uppercase the names)
  • Adding a json-template (maybe could be called go-struct or something similar)
    • shouldn't be necessary
  • Include the template structure as documentation
    • requires a user to bounce between 3 different things (template, values, docs) to figure out what to do -- even then, might not be clear
    • need to keep this up to date with every release (which we could do by generating the docs)

Any thoughts you have on what would help you the most?

kzantow, Oct 27 '22 20:10

Discussion for dev team: probably the easiest thing to do is refactor the model field names to match the JSON struct tags.

tgerla, Aug 31 '23 20:08

Here's another idea, not sure if it's a good one:

What if we run the templates in the context of the JSON output? In other words, when we execute a template, generate the JSON doc, and then unmarshal it as a map[string]any, and pass that map into the template? That way, the structure of the JSON and the structure of the fields available in the template always and automatically match. Here's a playground demonstrating on a toy JSON document: https://go.dev/play/p/kTTAzn-eS9x

This would break existing templates, so we need to be careful about how to roll it out; maybe a flag that turns on the behavior and a warning that it will be default in 1.0?

There will be a performance hit from marshalling and unmarshalling the JSON, but it would let us have JSON naming on the output and Go naming on the structs, which might be worth it.
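
For readers who cannot open the playground link, here is a minimal sketch of the same round-trip idea. It is not the playground's code and not Grype's implementation: the Artifact type, the renderViaJSON helper, and the sample values are hypothetical. Only the Coordinates fields and tags come from this thread.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"text/template"
)

type Coordinates struct {
	RealPath     string `json:"path"`
	FileSystemID string `json:"layerID"`
}

// Artifact is a hypothetical, simplified stand-in for the real artifact model.
type Artifact struct {
	Name      string        `json:"name"`
	Locations []Coordinates `json:"locations"`
}

// renderViaJSON marshals doc to JSON, unmarshals the result into a generic
// map, and executes the template against that map. Template lookups then
// use the JSON keys rather than the Go field names.
func renderViaJSON(tmplText string, doc any) (string, error) {
	raw, err := json.Marshal(doc)
	if err != nil {
		return "", err
	}
	var data map[string]any
	if err := json.Unmarshal(raw, &data); err != nil {
		return "", err
	}
	t, err := template.New("scan.template").Parse(tmplText)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, data); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	a := Artifact{
		Name:      "openssl",
		Locations: []Coordinates{{RealPath: "/usr/lib/libssl.so", FileSystemID: "sha256:abc"}},
	}
	// The template now uses exactly the names that appear in the JSON output.
	out, err := renderViaJSON("{{.name}}: {{(index .locations 0).path}}", a)
	fmt.Println(out, err)
}
```

One consequence of this approach is that existing templates written against Go names such as .Name or RealPath would stop resolving against the map, which is the breakage mentioned above.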

willmurphyscode, Sep 07 '23 11:09

I like that idea @willmurphyscode -- I don't think the performance would be a concern (it could be, but we're doing stuff like this across Syft and Grype already without issue). The especially nice thing about this is a user would type exactly what the JSON has, no need to uppercase fields and such.

kzantow, Sep 07 '23 12:09