
[Docs] Improve documentation on intended usage of `header.url`

Open trumant opened this issue 11 months ago • 7 comments

Overview

https://github.com/search?q=language%3AYAML+path%3Asecurity-insights+%22schema-version%3A+2.0.0%22&type=code shows me 24 insights files using the v2 schema.

Reviewing the usage of header.url across those samples, I observed that the majority of these projects use the field to store their project URL, e.g. values like https://carabiner.dev/projects/bnd and https://github.com/open-telemetry. The remainder use the field to store a URL pointing to an insights file.
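
For anyone who wants to repeat this review, here is a rough sketch of how the two usages could be told apart. It rests on an assumption that is not part of the spec: a header.url value is treated as pointing at an insights file only when its last path segment is security-insights.yml or security-insights.yaml. The function name and the example.com URL are hypothetical.

```python
from urllib.parse import urlparse

# Assumption: insights files are recognised purely by file name.
INSIGHTS_FILENAMES = ("security-insights.yml", "security-insights.yaml")


def classify_header_url(url: str) -> str:
    """Return 'insights-file' if the URL's last path segment is an
    insights file name, otherwise 'project'."""
    path = urlparse(url).path.lower()
    last_segment = path.rstrip("/").rsplit("/", 1)[-1]
    return "insights-file" if last_segment in INSIGHTS_FILENAMES else "project"


samples = [
    "https://carabiner.dev/projects/bnd",
    "https://github.com/open-telemetry",
    "https://example.com/org/repo/security-insights.yml",  # hypothetical value
]
for url in samples:
    print(classify_header_url(url), url)
```

A file-name heuristic like this will miss insights files served under other names, so it is only a first pass before reading the values by hand.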

Existing documentation and what it may or may not mean

The primary reference URL for this schema’s origin or repository.

  • As the type of the field defined in the schema is URL and the doc mentions a URL, it's clear that it should hold a valid https/TLS URL
  • "reference" isn't adding much here, perhaps we drop it
  • "this schema" - is the intention here self-reference? An instance of the schema is a security-insights.yml file, so is the URL value meant to refer to the schema defining the data present in the file? If that's the case, should the doc say something like "The URL of the schema used by your project's security-insights.yml file, typically https://github.com/ossf/security-insights-spec/releases/download/v2.0.0/schema.cue or https://github.com/ossf/security-insights-spec/blob/v2.0.0/schema.cue"?
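
The one unambiguous part of the current wording, that the value should be an https URL, can at least be checked mechanically. The sketch below is an illustration only and is not how the schema itself validates the field; the function name and the example.com URL are hypothetical.

```python
from urllib.parse import urlparse


def is_https_url(value: str) -> bool:
    """True when value parses as an absolute URL with the https scheme and a host."""
    parsed = urlparse(value)
    return parsed.scheme == "https" and bool(parsed.netloc)


print(is_https_url("https://github.com/ossf/security-insights-spec/blob/v2.0.0/schema.cue"))
print(is_https_url("http://example.com/security-insights.yml"))  # hypothetical value
```

A check like this says nothing about what the URL should point to, which is the part the documentation leaves open.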

How do we know the issue has been resolved?

  • [ ] we improve our documentation and examples to provide clearer guidance on appropriate values for the field
  • [ ] we observe changes to the values of header.url across existing v2 security-insights.yml in the community

trumant avatar Apr 10 '25 13:04 trumant

@funnelfiasco @jmeridth @jkjell any opinions here on how we can tighten up our intended usage here?

trumant avatar Apr 16 '25 03:04 trumant

Agreed that the phrasing is wonky here, thanks for catching.

IIRC, the intent is to avoid situations where an SI file is extracted from its source and it loses value due to missing context.

I'm not sure we have ever or could ever encounter that problem, but I have heard that as a pain point for SBOM handling, when SBOMs are pulled out and sent to analysis tools without context.

eddie-knight avatar Apr 16 '25 12:04 eddie-knight

I think I would personally prefer dropping repository. Storing the SI file in something like Guac, Archivista, an ecosystem package repository (NPM, PyPI, RubyGems, etc) or a generic artifact repository makes sense (and I think it is being actively thought about by folks like @mlieberman85). It also allows establishing trust in the SI file separate from trust in the source code -- the URL could point to something like TUF metadata stored via a project like RSTUF.

Additionally, repo URLs are covered: https://github.com/ossf/security-insights-spec/blob/main/specification/project.md#projectrepositories

jkjell avatar Apr 16 '25 13:04 jkjell

I think I would personally prefer dropping repository.

I'm a bit confused by this part. Did you mean to suggest dropping header.url or the entire repository object?

eddie-knight avatar Apr 16 '25 13:04 eddie-knight

@eddie-knight sorry. I just meant from the definition of header.url:

The primary reference URL for this schema’s origin or repository.

to something like 👇 (left repository portion struck out for the part I was referring to):

The URL for this Security Insights' original location ~~or repository~~.

jkjell avatar Apr 16 '25 13:04 jkjell

Here I go again suggesting breaking changes. header.url is self-evidently the project's URL and no amount of documentation will get people to reliably use it as intended.

In an ideal world where we don't have to worry about breaking changes, I'd suggest renaming the field to header.si-url, along with a wording change to the docs:

The original URL for this Security Insights file.

(I'm inspired by @jkjell's suggestion, but I tweaked it a bit since using "location" with "URL" feels redundant. I almost went with "canonical" or "authoritative" instead of "original", but that felt like I was just using big words for the sake of it. If it captures the intent more clearly, though, we should use one of those words)
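
If a rename like this ever happened, moving existing files over would be a small mechanical step. Here is a minimal sketch, assuming the insights file has already been parsed into a plain dict; header.si-url is only the name proposed in this comment and does not exist in any released schema, and the function name and example.com URL are hypothetical.

```python
import copy


def rename_header_url(insights: dict) -> dict:
    """Return a copy with header['url'] moved to header['si-url'].

    'si-url' is a proposed field name, not part of any released schema.
    """
    migrated = copy.deepcopy(insights)
    header = migrated.get("header", {})
    # Only move the value when there is something to move and no clash.
    if "url" in header and "si-url" not in header:
        header["si-url"] = header.pop("url")
    return migrated


doc = {
    "header": {
        "schema-version": "2.0.0",
        "url": "https://example.com/org/repo/security-insights.yml",  # hypothetical
    }
}
print(rename_header_url(doc))
```

This only handles the rename itself; files that currently hold a project URL in header.url would still carry the wrong kind of value under the new name.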

funnelfiasco avatar Apr 16 '25 14:04 funnelfiasco

The next opportunity for breaking changes will be if we want to release a v3 at EOY. In the meantime, I'm in favor of the documentation suggestions here.

eddie-knight avatar Apr 16 '25 14:04 eddie-knight