
Expand vuln id relationships

Open joshbuker opened this issue 3 years ago • 9 comments

Allow not just aliases (this ID is also known as...) and related IDs (this ID is different from, but related to...), but also more nuanced relationships such as parent/child.

@JLLeitschuh what would be the most useful relationship types to include from a researcher's perspective?

joshbuker avatar Jun 28 '22 19:06 joshbuker

Relationships off the top of my head:

  • Exists because of an insufficient fix for another ID
  • Similar to
  • Inspired by

JLLeitschuh avatar Jul 02 '22 23:07 JLLeitschuh

Maybe also add AND/OR/NOT relations, to describe dependencies among packages:

  • This software is vulnerable if the package is within a range AND is running on this OS OR this OS
  • This OS is vulnerable if package A OR package B are present, but only if certain patch/patches are NOT present

pereyra-m avatar Jul 29 '22 16:07 pereyra-m

Hi @chrisbloom7 ! I'm adding you here because I received a comment from you in another issue I opened, and I'd like to hear your opinion about this.

I'm using this issue instead of opening a new one because I think we're talking about the same thing: how can I express in OSV the parent/child relation mentioned above? I suggest some kind of AND/OR/NOT combination.

For example, this vulnerability https://nvd.nist.gov/vuln/detail/CVE-2021-33500

Even if I properly describe the range for putty (fixed in 0.75), the vulnerability is only exploitable when putty is running on Windows. This creates a dependency that could also have a range of its own (Windows from X to Y, etc.).

This situation can't be covered easily with the algorithm proposed in https://ossf.github.io/osv-schema/#evaluation.
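
One way to picture the AND/OR/NOT idea is as a small condition tree that gets evaluated against a description of the host. The following is only a sketch under stated assumptions: the node shapes (`all_of`, `any_of`, `not`, `package_below`, `os_is`) and the `env` dictionary are hypothetical names invented for illustration, and none of them are part of the OSV schema or its evaluation algorithm.

```python
from typing import Any


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version such as "0.75" into (0, 75) for comparison."""
    return tuple(int(part) for part in text.split("."))


def evaluate(node: dict[str, Any], env: dict[str, Any]) -> bool:
    """Recursively evaluate a hypothetical condition tree against an environment."""
    kind = node["op"]
    if kind == "all_of":  # AND: every child must hold
        return all(evaluate(child, env) for child in node["args"])
    if kind == "any_of":  # OR: at least one child must hold
        return any(evaluate(child, env) for child in node["args"])
    if kind == "not":  # NOT: the single child must not hold
        return not evaluate(node["arg"], env)
    if kind == "package_below":  # package installed and older than the fixed version
        installed = env["packages"].get(node["name"])
        return installed is not None and parse_version(installed) < parse_version(node["fixed"])
    if kind == "os_is":
        return env["os"] == node["name"]
    raise ValueError(f"unknown condition type: {kind}")


# CVE-2021-33500 as described above: putty before 0.75, and only on Windows.
PUTTY_CONDITION = {
    "op": "all_of",
    "args": [
        {"op": "package_below", "name": "putty", "fixed": "0.75"},
        {"op": "os_is", "name": "Windows"},
    ],
}

# Hypothetical hosts used only to exercise the sketch.
windows_host = {"os": "Windows", "packages": {"putty": "0.74"}}
linux_host = {"os": "Linux", "packages": {"putty": "0.74"}}

print(evaluate(PUTTY_CONDITION, windows_host))
print(evaluate(PUTTY_CONDITION, linux_host))
```

With a tree like this, the OS requirement could itself carry a range by swapping the `os_is` leaf for another version-aware leaf, without changing the evaluator.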

pereyra-m avatar Aug 01 '22 15:08 pereyra-m

@oliverchang Bumping this as it affects the GSD process for revoking duplicate IDs as the data gets normalized (How should duplicate IDs be handled?)

This wouldn't be a backward compatible change, but to be honest, related without a type association is fairly useless. It's unclear both to tooling and humans what the relationship actually means other than "go look at this thing and see if it matters to you".

joshbuker avatar Mar 29 '23 07:03 joshbuker

https://github.com/cloudsecurityalliance/gsd-tools/pull/197

joshbuker avatar Mar 29 '23 18:03 joshbuker

Could you talk a bit more about the use case for expanding on this?

It's not clear that this helps any automated tools in any way. If so, then is this just providing more details to humans? How will a human make use of this? Do these categories provide enough context such that they don't have to do a significant amount of work understanding the linked IDs anyway?

The categories proposed in #133 are also slightly confusing, in that there's overlap with aliases as well, with the DUPLICATED* ones.

oliverchang avatar Mar 29 '23 22:03 oliverchang

The primary one of relevance is DUPLICATED_BY / DUPLICATE_OF, which is relevant for machines and humans, see: https://github.com/cloudsecurityalliance/gsd-tools/discussions/194 for context.

Presumably aliases are for...well, aliases: IDs that refer to the exact same thing, as opposed to related IDs. For example, GSD-2021-44228 would have an alias of CVE-2021-44228. Or GHSA-5r3x-p7xx-x6q5 would have aliases of CVE-2023-28631 and GHSL-2023-049.

Whereas something that gets withdrawn and replaced with another would have no aliases, and a related entry of "DUPLICATE_OF" with the ID of the replacement ID.

EDIT: In retrospect, you could actually consolidate aliases into the related field, with a type of ALIAS.
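
To illustrate why a typed DUPLICATE_OF link could matter to tooling, here is a minimal sketch of following such links from a withdrawn ID to its replacement. It assumes a `related` field whose entries carry a `type`, which OSV does not define today; the `EXAMPLE-0001` record is made up, and `resolve_canonical` is a hypothetical helper, not an existing API.

```python
# Hypothetical records. "related" entries with a "type" are the proposal
# discussed in this thread, not the current OSV schema.
RECORDS = {
    "GSD-2021-44228": {"aliases": ["CVE-2021-44228"], "related": []},
    # Made-up withdrawn entry: no aliases, one typed link to its replacement.
    "EXAMPLE-0001": {
        "aliases": [],
        "related": [{"id": "GSD-2021-44228", "type": "DUPLICATE_OF"}],
    },
}


def resolve_canonical(vuln_id: str, records: dict) -> str:
    """Follow DUPLICATE_OF links until an ID with no such link is reached."""
    seen = set()
    current = vuln_id
    while current not in seen:
        seen.add(current)
        record = records.get(current)
        if record is None:
            return current  # unknown ID: nothing more to follow
        targets = [
            entry["id"]
            for entry in record.get("related", [])
            if entry.get("type") == "DUPLICATE_OF"
        ]
        if not targets:
            return current  # not a duplicate of anything else
        current = targets[0]
    raise ValueError(f"DUPLICATE_OF cycle detected starting at {vuln_id}")


print(resolve_canonical("EXAMPLE-0001", RECORDS))
```

With an untyped `related` list, the same tool could not tell a replacement apart from a merely similar entry, which is the gap described above.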

joshbuker avatar Mar 29 '23 22:03 joshbuker

Was chatting with @kurtseifried while working on https://data.gsd.id, and for relationships it would be valuable to have:

{
  "id": string (required)
  "type": string (required)
  "canonical_url": string (optional)
  "reference_urls": Array[string] (optional)
}

Format provided as an example / skeleton to work from.
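
As a rough illustration of how a consumer might model this skeleton, the sketch below maps the four fields onto a Python dataclass and rejects entries that lack the required ones. The class name `Relationship` and the `from_dict` helper are assumptions made for this example; only the field names and their required/optional status come from the skeleton above.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Relationship:
    """One relationship entry, mirroring the skeleton's four fields."""

    id: str  # required
    type: str  # required
    canonical_url: Optional[str] = None  # optional
    reference_urls: list[str] = field(default_factory=list)  # optional

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "Relationship":
        """Build an entry from parsed JSON, enforcing the required fields."""
        for key in ("id", "type"):
            value = data.get(key)
            if not isinstance(value, str) or not value:
                raise ValueError(f"'{key}' is required and must be a non-empty string")
        return cls(
            id=data["id"],
            type=data["type"],
            canonical_url=data.get("canonical_url"),
            reference_urls=list(data.get("reference_urls", [])),
        )


# The ALIAS type follows the consolidation idea floated earlier in the thread.
entry = Relationship.from_dict({"id": "CVE-2021-44228", "type": "ALIAS"})
print(entry)
```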

joshbuker avatar Apr 03 '23 21:04 joshbuker

I think the type data should be an enum, e.g.:

  • DUPLICATE_OF
  • DUPLICATED_BY
  • CAUSED_BY (e.g. when they fix something and it triggers a new vuln)
  • INCOMPLETE_FIX_FOR (e.g. Shellshock's train of fixes)
  • COMMON_NAME (e.g. #log4shell)

I suspect one is enough for almost all cases; the above options are very much exclusive of each other, with the exception of COMMON_NAME.
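
A minimal sketch of that enum, with a check for the exclusivity rule, might look like the following. It reads the rule as "at most one type per link, plus an optional COMMON_NAME", which is only one possible interpretation; `RelationshipType` and `check_types` are hypothetical names for this example.

```python
from enum import Enum


class RelationshipType(Enum):
    """The relationship types proposed above."""

    DUPLICATE_OF = "DUPLICATE_OF"
    DUPLICATED_BY = "DUPLICATED_BY"
    CAUSED_BY = "CAUSED_BY"
    INCOMPLETE_FIX_FOR = "INCOMPLETE_FIX_FOR"
    COMMON_NAME = "COMMON_NAME"


def check_types(type_names: list[str]) -> list[RelationshipType]:
    """Parse type names, allowing at most one type besides COMMON_NAME."""
    # RelationshipType(name) raises ValueError for names outside the enum.
    parsed = [RelationshipType(name) for name in type_names]
    exclusive = [t for t in parsed if t is not RelationshipType.COMMON_NAME]
    if len(exclusive) > 1:
        names = ", ".join(t.value for t in exclusive)
        raise ValueError(f"mutually exclusive types used together: {names}")
    return parsed


print(check_types(["INCOMPLETE_FIX_FOR", "COMMON_NAME"]))
```

Using a closed enum rather than a free-form string lets tooling reject unknown values early, at the cost of needing a schema change for every new relationship type.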

kurtseifried avatar Apr 03 '23 21:04 kurtseifried