purl-spec

Clarify whether the `type` should be required to be a "known" type or whether it can be an arbitrary field

Open jamietanna opened this issue 1 year ago • 4 comments

I've been playing around with GitHub's Dependency Submission API, which consumes pURL(s).

However, as noted here and in https://github.com/anchore/syft/issues/1622, it appears that GitHub validates the pURL type field and accepts only "known" types, rather than allowing arbitrary ones.

For instance, if we use packageurl-go, the pURL pkg:mix/req@~%3E%200.3 parses correctly:

```go
package main

import (
	"fmt"
	"log"

	"github.com/package-url/packageurl-go"
)

func main() {
	// "mix" is not a registered pURL type; the version is percent-encoded.
	s := "pkg:mix/req@~%3E%200.3"
	fmt.Printf("s: %v\n", s)
	p, err := packageurl.FromString(s)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("p.Name: %v\n", p.Name)
	fmt.Printf("p.Version: %v\n", p.Version)
}
```

Clarification on this issue would allow raising this to GitHub as a defect, if it is one.

jamietanna avatar Jan 15 '24 11:01 jamietanna

~> 0.3 is a version requirement, not a version number. Most PURL parsers should be able to parse this PURL correctly, but it's conceptually invalid because there is no package with that version. See also #139. I'm pretty sure you mean pkg:hex/[email protected].
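To make the requirement-versus-version point concrete, here is a minimal sketch that percent-decodes the version component of the PURL from the issue using only the Go standard library. `splitVersion` and `looksLikeRequirement` are hypothetical helpers written for this illustration; they are not part of packageurl-go or the PURL spec, and the operator list is only a rough heuristic.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// splitVersion is a hypothetical helper, not a full PURL parser.
// It returns the percent-decoded text after the last '@', ignoring
// any qualifiers ('?...') and subpath ('#...').
func splitVersion(purl string) (string, error) {
	if i := strings.IndexByte(purl, '#'); i >= 0 {
		purl = purl[:i]
	}
	if i := strings.IndexByte(purl, '?'); i >= 0 {
		purl = purl[:i]
	}
	i := strings.LastIndexByte(purl, '@')
	if i < 0 {
		return "", nil // no version component
	}
	return url.PathUnescape(purl[i+1:])
}

// looksLikeRequirement is a heuristic (an assumption for this sketch):
// a version that starts with a range operator is probably a requirement.
func looksLikeRequirement(v string) bool {
	for _, op := range []string{"~>", "~=", ">=", "<=", "==", ">", "<", "^"} {
		if strings.HasPrefix(v, op) {
			return true
		}
	}
	return false
}

func main() {
	v, err := splitVersion("pkg:mix/req@~%3E%200.3")
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	// The decoded value begins with "~>", so it is flagged as a requirement.
	fmt.Printf("version=%q requirement=%v\n", v, looksLikeRequirement(v))
}
```

A tool that emits PURLs could run a check like this before submission and resolve the requirement to a concrete version first.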

If you're generating valid PURLs that aren't known types, assuming the other PURL implementations are correct, there are four possible outcomes:

  1. The tool passes the PURL through unchanged.
  2. The tool parses the PURL and rejects it because it needs to understand the type but doesn't.
  3. The tool happens to have the same type but with an incompatible interpretation, and produces unexpected results.
  4. The tool happens to have the same type with a compatible interpretation and works as expected.

None of these outcomes would be a bug in the PURL implementation, but 3 and 4 are dangerous for unknown types because the correct behavior is unspecified. PURL cannot mandate how all implementations are going to behave when encountering unexpected inputs, and unexpected inputs should be expected because new types are added over time. The correct behavior depends on what is being done with the PURLs.

I think only outcome 3 would ever be invalid when dealing with any known types because it's not feasible for every implementation to support every package type equally. If GitHub wants to resolve the PURL to a version of a package and then the version of a package to a GitHub repository, that can only be done with a subset of the known types.

matt-phylum avatar Jan 15 '24 14:01 matt-phylum

Thanks Matt.

Yes that's correct - that was a mistake on my part.

However, I do see the same issue with:

invalid package url: in manifest "github/dagger/dagger" decoding "pkg:mix/[email protected]": invalid package url type: mix

That's interesting, thanks for sharing!

So I guess in this case, as a first step, I should map the type to a known type, if possible.

And then in the case that I can't, GitHub are within their rights to reject any types that they don't support?
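A minimal sketch of that mapping step, assuming a small hand-maintained alias table: the `mix` to `hex` entry comes from this thread, while `typeAliases`, `normalizeType`, and the version `0.4.8` in the example are hypothetical and only for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// typeAliases maps tool or build-system names to registered PURL types.
// Only the mix -> hex entry is taken from the discussion; extend as needed.
var typeAliases = map[string]string{
	"mix": "hex",
}

// normalizeType rewrites the type component when an alias is known and
// returns the input unchanged otherwise. It does not validate the PURL.
func normalizeType(purl string) string {
	const scheme = "pkg:"
	if !strings.HasPrefix(purl, scheme) {
		return purl
	}
	rest := purl[len(scheme):]
	i := strings.IndexByte(rest, '/')
	if i <= 0 {
		return purl
	}
	if known, ok := typeAliases[strings.ToLower(rest[:i])]; ok {
		return scheme + known + rest[i:]
	}
	return purl
}

func main() {
	// The version below is a made-up example value.
	fmt.Println(normalizeType("pkg:mix/req@0.4.8")) // pkg:hex/req@0.4.8
}
```

Types without an alias are left untouched, so the consumer still decides whether to accept or reject them.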

jamietanna avatar Jan 15 '24 14:01 jamietanna

Yeah. I think it would be appropriate for GitHub to document what happens if you pass in a type that they don't support, but for the PURL spec to specify that all implementations must reject PURLs that aren't understood would cause problems for tools that process SBOMs, and for the PURL spec to specify that all implementations must accept PURLs that aren't understood would cause problems for tools that need to be able to resolve PURLs.

matt-phylum avatar Jan 15 '24 15:01 matt-phylum

Let's document this. I think there is no reason to reject new types with the caveat that they may be harmless unless properly documented in the types spec doc.

pombredanne avatar Mar 08 '24 15:03 pombredanne

FYI GitHub has recently relaxed their strict validation on pURL types

jamietanna avatar Apr 13 '25 09:04 jamietanna