purl-spec
Clarify whether the `type` should be required to be a "known" type or whether it can be an arbitrary field
I've been playing around with GitHub's Dependency Submission API, which consumes pURL(s).
However, as noted here and in https://github.com/anchore/syft/issues/1622, it appears that GitHub validates the pURL `type` field and only accepts "known" types, rather than allowing arbitrary ones.
For instance, if we use packageurl-go, the pURL `pkg:mix/req@~%3E%200.3` parses correctly:
package main

import (
	"fmt"
	"log"

	"github.com/package-url/packageurl-go"
)

func main() {
	// A pURL with the type "mix" and a percent-encoded version.
	s := "pkg:mix/req@~%3E%200.3"
	fmt.Printf("s: %v\n", s)

	// FromString parses the pURL string into its components.
	p, err := packageurl.FromString(s)
	if err != nil {
		log.Fatal(err)
	}

	fmt.Printf("p.Name: %v\n", p.Name)
	fmt.Printf("p.Version: %v\n", p.Version)
}
Clarification on this issue would allow raising this to GitHub as a defect, if it is one.
`~> 0.3` is a version requirement, not a version number. Most PURL parsers should be able to parse this PURL correctly, but it's conceptually invalid because there is no package with that version. See also #139. I'm pretty sure you mean pkg:hex/[email protected].
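To make the point about the version concrete, here is a minimal sketch using only the Go standard library. It percent-decodes the version component from the PURL in the original report and applies a rough check for constraint operators. The helper names `decodeVersion` and `looksLikeRequirement` are made up for illustration, and the operator list is a heuristic assumption, not something the PURL spec defines.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// decodeVersion percent-decodes the raw version component of a PURL.
func decodeVersion(raw string) (string, error) {
	return url.PathUnescape(raw)
}

// looksLikeRequirement is a hypothetical heuristic, not part of the PURL spec:
// it flags versions that start with a common constraint operator.
func looksLikeRequirement(version string) bool {
	v := strings.TrimSpace(version)
	for _, op := range []string{"~>", "~=", ">=", "<=", "==", ">", "<", "^"} {
		if strings.HasPrefix(v, op) {
			return true
		}
	}
	return false
}

func main() {
	// The raw version taken from pkg:mix/req@~%3E%200.3.
	v, err := decodeVersion("~%3E%200.3")
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Printf("version=%q requirement=%v\n", v, looksLikeRequirement(v))
}
```

A producer could run a check like this before emitting a PURL and substitute the resolved version from its lock file instead.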
If you're generating valid PURLs that aren't known types, assuming the other PURL implementations are correct, there are four possible outcomes:
1. The tool passes the PURL through unchanged.
2. The tool parses the PURL and rejects it because it needs to understand the type but doesn't.
3. The tool happens to have the same type, but with an incompatible interpretation, and produces unexpected results.
4. The tool happens to have the same type with a compatible interpretation and works as expected.
None of these outcomes would be a bug in the PURL implementation, but 3 and 4 are dangerous for unknown types because the correct behavior is unspecified. PURL cannot mandate how all implementations are going to behave when encountering unexpected inputs, and unexpected inputs should be expected because new types are added over time. The correct behavior depends on what is being done with the PURLs.
I think only outcome 3 would ever be invalid when dealing with any known types because it's not feasible for every implementation to support every package type equally. If GitHub wants to resolve the PURL to a version of a package and then the version of a package to a GitHub repository, that can only be done with a subset of the known types.
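The difference between passing a PURL through and rejecting it can be sketched as a policy choice made by the consuming tool. This is an illustration only: the `Policy` type, the `supported` table, and the `accept` function are assumptions made up for this example, the type extraction is deliberately simplified, and the version in the sample PURL is a placeholder.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Policy describes what a hypothetical consumer does with PURL types.
type Policy int

const (
	PassThrough Policy = iota // e.g. an SBOM processor that only stores or forwards PURLs
	MustResolve               // e.g. a service that has to look the package up
)

// ErrUnsupportedType is returned when a resolving consumer meets an unknown type.
var ErrUnsupportedType = errors.New("unsupported package url type")

// supported is an illustrative subset, not the list any real service uses.
var supported = map[string]bool{"hex": true, "npm": true, "golang": true}

// purlType extracts the lowercased type from "pkg:type/...".
// Simplified: it does not handle every form the spec allows.
func purlType(purl string) (string, error) {
	if !strings.HasPrefix(purl, "pkg:") {
		return "", errors.New("missing pkg: scheme")
	}
	rest := strings.TrimPrefix(purl, "pkg:")
	i := strings.Index(rest, "/")
	if i <= 0 {
		return "", errors.New("missing type")
	}
	return strings.ToLower(rest[:i]), nil
}

// accept applies the policy: pass-through consumers keep unknown types,
// resolving consumers reject them.
func accept(purl string, p Policy) error {
	t, err := purlType(purl)
	if err != nil {
		return err
	}
	if p == MustResolve && !supported[t] {
		return fmt.Errorf("%w: %s", ErrUnsupportedType, t)
	}
	return nil
}

func main() {
	purl := "pkg:mix/req@1.0.0" // placeholder version for illustration
	fmt.Println("pass-through:", accept(purl, PassThrough))
	fmt.Println("must-resolve:", accept(purl, MustResolve))
}
```

Neither policy is wrong on its own; which one fits depends on what the tool does with the PURL afterwards.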
Thanks Matt.
Yes that's correct - that was a mistake on my part.
However, I do see the same issue with:
invalid package url: in manifest "github/dagger/dagger" decoding "pkg:mix/[email protected]": invalid package url type: mix
That's interesting, thanks for sharing!
So I guess in this case, as a first step, I should map the type to a known type, if possible.
And then in the case that I can't, GitHub are within their rights to reject any types that they don't support?
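A producer-side mapping step along these lines could look like the sketch below. The `typeAliases` table and `normalizeType` function are hypothetical and hand-maintained for this example; the only mapping shown is the `mix` to `hex` one suggested earlier in this thread, and the versions in `main` are placeholders.

```go
package main

import (
	"fmt"
	"strings"
)

// typeAliases is a hypothetical, hand-maintained table that maps
// tool-specific type names to types the consumer is known to accept.
var typeAliases = map[string]string{
	"mix": "hex", // Mix is Elixir's build tool; its registry packages come from Hex
}

// normalizeType rewrites the type of a PURL string when an alias is known.
// It returns the (possibly rewritten) PURL and whether it was changed.
func normalizeType(purl string) (string, bool) {
	rest := strings.TrimPrefix(purl, "pkg:")
	if rest == purl {
		return purl, false // no pkg: scheme, leave untouched
	}
	i := strings.Index(rest, "/")
	if i <= 0 {
		return purl, false // no type segment found
	}
	known, ok := typeAliases[strings.ToLower(rest[:i])]
	if !ok {
		return purl, false // no alias; the caller decides what to do next
	}
	return "pkg:" + known + rest[i:], true
}

func main() {
	for _, p := range []string{"pkg:mix/req@1.0.0", "pkg:npm/left-pad@1.3.0"} {
		out, changed := normalizeType(p)
		fmt.Printf("%s -> %s (changed=%v)\n", p, out, changed)
	}
}
```

A mapping like this is only sound when the dependency really comes from the target ecosystem's registry. Mix can also pull dependencies from git or a local path, and those would need separate handling.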
Yeah. I think it would be appropriate for GitHub to document what happens if you pass in a type that they don't support. But if the PURL spec required all implementations to reject PURLs they don't understand, that would cause problems for tools that process SBOMs, and if it required all implementations to accept them, that would cause problems for tools that need to be able to resolve PURLs.
Let's document this. I think there is no reason to reject new types with the caveat that they may be harmless unless properly documented in the types spec doc.
FYI GitHub has recently relaxed their strict validation on pURL types