
Should we support a leading v in golang packages?

Open TG1999 opened this issue 1 year ago • 8 comments

If we look at Go packages like https://github.com/go-jose/go-jose/archive/refs/tags/v4.0.1.zip and https://pkg.go.dev/github.com/go-jose/go-jose/v3?tab=versions, their versions have a leading v. In osv.dev, however, versions are stored without a leading v: https://osv.dev/vulnerability/GHSA-c5q2-7r4c-mv6g.

So how should we store this as a purl?

pkg:golang/github.com/go-jose/go-jose/[email protected] or

pkg:golang/github.com/go-jose/go-jose/[email protected] ?

TG1999 avatar Mar 08 '24 09:03 TG1999

I think the v is part of the version in Go and needs to be present. https://github.com/golang/go/issues/32945 If purls were written without the v and then Go started doing something different, it would break all purl implementations.

matt-phylum avatar Mar 08 '24 13:03 matt-phylum

@TG1999 @matt-phylum go is a mess in this domain. :smiling_imp:

I'm inclined to accept the versions as they are, with their v prefix, but then they are no longer the semver versions that Go modules promised, short of stripping the leading v.

So we need to agree on a canonical way (and document this preferred canonical way in the types doc) and also accept that, unfortunately, tools will have to deal with both prefixed and unprefixed versions, and will need to strip the prefix to compare versions properly in all cases and also to query some databases.
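To make the "strip the prefix to compare" step concrete, here is a minimal sketch in Go. The function names `stripV` and `sameVersion` are hypothetical and not part of any purl library. It assumes a leading v counts as a prefix only when a digit follows it, so that other Git references starting with "v" are left alone.

```go
package main

import "fmt"

// stripV removes one leading "v" from a version string, but only when the
// next character is a digit (assumption: "v4.0.1" is a prefixed version,
// while something like "vendor-branch" is not a version at all).
func stripV(version string) string {
	if len(version) > 1 && version[0] == 'v' && version[1] >= '0' && version[1] <= '9' {
		return version[1:]
	}
	return version
}

// sameVersion reports whether two version strings are equal once an
// optional leading "v" is ignored. It does not do semver ordering.
func sameVersion(a, b string) bool {
	return stripV(a) == stripV(b)
}

func main() {
	fmt.Println(stripV("v4.0.1"))               // 4.0.1
	fmt.Println(sameVersion("v4.0.1", "4.0.1")) // true
}
```

This only covers equality after normalization. Ordering two versions would still need a real semver comparison on top of it.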

This is done here for instance https://github.com/package-url/purl-spec/blob/master/PURL-TYPES.rst#golang

pombredanne avatar Mar 08 '24 14:03 pombredanne

The Go section of the spec is in dire need of updates. The version and subpath stuff there implies that it's talking about Go packages and Go modules aren't supported. However, if the version specification were relaxed to a Git reference instead of a commit ID (truncated to an unspecified length), then the v must be included because the v is part of the tag name and is significant to Git. Commit IDs and other Git references are not versions and cannot be compared, but in Go if a tag begins with v and contains a valid version number then it is a comparable version.

It looks like since Go started using modules, the examples need to be updated:

  • pkg:golang/github.com/gorilla/[email protected]??????????????-234fd47e07d1 This example is currently invalid because the commit 234fd47e07d1004f0aed9c does not exist in the repository. The question marks are supposed to be the commit timestamp.
  • pkg:golang/google.golang.org/genproto/googleapis/api#annotations googleapis/api is part of the module name and cannot be in the subpath.
  • pkg:golang/github.com/gorilla/[email protected]??????????????-234fd47e07d1#api This example is invalid because the commit does not exist and neither does the specified subpath.

Proper examples with versions:

  • pkg:golang/[email protected] This example trips up implementations that incorrectly handle namespace+name. The namespace is optional because not all Go module names contain slashes.
  • pkg:golang/golang.org/x/[email protected]#context This example refers to a version that predates modules. It would have previously been given as something like pkg:golang/golang.org/x/net@cf3bd585ca2a#context.

matt-phylum avatar Mar 08 '24 15:03 matt-phylum

Another confusion for Go: is it all a name, or does it have a namespace?

prabhu avatar Mar 21 '24 08:03 prabhu

The spec is clear that Go packages have PURL namespaces, even if the concept does not exist in Go. What's missing is that Go packages only sometimes have PURL namespaces because not all Go package IDs contain slashes.

matt-phylum avatar Mar 21 '24 12:03 matt-phylum

I guess the problem with Go (and NPM) packages is that even if your PURL implementation is correct, it's up to the application to correctly handle this namespace/name split and join translation, and users are unlikely to read the spec when they have the library to handle that for them. Maybe slashes in the names of Go packages should be forbidden, to stop users from unknowingly doing the wrong thing. Because of the way the names work, it shouldn't be possible for slashes in the name to acquire some other meaning that would need to be accepted later.

matt-phylum avatar Mar 21 '24 12:03 matt-phylum

@matt-phylum, we should fix the purl spec for go IMHO. Go was the only team with some reservations during the last IETF submission if my memory is correct.

prabhu avatar Mar 21 '24 15:03 prabhu

#204 might be the way to go. Combine the namespace and name into one value at the PURL level, don't encode slashes¹, and leave it up to the package type how to interpret it.

Go would change from putting segments [0,n-1) of the Go package ID (split by /) in the PURL namespace and segment n-1 in the PURL name, to the PURL name and the Go name being equal. Ergonomics are improved because the user no longer needs to split and join.
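For reference, the split and join that applications currently have to do for Go can be sketched as follows. The helper names `splitModulePath` and `joinModulePath` are hypothetical, and `example.com` is only an illustrative slash-free name. The sketch assumes the split happens at the last slash, with an empty namespace when there is no slash.

```go
package main

import (
	"fmt"
	"strings"
)

// splitModulePath splits a Go module path into a PURL namespace (all
// segments but the last) and a PURL name (the last segment). A path
// without a slash yields an empty namespace.
func splitModulePath(modulePath string) (namespace, name string) {
	i := strings.LastIndex(modulePath, "/")
	if i < 0 {
		return "", modulePath
	}
	return modulePath[:i], modulePath[i+1:]
}

// joinModulePath is the inverse: it rebuilds the Go module path from the
// PURL namespace and name.
func joinModulePath(namespace, name string) string {
	if namespace == "" {
		return name
	}
	return namespace + "/" + name
}

func main() {
	ns, name := splitModulePath("github.com/go-jose/go-jose/v3")
	fmt.Printf("%q %q\n", ns, name) // "github.com/go-jose/go-jose" "v3"
	fmt.Println(joinModulePath(ns, name))
}
```

Note that for a module with a major version suffix, the PURL name ends up being just `v3`, which is part of why the split is easy to get wrong.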

NPM would change from the NPM namespace in the PURL namespace and the NPM name in the PURL name to the full package name in the PURL name. NPM does have namespaces, but most of the time you don't need to be aware of them and just use the full package name, and it would be possible to do the same with PURL. Ergonomics are improved because the user no longer needs to split and join.

Maven would change from the Maven group ID in the PURL namespace and the Maven artifact ID in the PURL name to "<group ID>/<artifact ID>" in the PURL name. In this case, the ergonomics are worse because the splitting and joining is left up to the user. Maven tools don't typically specify packages this way.

Rewriting the spec this way shouldn't change the representation of any packages, so even though it would be a breaking API change for libraries, it wouldn't be a breaking change for the ecosystem and we wouldn't need to migrate everything to an incompatible PURL2 or deal with Go PURLs that are full of %2F escapes.

¹ Is this alone okay? For URL, the path segments are tricky. If you use a normal URL parser and ask for the full path of the URL, it needs to give you the path without fully percent decoding it in case / vs %2F is a meaningful distinction (eg it's a route parameter character, not a path segment delimiter). Separating the segments is supposed to happen before decoding. For PURL, as long as none of the existing package types have valid packages where the current name-without-namespace field is expected to contain a slash, and we don't expect package types to add such a requirement later, it should be safe for the library to return a single decoded name string.
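To illustrate the ordering concern in this footnote, here is a small sketch using Go's standard `net/url` package. The function name `decodeSegments` and the input `a%2Fb/c` are made up for illustration. It splits on / first and percent-decodes each segment afterwards, so an encoded %2F stays inside its segment instead of becoming a delimiter.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// decodeSegments splits a raw (still percent-encoded) path on "/" and only
// then decodes each segment, so "%2F" never turns into a segment delimiter.
func decodeSegments(rawPath string) ([]string, error) {
	parts := strings.Split(rawPath, "/")
	segments := make([]string, 0, len(parts))
	for _, p := range parts {
		s, err := url.PathUnescape(p)
		if err != nil {
			return nil, err
		}
		segments = append(segments, s)
	}
	return segments, nil
}

func main() {
	segs, err := decodeSegments("a%2Fb/c")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", segs) // ["a/b" "c"]
}
```

Decoding the whole string first and splitting afterwards would give three segments here instead of two, which is the distinction that is lost when a library returns a single decoded string.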

matt-phylum avatar Mar 21 '24 19:03 matt-phylum

See also #308, which proposes deprecating the golang type for various reasons and publishing a new go type definition, since making these changes to golang itself would be breaking.

jkowalleck avatar Mar 19 '25 09:03 jkowalleck