packageurl-js

Invalid purl passes as valid

Open surendrapathak opened this issue 1 year ago • 2 comments

The PURL 'pkg:maven/org.apache.commons:io@1.3.4' passes as valid:

p.PackageURL.fromString('pkg:maven/org.apache.commons:io@1.3.4')
PackageURL {
  type: 'maven',
  name: 'org.apache.commons:io',
  namespace: null,
  version: '1.3.4',
  qualifiers: null,
  subpath: null
}

However, it shouldn't pass, because the namespace is not percent-encoded.

surendrapathak avatar Mar 15 '24 01:03 surendrapathak

If you .toString() it then it's encoded:

p.PackageURL.fromString('pkg:maven/org.apache.commons:io@1.3.4').toString()
// => 'pkg:maven/org.apache.commons%3Aio@1.3.4'

jdalton avatar May 17 '24 20:05 jdalton

Hi @jdalton - The concern is that a PURL such as:

pkg:maven/org.apache.commons:io@1.3.4

should be marked invalid for violating the namespace rule: [Screenshot 2024-05-19 at 2 20 52 PM]

However, because packageurl-js parses these successfully, they show up in SBOMs as valid entries, causing identification issues.
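A caller who needs this strictness today could pre-check the raw namespace before handing the string to the parser. The following is only a minimal sketch: `isPercentEncoded` and `namespaceIsStrictlyEncoded` are hypothetical helpers, not part of packageurl-js, and it assumes that JavaScript's built-in `encodeURIComponent` is a close enough stand-in for the spec's percent-encoding rules, which may not hold for every character.

```javascript
// Hypothetical helper, NOT part of packageurl-js.
// Assumption: encodeURIComponent approximates the spec's percent-encoding rules.
function isPercentEncoded(segment) {
  try {
    // A segment counts as encoded if decoding and re-encoding reproduces it.
    // Note: lowercase escapes such as '%3a' are rejected by this comparison.
    return encodeURIComponent(decodeURIComponent(segment)) === segment;
  } catch {
    // decodeURIComponent throws a URIError on malformed escapes such as '%zz'.
    return false;
  }
}

// Hypothetical strictness check applied to the raw namespace of a purl string.
function namespaceIsStrictlyEncoded(rawNamespace) {
  return rawNamespace.split('/').every(isPercentEncoded);
}

console.log(namespaceIsStrictlyEncoded('org.apache.commons:io'));   // false
console.log(namespaceIsStrictlyEncoded('org.apache.commons%3Aio')); // true
```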

surendrapathak avatar May 19 '24 21:05 surendrapathak

The exact Maven example fails on the master branch with:

Error Invalid purl: maven requires a "namespace" field.

If you add a namespace and a name:

PackageURL.fromString('pkg:maven/org.apache.commons:io/foo@1.3.4')

it is still allowed, because parsing is lenient: the parsed components become the arguments passed to the new PackageURL(...) constructor. Encoding happens when the purl instance is converted with toString().

Both

PackageURL.fromString('pkg:maven/org.apache.commons:io/foo@1.3.4')
PackageURL.fromString('pkg:maven/org.apache.commons%3Aio/foo@1.3.4')

produce the same result:

PackageURL {
  type: 'maven',
  name: 'foo',
  namespace: 'org.apache.commons:io',
  version: '1.3.4',
  qualifiers: undefined,
  subpath: undefined
}

This is by design.
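The equivalence can be illustrated with the built-in URI functions alone. This is a sketch of the general decode-on-parse, encode-on-serialize idea, not packageurl-js's actual implementation, and it assumes `decodeURIComponent` and `encodeURIComponent` behave closely enough to the library's own handling for this one namespace:

```javascript
// The raw and the percent-encoded spelling of the same namespace.
const rawNamespace = 'org.apache.commons:io';
const encodedNamespace = 'org.apache.commons%3Aio';

// Decoding leaves the raw form untouched and turns '%3A' back into ':'.
const decodedRaw = decodeURIComponent(rawNamespace);
const decodedEncoded = decodeURIComponent(encodedNamespace);

console.log(decodedRaw === decodedEncoded);  // true
console.log(encodeURIComponent(decodedRaw)); // org.apache.commons%3Aio
```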

jdalton avatar Aug 13 '24 16:08 jdalton

Thanks for the note @jdalton

I am getting these:

PackageURL {
  type: 'maven',
  name: 'org.apache.commons:io',
  namespace: null,
  version: '1.3.4',
  qualifiers: null,
  subpath: null
}

and

PackageURL {
  type: 'maven',
  name: 'foo',
  namespace: 'org.apache.commons:io',
  version: '1.3.4',
  qualifiers: null,
  subpath: null
}

You can see that in the first case without a valid package name ('foo'), the entire intended namespace ('org.apache.commons:io') is being treated as a name.

This is why the specification requires each namespace segment to be percent-encoded. I understand it is designed this way, but it will create confusion without strict enforcement of the PURL spec.

surendrapathak avatar Aug 14 '24 18:08 surendrapathak

@surendrapathak I typo'd the example 🤦. It was supposed to be:

PackageURL.fromString('pkg:maven/org.apache.commons:io/foo@1.3.4')
PackageURL.fromString('pkg:maven/org.apache.commons%3Aio/foo@1.3.4')

with the namespace, as the master branch will throw an error for Maven purls without a namespace.

As stated before, encoding is done at the toString() level. The folks behind the PURL spec side with encoding just enough, to prioritize human readability. You can see the same pattern in how URLSearchParams stores values in human-readable form and only encodes them when the .toString() method is called.
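For reference, the URLSearchParams behavior looks like this in Node.js or a browser. The parameter name `namespace` is just an illustrative choice and has nothing to do with the purl API:

```javascript
// URLSearchParams is a built-in global in Node.js and in browsers.
const params = new URLSearchParams();
params.append('namespace', 'org.apache.commons:io');

// The stored value is kept in human-readable form.
console.log(params.get('namespace')); // org.apache.commons:io

// Encoding is applied only on serialization.
console.log(params.toString());       // namespace=org.apache.commons%3Aio
```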

jdalton avatar Aug 14 '24 22:08 jdalton

Thanks!

🥇2.0.0 correctly flags the one without the name and matches the spec!

I appreciate the change around it.

[Screenshot 2024-08-19 at 4 58 32 PM]

surendrapathak avatar Aug 20 '24 00:08 surendrapathak

@surendrapathak Yay! v2.0.0 ftw 🎉

jdalton avatar Aug 20 '24 12:08 jdalton