Continue reporting/displaying PURLs with URL encoding?

Open johnmhoran opened this issue 2 years ago • 2 comments

This is related to https://github.com/nexB/vulnerablecode/issues/1228 and https://github.com/nexB/vulnerablecode/issues/1252.

While working on https://github.com/nexB/vulnerablecode/issues/1228 yesterday, I noticed that my tests, which use univers (https://github.com/nexB/univers) to compare affected and fixed-by versions, threw a univers.versions.InvalidVersion: '2.12.1-1%2Bdeb11u1' is not a valid <class 'univers.versions.DebianVersion'> error when I included pkg:deb/debian/jackson-databind@2.12.1-1%2Bdeb11u1 in the test.

I eventually figured out that the culprit was the string %2B -- the URL-encoded form of the + in the Debian version of this jackson-databind package. (See, e.g., https://developer.mozilla.org/en-US/docs/Glossary/Percent-encoding.) Importing urllib.parse and decoding the version with unquote() let me complete the comparison without error, and the decoding is now part of my draft tests as well:

        # Test the error
        with pytest.raises(versions.InvalidVersion):
            assert versions.DebianVersion("2.12.1-1%2Bdeb11u1") < versions.DebianVersion(
                "2.13.1-1%2Bdeb11u1"
            )
        # Decode the version and test.
        assert versions.DebianVersion(
            urllib.parse.unquote("2.12.1-1%2Bdeb11u1")
        ) < versions.DebianVersion(urllib.parse.unquote("2.13.1-1%2Bdeb11u1"))
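
The decoding step on its own can be sketched with the standard library alone. This is a minimal illustration, assuming the only encoded character of concern is the + in a Debian version; the variable names are made up for the example. It uses unquote() rather than unquote_plus(), because the latter applies form-encoding rules and would turn a literal + into a space.

```python
from urllib.parse import unquote, unquote_plus

encoded = "2.12.1-1%2Bdeb11u1"

# Percent-decode the version: %2B becomes a literal "+".
decoded = unquote(encoded)
print(decoded)

# unquote() leaves a literal "+" alone, so decoding an
# already-decoded version changes nothing in this case.
again = unquote(decoded)
print(again)

# unquote_plus() treats "+" as an encoded space, which would
# corrupt the version string.
form_decoded = unquote_plus(decoded)
print(repr(form_decoded))
```

Note that decoding twice is only harmless as long as the version contains no literal % character.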

My question: do we want to continue this approach, or would we prefer instead to use non-URL-encoded versions in our PURLs?

If we search in vulnerablecode.io for pkg:deb/debian/jackson-databind@2.12.1-1%2Bdeb11u1 and pkg:deb/debian/jackson-databind@2.12.1-1+deb11u1, we get the same results, displayed in the UI like this:

pkg:deb/debian/jackson-databind@2.12.1-1%2Bdeb11u1

with these respective links:

https://public.vulnerablecode.io/packages/pkg:deb/debian/jackson-databind@2.12.1-1%252Bdeb11u1?search=pkg:deb/debian/jackson-databind@2.12.1-1%2Bdeb11u1

https://public.vulnerablecode.io/packages/pkg:deb/debian/jackson-databind@2.12.1-1%252Bdeb11u1?search=pkg:deb/debian/jackson-databind@2.12.1-1+deb11u1
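
The %252B in the path of both links looks like the result of percent-encoding a version that was already encoded, since % itself encodes to %25. That is only my reading of the URLs; I have not confirmed it in the VulnerableCode source. A small standard-library sketch of the effect, with illustrative variable names:

```python
from urllib.parse import quote, unquote

version = "2.12.1-1+deb11u1"

# First pass: "+" is percent-encoded to %2B.
once = quote(version, safe="")
print(once)

# Second pass: the "%" from the first pass is itself encoded to %25,
# which gives %252B.
twice = quote(once, safe="")
print(twice)

# One round of decoding undoes only one round of encoding.
print(unquote(twice))
print(unquote(unquote(twice)))
```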

FWIW, debian.org displays the non-encoded version 2.12.1-1+deb11u1 with an underlying link to a details page. See https://tracker.debian.org/pkg/jackson-databind.

johnmhoran, Aug 01 '23 18:08

I see that packageurl.PackageURL.from_string() will also provide us with a decoded (or non-URL-encoded) version from a PURL.

import urllib.parse

import packageurl

deb_purl = "pkg:deb/debian/jackson-databind@2.12.1-1%2Bdeb11u1"
decoded_deb_purl = urllib.parse.unquote(deb_purl)

print("\ndecoded_deb_purl = {}\n".format(decoded_deb_purl))

# Test PURL
purl = packageurl.PackageURL.from_string(deb_purl)
print("\npurl = {}\n".format(purl))
print(purl.type)
print(purl.namespace)
print(purl.name)
print(purl.version)
print(purl.qualifiers)
print(purl.subpath)

produces this output:

decoded_deb_purl = pkg:deb/debian/jackson-databind@2.12.1-1+deb11u1


purl = pkg:deb/debian/jackson-databind@2.12.1-1%2Bdeb11u1

deb
debian
jackson-databind
2.12.1-1+deb11u1
{}
None

johnmhoran, Aug 01 '23 21:08

Hi, I’d like to start looking into this area.

From what I understand:

• #1252 is about choosing the best fixed version, ideally the lowest version that fixes the vulnerability and is itself not vulnerable.
• #1253 highlights that URL-encoded Debian versions (%2B) cause issues in version comparison, and we already decode internally via PackageURL.from_string() or urllib.parse.unquote().

Before I start experimenting, a quick clarification:

Should the “best fixed version” logic always run on the decoded version string (e.g., 2.12.1-1+deb11u1), even if the stored PURL uses %2B? If yes, then the path is straightforward:

• normalize versions at import time
• run version ordering + univers comparisons on the decoded value
• when reporting UI/API results, use the decoded version
• keep the original PURL intact for lookup

This avoids univers errors and keeps fixed-version evaluation consistent across ecosystems (Debian especially).
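
As a rough sketch of that direction, the snippet below keeps the original PURL for lookup and stores a decoded version for comparison. Everything here is hypothetical: NormalizedPackage and normalize() are names invented for the example, and the string splitting is a simplified stand-in for packageurl.PackageURL.from_string(), not how VulnerableCode or packageurl actually parse PURLs.

```python
from dataclasses import dataclass
from urllib.parse import unquote


@dataclass(frozen=True)
class NormalizedPackage:
    """Hypothetical record: original PURL for lookup, decoded version for comparison."""

    purl: str
    version: str


def normalize(purl: str) -> NormalizedPackage:
    # Simplified parsing (an assumption for this sketch): drop any
    # subpath (#...) and qualifiers (?...), then take the text after
    # the last "@" as the raw version.
    base = purl.split("#", 1)[0].split("?", 1)[0]
    _, sep, raw_version = base.rpartition("@")
    if not sep:
        raise ValueError(f"no version in PURL: {purl!r}")
    # Store the percent-decoded version; leave the PURL untouched.
    return NormalizedPackage(purl=purl, version=unquote(raw_version))


original = "pkg:deb/debian/jackson-databind@2.12.1-1%2Bdeb11u1"
pkg = normalize(original)
print(pkg.purl)     # unchanged, still usable as a lookup key
print(pkg.version)  # decoded, suitable for passing to a version class
```

The decoded version field is what would be handed to univers for ordering, while the purl field stays byte-for-byte identical to what was imported.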

If this direction looks right, I’ll put together a small proposal + initial patch.

vaibhav11123, Dec 04 '25 04:12