packageurl-python
PackageURL not properly re-encoding strings when rendering to string
When a URL-encoded name is passed to PackageURL.from_string, the string is decoded, which is correct, since that yields the actual name. However, when the object is rendered back to a string, the name is not re-encoded, which results in an incorrect PURL.
>>> import packageurl
>>> from urllib.parse import quote_plus
>>> quote_plus("parent/child")
'parent%2Fchild'
>>> p = packageurl.PackageURL.from_string(f"pkg:my_type/my_namepace/{quote_plus('parent/child')}/@1234")
>>> p
PackageURL(type='my_type', namespace='my_namepace', name='parent/child', version='1234', qualifiers={}, subpath=None)
That is correct, as the name is parent/child. However:
>>> str(p)
'pkg:my_type/my_namepace/parent/child@1234'
That is an invalid/incorrect PURL.
The fix looks easy. This line, https://github.com/package-url/packageurl-python/blob/main/src/packageurl/__init__.py#L458, instead of being
purl.append(name)
looks like it should be
purl.append(urllib.parse.quote_plus(name))
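For what it's worth, here is a minimal stdlib-only sketch of what that encoding step would produce. It does not touch packageurl itself. One thing I'm assuming matters here: quote_plus turns spaces into '+', while quote with safe="" emits %20, so which of the two is appropriate depends on what the spec expects. The variable names are just for illustration.

```python
from urllib.parse import quote, quote_plus

name = "parent/child"

# Both calls percent-encode the slash, so the name stays a single path segment.
encoded_plus = quote_plus(name)
encoded = quote(name, safe="")

# The two functions differ on spaces: quote_plus emits '+', quote emits '%20'.
spaced = "my name"
spaced_plus = quote_plus(spaced)
spaced_pct = quote(spaced, safe="")

print(encoded_plus, encoded)
print(spaced_plus, spaced_pct)
```

So for the name in this report either call gives the same result; they only diverge for names containing spaces.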
I've been thinking about this some more, and I don't know if it's strictly a bug, or if it's spec compliant, but it does "break" in the round trip:
>>> p = packageurl.PackageURL.from_string(f"pkg:my_type/my_namepace/{quote_plus('parent/child')}/@1234")
>>> p
PackageURL(type='my_type', namespace='my_namepace', name='parent/child', version='1234', qualifiers={}, subpath=None)
>>> str(p)
'pkg:my_type/my_namepace/parent/child@1234'
>>> p = packageurl.PackageURL.from_string(str(p))
>>> p
PackageURL(type='my_type', namespace='my_namepace/parent', name='child', version='1234', qualifiers={}, subpath=None)
Note that the namespace and name change. If PackageURL had re-applied the URL encoding in __str__, the name parent/child would have been preserved.
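To make the round-trip point concrete, here is a toy sketch. The render and parse functions below are hypothetical stand-ins that I wrote for illustration, not the library's code, and they only handle a fixed type plus namespace, name, and version. The idea is simply that if each component is encoded on output and the parser splits on the raw separators before decoding, the round trip is stable.

```python
from urllib.parse import quote, unquote

PREFIX = "pkg:my_type/"  # fixed type, to keep the toy example short


def render(namespace, name, version):
    """Toy renderer: percent-encode each component before joining."""
    ns = "/".join(quote(seg, safe="") for seg in namespace.split("/"))
    return f"{PREFIX}{ns}/{quote(name, safe='')}@{quote(version, safe='')}"


def parse(purl):
    """Toy parser: split on the raw separators first, decode afterwards."""
    body = purl[len(PREFIX):]
    path, _, version = body.rpartition("@")
    namespace, _, name = path.rpartition("/")
    ns = "/".join(unquote(seg) for seg in namespace.split("/"))
    return ns, unquote(name), unquote(version)


original = ("my_namepace", "parent/child", "1234")
text = render(*original)
print(text)                     # the slash in the name is emitted as %2F
print(parse(text) == original)  # the round trip preserves all components

# Without the encoding, the same parser moves 'parent' into the namespace.
unencoded = parse("pkg:my_type/my_namepace/parent/child@1234")
print(unencoded)
```

The last call reproduces the namespace/name shift shown in the session above.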
Related to PR #123
So, another related issue. Is this a bug? Or is this expected behavior?
>>> p = PackageURL.from_string('pkg:maven/com.google.guava%3Aguava@25.1-jre')
>>> p
PackageURL(type='maven', namespace=None, name='com.google.guava:guava', version='25.1-jre', qualifiers={}, subpath=None)
>>> str(p)
'pkg:maven/com.google.guava:guava@25.1-jre'
>>> PackageURL.from_string(str(p))
Traceback (most recent call last):
  File "<input>", line 1, in <module>
    PackageURL.from_string(str(p))
  File "/opt/homebrew/lib/python3.11/site-packages/packageurl/__init__.py", line 512, in from_string
    raise ValueError(msg)
ValueError: Invalid purl 'pkg:maven/com.google.guava:guava@25.1-jre' cannot contain a "user:pass@host:port" URL Authority component: ''.
What is the proper behavior here?
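As a side note on the colon case, here is a small stdlib-only sketch of how the choice of "safe" characters decides whether ':' survives rendering. This is only my illustration with urllib.parse.quote; I'm not claiming this is how packageurl renders names, and which of the two outputs the spec requires is exactly the open question.

```python
from urllib.parse import quote

name = "com.google.guava:guava"

# With no safe characters, ':' is percent-encoded, matching the input form.
strict = quote(name, safe="")

# If ':' is declared safe, it is emitted literally, as in the str(p) output.
lenient = quote(name, safe=":")

print(strict)
print(lenient)
```

Either way, it seems reasonable to expect that whatever str(p) emits can be fed back into from_string without raising.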