yarl icon indicating copy to clipboard operation
yarl copied to clipboard

RFC3986 compatible URL.join honoring empty segments

Open commonism opened this issue 1 year ago • 10 comments

What do these changes do?

The changes re-implement URL.join within yarl.URL instead of using urllib.parse.urljoin to avoid known issues when joining paths with empty segments (https://github.com/python/cpython/issues/84774)

Are there changes in behavior for the user?

Due to the unit tests for URL.join it was required to align the query parameter rendering to match the unit tests, empty parameters are rendered without value therefore instead of an empty value.

Related issue number

Checklist

  • [x] I think the code is well written
  • [x] Unit tests for the changes exist
  • [x] Documentation reflects the changes

commonism avatar Jul 18 '24 18:07 commonism

Codecov Report

Attention: Patch coverage is 73.49398% with 22 lines in your changes missing coverage. Please review.

Project coverage is 62.60%. Comparing base (84e1c7d) to head (f94589c). Report is 355 commits behind head on master.

Files with missing lines Patch % Lines
tests/test_url.py 59.52% 17 Missing :warning:
yarl/_url.py 87.80% 5 Missing :warning:
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1039      +/-   ##
==========================================
+ Coverage   62.37%   62.60%   +0.23%     
==========================================
  Files          38       38              
  Lines        6626     6721      +95     
  Branches      356      368      +12     
==========================================
+ Hits         4133     4208      +75     
- Misses       2466     2486      +20     
  Partials       27       27              
Flag Coverage Δ
CI-GHA 62.57% <73.49%> (+0.23%) :arrow_up:
MyPy 25.91% <39.02%> (+0.30%) :arrow_up:
OS-Linux 99.26% <100.00%> (+0.01%) :arrow_up:
OS-Windows 99.59% <100.00%> (+<0.01%) :arrow_up:
OS-macOS 99.03% <100.00%> (+0.01%) :arrow_up:
Py-3.10.11 98.94% <100.00%> (+0.04%) :arrow_up:
Py-3.10.14 99.15% <100.00%> (+0.04%) :arrow_up:
Py-3.11.9 99.15% <100.00%> (+0.04%) :arrow_up:
Py-3.12.5 99.15% <100.00%> (+0.04%) :arrow_up:
Py-3.13.0-rc.1 99.15% <100.00%> (+0.04%) :arrow_up:
Py-3.8.10 98.88% <100.00%> (+0.01%) :arrow_up:
Py-3.8.18 99.09% <100.00%> (+0.01%) :arrow_up:
Py-3.9.13 98.88% <100.00%> (+0.01%) :arrow_up:
Py-3.9.19 99.09% <100.00%> (+0.01%) :arrow_up:
Py-pypy7.3.11 99.40% <100.00%> (+<0.01%) :arrow_up:
Py-pypy7.3.16 99.43% <100.00%> (+<0.01%) :arrow_up:
VM-macos-latest 99.03% <100.00%> (+0.01%) :arrow_up:
VM-ubuntu-latest 99.26% <100.00%> (+0.01%) :arrow_up:
VM-windows-latest 99.59% <100.00%> (+<0.01%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Jul 18 '24 18:07 codecov[bot]

Towncrier ain't happy with my intersphinx link to python docs:

/home/runner/work/yarl/yarl/docs/[towncrier-fragments]:45: WARNING: py:meth reference target not found: python.urllib.parse.urljoin

commonism avatar Jul 19 '24 07:07 commonism

As

:py:function:`urllib.parse.urljoin`

does not work either

/home/runner/work/yarl/yarl/docs/changes.rst:45: ERROR: Unknown interpreted text role "py:function".

But according to

python3 -m sphinx.ext.intersphinx https://docs.python.org/3/objects.inv | grep -E "(^py:.*|urllib.parse.urljoin)"
…
py:function
	urllib.parse.urljoin                     library/urllib.parse.html#urllib.parse.urljoin
…

this should work and the intersphinx is set up in the job

loading intersphinx inventory 'python' from https://docs.python.org/3/objects.inv...
loading intersphinx inventory 'multidict' from https://multidict.aio-libs.org/en/stable/objects.inv...

@Dreamsorcerer @webknjaz any ideas?

commonism avatar Jul 19 '24 19:07 commonism

@commonism use :func:. That table shows object types with differs from the role names sometimes.

In my helper, I tried replacing them with proper reference hints: https://webknjaz.github.io/intersphinx-untangled

webknjaz avatar Jul 19 '24 20:07 webknjaz

Towncrier ain't happy with my intersphinx link to python docs:

Technically, it's not Towncrier that's unhappy. Towncrier is just a fancy template renderer. But when that's injected into Sphinx (through my sphinxcontrib-towncrier extension), it can emit warnings and we turn them into errors.

webknjaz avatar Jul 19 '24 20:07 webknjaz

The python domain maps function to func, therefore it is :py:func: https://github.com/sphinx-doc/sphinx/blob/685e3fdb49c42b464e09ec955e1033e2a8729fff/sphinx/domains/python.py#L845-L881

To many docs for intersphinx show :py:function: as example.

That said, mastered this as well.

commonism avatar Jul 19 '24 21:07 commonism

linter currently dies with:

Extension error (sphinxcontrib.towncrier.ext):
Handler <bound method TowncrierDraftEntriesEnvironmentCollector.get_outdated_docs of <sphinxcontrib.towncrier.ext.TowncrierDraftEntriesEnvironmentCollector object at 0x7fd1132dac90>> for event 'env-get-outdated' threw an exception (exception: find_fragments() takes 3 positional arguments but 4 were given)
make[1]: *** [Makefile:228: spelling] Error 2
updating environment: make[1]: Leaving directory '/home/runner/work/yarl/yarl/docs'
make: *** [Makefile:64: doc-spelling] Error 2
Error: Process completed with exit code 2.

commonism avatar Aug 08 '24 06:08 commonism

linter currently dies with:

Looks like there's a problem with the requirements files. It ends up installing the latest towncrier instead of the pinned version.

Dreamsorcerer avatar Aug 08 '24 13:08 Dreamsorcerer

Added skip label as a workaround since changelog bot is currently not working

bdraco avatar Aug 14 '24 14:08 bdraco

close/reopen to rerun ci

bdraco avatar Aug 14 '24 21:08 bdraco

Ran a profile. Only thing I noticed is schema comes up a lot in yarl. Probably would make sense to put that one in the dict to and avoid the property call.. but thats another issue

bdraco avatar Aug 30 '24 22:08 bdraco

Tested one production. No adverse effect observed

bdraco avatar Aug 30 '24 22:08 bdraco

Thanks @commonism

bdraco avatar Aug 30 '24 22:08 bdraco

This PR introduced a regression in decoding query string params in joined urls.

I attempted to fix it in https://github.com/aio-libs/yarl/pull/1066 but the fix looks more involved so I'm going to revert this PR and do another release

I think this needs to be reworked to use URL(self._val._replace(.....), encoded=True) instead like some of the other methods but more analysis is needed as there is some complex re-quoting logic for query strings already.

I would like to encourage another attempt at solving this problem.

bdraco avatar Aug 31 '24 04:08 bdraco