
Some URLs with unescaped query strings cannot resolve relative redirects.

Open tomchristie opened this issue 2 years ago • 4 comments

Discussed in https://github.com/encode/httpx/discussions/1816

Originally posted by kangzhang August 25, 2021

Hey there, we are running into a crash in httpx and wondering if we could get some help.

It seems rfc3986 is confused by certain URLs. To reproduce the bug:

from httpx import Client
Client().get("https://reason.com/search/?f[author][]=nick-gillespie&q=*&s=-pubdate")

It raises a ResolutionError for the URL.

Looking into the code, I'm wondering if we should patch this line and also set query to None: https://github.com/encode/httpx/blob/master/httpx/_models.py#L572

base_uri = self._uri_reference.copy_with(fragment=None, query=None)

The raw exception message:

  File "httpx/_client.py", line 1776, in head
    return await self.request(
  File "httpx/_client.py", line 1481, in request
    response = await self.send(
  File "httpx/_client.py", line 1568, in send
    response = await self._send_handling_auth(
  File "httpx/_client.py", line 1604, in _send_handling_auth
    response = await self._send_handling_redirects(
  File "httpx/_client.py", line 1658, in _send_handling_redirects
    raise exc
  File "httpx/_client.py", line 1647, in _send_handling_redirects
    request = self._build_redirect_request(request, response)
  File "httpx/_client.py", line 456, in _build_redirect_request
    url = self._redirect_url(request, response)
  File "httpx/_client.py", line 508, in _redirect_url
    url = request.url.join(url)
  File "httpx/_models.py", line 573, in join
    return URL(relative_url._uri_reference.resolve_with(base_uri).unsplit())
  File "rfc3986/_mixin.py", line 266, in resolve_with
    raise exc.ResolutionError(base_uri)

Thank you!

The shortest way to reproduce this issue is something like...

>>> httpx.URL("https://example.com/?[]").join("/")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/tomchristie/GitHub/encode/httpx/httpx/_models.py", line 574, in join
    return URL(relative_url._uri_reference.resolve_with(base_uri).unsplit())
  File "/Users/tomchristie/GitHub/encode/httpx/venv/lib/python3.6/site-packages/rfc3986/_mixin.py", line 266, in resolve_with
    raise exc.ResolutionError(base_uri)
rfc3986.exceptions.ResolutionError: https://example.com/?[] is not an absolute URI.

The rfc3986 package uses a regex to determine whether a URL is in absolute form, and the query part of this URL does not match that regex. I think @kangzhang's suggestion to remove the query portion when joining URLs would resolve this, and it looks like it would still give correctly resolved redirects. It'd be worth making the change and seeing whether anything in the test suite breaks as a result.
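To illustrate why the query fails to match: the RFC 3986 grammar for a query only allows unreserved characters, sub-delims, ":", "@", "/", "?" and percent-encoded triplets, so raw "[" and "]" are outside the set. The sketch below is written from that grammar and is not the pattern the rfc3986 package actually uses. `is_rfc3986_query` and `escape_query` are hypothetical helper names, and escaping the query up front is only one possible workaround.

```python
import re
from urllib.parse import quote

# Characters RFC 3986 permits unescaped in a query component, plus
# percent-encoded triplets. Written from the RFC grammar as an assumption,
# not copied from the rfc3986 package.
QUERY_RE = re.compile(r"^(?:[A-Za-z0-9\-._~!$&'()*+,;=:@/?]|%[0-9A-Fa-f]{2})*$")


def is_rfc3986_query(query: str) -> bool:
    """Return True if every character in `query` is allowed by RFC 3986."""
    return QUERY_RE.match(query) is not None


def escape_query(query: str) -> str:
    """Percent-encode characters outside the allowed set (hypothetical helper).

    "%" is kept in `safe` so existing escapes are not double-encoded; a stray
    "%" that is not part of an escape is therefore left untouched.
    """
    return quote(query, safe="-._~!$&'()*+,;=:@/?%")


raw = "f[author][]=nick-gillespie&q=*&s=-pubdate"
print(is_rfc3986_query(raw))                # False: "[" and "]" are not allowed
print(escape_query(raw))                    # brackets become %5B and %5D
print(is_rfc3986_query(escape_query(raw)))  # True
```

With the brackets escaped, the query falls inside the RFC 3986 character set, which is the condition a strict parser checks before treating the base as an absolute URI.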

tomchristie avatar Sep 03 '21 09:09 tomchristie

I played with it a bit. The change does fail some tests, and the failing cases seem legitimate, namely https://github.com/encode/httpx/blob/0a8b44e67d470239f9659b6c3127af990303491f/tests/models/test_url.py#L183 and related ones.

I'm not enough of an expert on RFC 3986 to tell whether the URL fits the spec (or whether this is a bug in rfc3986's implementation). It would be best if that library could handle this better, or if we used a more lenient library (the internet is wild).
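One likely reason dropping the base query changes results: under RFC 3986 section 5.2.2, a reference with an empty path and no query of its own (for example a fragment-only reference) inherits the base's query. The sketch below uses the standard library's `urljoin` rather than httpx or rfc3986, so it shows the expected resolution behaviour only. `join_dropping_base_query` is a hypothetical stand-in for the proposed patch, and it has not been checked against what the linked tests actually exercise.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def join_dropping_base_query(base: str, relative: str) -> str:
    """Hypothetical variant: strip the base's query and fragment before joining."""
    scheme, netloc, path, _query, _fragment = urlsplit(base)
    stripped_base = urlunsplit((scheme, netloc, path, "", ""))
    return urljoin(stripped_base, relative)


base = "https://example.com/search/?[]"

# A reference that carries its own path resolves the same either way.
print(urljoin(base, "/"))
print(join_dropping_base_query(base, "/"))

# A fragment-only reference should keep the base's query, so the two differ.
print(urljoin(base, "#top"))
print(join_dropping_base_query(base, "#top"))
```

This suggests that stripping the query is safe for the redirect to "/" in the report but not for every relative reference, so a fix probably has to keep the query available to the resolution step.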

kangzhang avatar Sep 03 '21 16:09 kangzhang

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Feb 20 '22 15:02 stale[bot]

Still valid thx, @stale bot.

tomchristie avatar Feb 21 '22 13:02 tomchristie

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Mar 25 '22 07:03 stale[bot]

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Oct 15 '22 19:10 stale[bot]