furl
'127.0.0.1:8329' parsed wrong in Python 3.9+
Python 3.9.13 (main, Jun 19 2022, 13:12:56)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import furl
>>> furl.furl('127.0.0.1:8329').url
'8329'
>>> from six.moves import urllib
>>> urllib.parse.urlsplit('127.0.0.1:8329')
SplitResult(scheme='127.0.0.1', netloc='', path='8329', query='', fragment='')
Python 3.8.10 (default, Mar 15 2022, 12:22:08)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import furl
>>> furl.furl('127.0.0.1:8329').url
'127.0.0.1:8329'
>>> from six.moves import urllib
>>> urllib.parse.urlsplit('127.0.0.1:8329')
SplitResult(scheme='', netloc='', path='127.0.0.1:8329', query='', fragment='')
This is due to a fix made in Python 3.9 (https://github.com/python/cpython/issues/71844) that changed how the scheme field is populated when no scheme is provided.
This fix also broke requests' behavior, and they had to switch their parsing to urllib3 (https://github.com/psf/requests/pull/5917).
I suggest fixing this issue with urllib3 too.
This is the workaround in our code until furl is fixed.
import logging

import furl

logger = logging.getLogger(__name__)

def urlsplit_based_on_urllib3(url):
    """
    Returns the same path as `urllib.parse.urlsplit` returned before Python 3.9.
    Absent components are None here, where urlsplit returns ''.
    >>> urlsplit_based_on_urllib3('127.0.0.1:8329')
    (None, None, '127.0.0.1:8329', None, None)
    """
    from urllib3.util import parse_url
    u = parse_url(url)
    # A bare 'host:port' ends up in netloc with no path; hand it back as the path.
    if u.netloc and not u.path:
        return u.scheme, None, u.netloc, u.query, u.fragment
    return u.scheme, u.netloc, u.path, u.query, u.fragment

try:
    furl.urllib.parse.urlsplit = urlsplit_based_on_urllib3
except Exception:
    logger.error("Failed to fix furl urlsplit usage")
>>> furl.furl('127.0.0.1:8329')
furl('127.0.0.1:8329')
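If the patch should only be applied on interpreters that actually show the behavior, a small runtime probe may help. This is a minimal sketch using only the standard library; the function name `host_port_parsed_as_scheme` is made up for illustration and is not part of furl or urllib.

```python
from urllib.parse import urlsplit


def host_port_parsed_as_scheme(sample="127.0.0.1:8329"):
    """Return True if urlsplit() on this interpreter reports a non-empty
    scheme for `sample` (hypothetical helper, for illustration only)."""
    return urlsplit(sample).scheme != ""


if __name__ == "__main__":
    # The result depends on the Python version in use, so none is asserted here.
    print(host_port_parsed_as_scheme())
```

Probing the behavior directly, rather than comparing version numbers, avoids having to track which releases carry the change.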
thank you for opening this issue! great catch, and thank you for providing the super helpful links for context
let's fix this; consistency is key. do you have time to submit a PR which replaces furl's version of urlsplit
(https://github.com/gruns/furl/blob/master/furl/furl.py#L284), which is built with six.moves.urllib.parse.urlsplit(),
with one based on urllib3?
thank you!
Using urllib3 may not be ideal because that project officially supports only HTTP URLs.
This can cause problems with other schemes. For example, when parsing a tel URL with urllib3, a leading '/' is prepended to the phone number.
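One possible alternative that avoids urllib3 is to keep the standard-library parser and only step in when the text before the first ':' cannot be a scheme under RFC 3986 (a scheme must start with a letter). The sketch below is an assumption about how that could look, not furl's implementation; `urlsplit_strict_scheme` is a hypothetical name, and the fallback branch skips the input sanitizing that `urlsplit` itself performs.

```python
import re
from urllib.parse import SplitResult, urlsplit

# RFC 3986, section 3.1: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
_VALID_SCHEME = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")
# Characters allowed in a scheme. The Python 3.9.13 session above shows a
# digit-leading prefix made only of these being taken as the scheme.
_SCHEME_CHARS = re.compile(r"[A-Za-z0-9+.-]+")


def urlsplit_strict_scheme(url):
    """Split `url` like urlsplit(), but never treat a prefix that does not
    start with a letter as a scheme (hypothetical helper, a sketch only)."""
    prefix, colon, _ = url.partition(":")
    if (colon
            and _SCHEME_CHARS.fullmatch(prefix)
            and not _VALID_SCHEME.fullmatch(prefix)):
        # Not a valid scheme: keep everything up to '?' or '#' as the path.
        rest, _, fragment = url.partition("#")
        path, _, query = rest.partition("?")
        return SplitResult("", "", path, query, fragment)
    # Valid scheme or no scheme-like prefix: defer to the standard library.
    return urlsplit(url)


if __name__ == "__main__":
    print(urlsplit_strict_scheme("127.0.0.1:8329"))
    print(urlsplit_strict_scheme("tel:+1234"))
```

With this approach '127.0.0.1:8329' stays in the path field, while a tel URL still goes through `urlsplit` unchanged, so no leading '/' is added to the number.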