furl
Hi, if the URL is like '127.0.0.1:8080/a/b', the result is '8080/a/b', which seems incorrect
>>> from furl import furl
>>> f = furl('127.0.0.1:8080/a/b')
>>> f
furl('8080/a/b')
great find!
this appears to be two issues.
issue #1: '127.0.0.1:8080/a/b' is actually ambiguous: it could be parsed as
- host 127.0.0.1, port 8080, and path /a/b
or as
- path 127.0.0.1:8080/a/b (127.0.0.1:8080/a/b is a valid path)
in this ambiguity, furl defaults to the latter and parses 127.0.0.1:8080/a/b as a path. see https://github.com/gruns/furl/issues/110
to force furl to parse 127.0.0.1:8080/a/b as an IP + port, include a scheme, eg
>>> f = furl('https://127.0.0.1:8080/a/b')
>>> f.url
'https://127.0.0.1:8080/a/b'
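if your inputs are often bare host:port strings, one way to apply the workaround is to add a scheme before parsing. a minimal sketch, using only the standard library so it runs without furl installed; `with_default_scheme` is a hypothetical helper, not part of furl, and the `'http'` default is an assumption:

```python
from urllib.parse import urlsplit


def with_default_scheme(url, default='http'):
    """Hypothetical helper: prepend a scheme when the input has none.

    Assumes every input is meant to be host[:port][/path]. A genuine
    relative path would be misread as a host.
    """
    if '://' in url:
        return url
    return f'{default}://{url}'


parts = urlsplit(with_default_scheme('127.0.0.1:8080/a/b'))
print(parts.hostname, parts.port, parts.path)  # 127.0.0.1 8080 /a/b
```

the resulting string can be handed to furl() the same way as the https:// example above. this only makes sense when you know the input is a host, since it removes the ambiguity by assumption rather than by detection.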
issue #2: that said, when furl parses '127.0.0.1:8080/a/b' as a path, i don't know why 127.0.0.1: is being dropped. that's strange
a bit more digging reveals the cause: the segment before the first colon is treated as a candidate scheme (the colon is the scheme separator, eg tel:555-555-5555), and a scheme can't start with a digit. eg:
>>> f = furl('t.t:8000/a/b')
>>> f.url
't.t:8000/a/b'
>>> f = furl('1.1:8000/a/b')
>>> f.url
'8000/a/b'
>>> f = furl('t.5:8000/a/b')
>>> f.url
't.5:8000/a/b'
this is likely an implementation detail of how https://docs.python.org/3/library/urllib.parse.html parses single colon schemes, like tel:, and drops single colon schemes that start with a digit
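the three examples line up with the scheme grammar in RFC 3986: a letter, followed by any mix of letters, digits, "+", "-", and ".". a rough sketch of that rule follows; `looks_like_scheme` is a hypothetical name for illustration, and this is not furl's or urllib's actual code:

```python
import re

# RFC 3986: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
SCHEME_RE = re.compile(r'[A-Za-z][A-Za-z0-9+.-]*')


def looks_like_scheme(candidate):
    """Hypothetical check: does the text before the first colon fit the grammar?"""
    return SCHEME_RE.fullmatch(candidate) is not None


for url in ('t.t:8000/a/b', '1.1:8000/a/b', 't.5:8000/a/b'):
    prefix = url.split(':', 1)[0]
    print(prefix, looks_like_scheme(prefix))
# t.t True
# 1.1 False
# t.5 True
```

so 't.t' and 't.5' pass as schemes and survive the round trip, while '1.1' (like '127.0.0.1') does not, which matches the inputs where the prefix gets dropped.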