
Hi, if the URL is like '127.0.0.1:8080/a/b', the result is '8080/a/b', which seems incorrect

Open snails-za opened this issue 3 years ago • 2 comments

snails-za avatar Sep 07 '21 03:09 snails-za

>>> from furl import furl
>>> f = furl('127.0.0.1:8080/a/b')
>>> f
furl('8080/a/b')

snails-za avatar Sep 07 '21 03:09 snails-za

great find!

this appears to be two issues.

issue #1: '127.0.0.1:8080/a/b' is actually ambiguous: it could be parsed as

  • host 127.0.0.1, port 8080, and path /a/b

or as

  • path 127.0.0.1:8080/a/b (127.0.0.1:8080/a/b is a valid path)

in this ambiguity, furl defaults to the latter and parses 127.0.0.1:8080/a/b as a path. see https://github.com/gruns/furl/issues/110

to force furl to parse 127.0.0.1:8080/a/b as an IP + port, include a scheme, eg

>>> f = furl('https://127.0.0.1:8080/a/b')
>>> f.url
'https://127.0.0.1:8080/a/b'
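
if the input comes from users and may or may not carry a scheme, one option is to add a default scheme before parsing. a minimal sketch, using only the standard library so it runs without furl installed; `with_default_scheme` and the `http` default are hypothetical names chosen for illustration, not part of furl:

```python
from urllib.parse import urlsplit


def with_default_scheme(url, default="http"):
    """Prepend default + '://' when url has no '://' separator.

    Hypothetical helper for illustration; not part of furl or urllib.
    """
    if "://" in url:
        return url
    return f"{default}://{url}"


# With an explicit scheme, the host/port/path reading is unambiguous.
parts = urlsplit(with_default_scheme("127.0.0.1:8080/a/b"))
print(parts.hostname, parts.port, parts.path)
```

the same normalized string can then be handed to furl(...), which parses it as host + port as shown above. note the `'://' in url` check is deliberately naive: it leaves scheme-only URLs like tel:555-555-5555 unhandled, so adjust it if those can appear in your input.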

issue #2: that said, when furl parses '127.0.0.1:8080/a/b' as a path, i don't know why 127.0.0.1: is being dropped. that's strange

a bit more digging reveals the cause: the segment before the colon is treated as a scheme (the colon is the scheme separator, eg tel:555-555-5555), and a scheme can't start with a digit. eg:

>>> f = furl('t.t:8000/a/b')
>>> f.url
't.t:8000/a/b'
>>> f = furl('1.1:8000/a/b')
>>> f.url
'8000/a/b'
>>> f = furl('t.5:8000/a/b')
>>> f.url
't.5:8000/a/b'

this is likely an implementation detail of how urllib.parse (https://docs.python.org/3/library/urllib.parse.html) parses single colon schemes, like tel:, where a would-be scheme that starts with a digit gets dropped
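
for reference, RFC 3986 (section 3.1) defines a scheme as a letter followed by any number of letters, digits, "+", "-", or ".". the sketch below checks the prefixes from the examples above against that grammar. it is only an illustration of the rule, not furl's or urllib's actual code, and `looks_like_scheme` is a hypothetical name:

```python
import re

# RFC 3986, section 3.1: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")


def looks_like_scheme(candidate):
    """Return True if candidate matches the RFC 3986 scheme grammar."""
    return SCHEME_RE.fullmatch(candidate) is not None


for url in ("t.t:8000/a/b", "1.1:8000/a/b", "t.5:8000/a/b", "tel:555-555-5555"):
    prefix = url.split(":", 1)[0]  # text before the first colon
    print(prefix, looks_like_scheme(prefix))
```

this prints True for t.t, t.5, and tel, and False for 1.1, which matches which inputs survive in the examples above. it doesn't explain why the invalid prefix is dropped rather than kept as part of the path; that part still looks like a bug.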

gruns avatar Sep 10 '21 19:09 gruns