Host ']'
The following malformed URL is accepted by rfc3986:
B://]
Although the character ']' is allowed in a host, it must be in the context of an IPv6 or an IPvFuture, which this is not.
This malformed URL is rejected by urllib, urllib3, hyperlink, yarl, furl, and Boost.URL.
Also a problem when host is '['
Happy to accept a fix here
The question here is whether the validation should or should not be implicit when using uri_reference. The truth is that it parses the URI and returns an invalid result by default:
In [1]: from rfc3986 import uri_reference
In [2]: uri_reference("B://]")
Out[2]: URIReference(scheme='B', authority=']', path=None, query=None, fragment=None)
But you can simply check the validity of the result by the is_valid() method:
In [3]: _.is_valid()
Out[3]: False
If you want the validation to happen immediately during parsing, you can use ParseResult:
In [4]: from rfc3986 import ParseResult
In [5]: ParseResult.from_string("B://]")
…
InvalidAuthority: The authority (]) is not valid.
So, because the lib is already able to say that the URL is invalid, it's more a design decision if the validation should happen by default.
Pretty sure uri_reference was always intended to be non-validating so that the rest could be API compatible with urllib3 more or less.
Maybe we need to make that clearer in the docs
@kenballus could you please clarify which approach did you use?
My mistake; using ParseResult.from_string fixes this. I should have read the docs more thoroughly.
Why is it that we have a non-validating parser? Some large projects (httpx) use this parser, and would probably be better off using ParseResult.from_string.