uriparse-rs
uriparse-rs copied to clipboard
Add support for zone identifiers
Closes: https://github.com/sgodwincs/uriparse-rs/issues/6
This is based on #25, please only consider the top commit. (But I'm using that PR in testing already; sadly there seems to be no way to tell GitHub that it's dependent on that PR). If #25 does get rejected, I can rebase it to get rid of those changes -- otherwise I think it's best to merge after it, and then things would be easy.
The implementation already follows https://datatracker.ietf.org/doc/draft-ietf-6man-rfc6874bis/ (the upcoming revision of RFC6874) in that it doesn't do any percent encoding of the percent sign.
As zone identifiers are necessarily non-empty, a URI with an IPv6 address without zone identifier is represented by an empty string as a zone identifier.
Testing
$ cargo run --example cli -- 'coap://[fe80::1%eth0]/'
<coap://[fe80::1%eth0]/>: Ok(
URIReference {
authority: Some(
Authority {
host: IPv6Address(
fe80::1,
"eth0",
),
password: None,
port: None,
username: None,
},
),
fragment: None,
path: Path {
absolute: true,
double_dot_segment_count: 0,
leading_double_dot_segment_count: 0,
segments: [
Segment {
normalized: true,
segment: "",
},
],
single_dot_segment_count: 0,
unnormalized_count: 0,
},
query: None,
scheme: Some(
CoAP,
),
},
)
API
Given that the enum Host
variants are exhaustive, this is an API breaking change -- and I don't see how it could be made non-breaking.
On Tue, Oct 11, 2022 at 10:47:29AM -0700, Scott Godwin wrote:
One other thing is that the types in this library generally keep track of whether they are normalized, but I'm not sure what it means for a zone ID to be normalized. It also doesn't specify whether it's case sensitive.
The zone identifiers come quite directly from the operating system after limiting to "unreserved", I don't see any mention of normalization, and with only unreserved being allowed there can be no denormalized forms.
Since the RFC says that a zone ID must be at least one character, I think it makes more sense to wrap this in an
Option
.[...]
I generally prefer using strong types rather than strings (as you can see by the library having
Port
,Username
,Password
, etc. types). I'd prefer if this was a newZoneID
type.
Fine with me; then the ZoneID type would have being non-empty as a validity constraint.
[...], something like
InvalidIPv6ZoneIDCharacter
No problem.
Thanks for reviewing, fixes coming up.
I think the update addresses all your points.
I've isolated some creation and error logic into a private ::new()
function that IMO looks prettier and has its unsafe part easier to match to the check that was done. This also showed that there is an error case I previously missed ([fe80::1%]
would have been silently accepted in the place of [fe80::1]
even though the former is invalid syntactically); I've grouped that with the InvalidZoneCharacter, but if you prefer, it's easy to split InvalidZoneCharacter from InvalidZoneEmpty.
If you're happy with the changes I'd smush them into a single commit.