Encode ^ in pathname
const u = new URL('http://abc.com/a^b');
console.log(u.pathname);
This gives "/a%5Eb" in Chrome and Firefox, as well as in Go and Node.js. Ruby's URI fails to parse the URL with ^, but is fine with %5E. However, the spec and Safari don't escape ^ at the moment.
Shall we escape ^ in paths? This would move U+005E (^) from the userinfo percent-encode set to the path percent-encode set.
It seems to depend on "is special" in Chrome and Firefox, which isn't ideal.
I mean, Chrome and Firefox don't even escape spaces (or anything not in the C0 controls set) in non-special paths…
RFC 3986 defines path segments to contain these characters:
pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
sub-delims = "!" / "$" / "&" / "'" / "(" / ")"
/ "*" / "+" / "," / ";" / "="
That doesn't include ^, so it needs to be percent-encoded.
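To make the grammar above concrete, here is a minimal sketch of a membership check for the literal part of pchar. The names isLiteralPchar and encodeSegment are made up for illustration and are not from the spec or any library. It also assumes every "%" in the input is data, so existing pct-encoded triplets would get double-encoded.

```javascript
// Hypothetical helpers, not part of any spec or library.
// RFC 3986: pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
const UNRESERVED = /^[A-Za-z0-9\-._~]$/;
const SUB_DELIMS = "!$&'()*+,;=";

// True if `ch` is a single character that may appear literally in a path segment.
function isLiteralPchar(ch) {
  if (ch.length !== 1) return false; // rejects '' and non-BMP code points
  return UNRESERVED.test(ch) || SUB_DELIMS.includes(ch) || ch === ':' || ch === '@';
}

// Percent-encode (as UTF-8) every code point that is not a literal pchar.
// Assumption: '%' is treated as data, so it becomes '%25'.
function encodeSegment(segment) {
  return Array.from(segment, (ch) =>
    isLiteralPchar(ch) ? ch : encodeURIComponent(ch)
  ).join('');
}

console.log(isLiteralPchar('^'));   // false
console.log(encodeSegment('a^b'));  // a%5Eb
```

encodeURIComponent is only called on characters outside pchar here, and its unescaped set is a subset of pchar, so it always yields a percent-encoded sequence in this sketch.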
@achristensen07 Are you okay with aligning Safari on this?
I think so. This is a case where Chrome and Firefox have the same behavior, so aligning with them would likely increase compatibility. It looks like the most compatible solution depends on "is special". Like I said in issue 608, we really need complete tests with each ASCII code point in each part of a URL with and without a special scheme.
FYI, the discussion around issue #379 has a good overview of the percent-encode sets.
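As a rough sketch of what such a survey could look like (not an actual web-platform-tests file), the following runs every ASCII code point through the host's URL parser in the path, query, and fragment, once with a special scheme and once with a non-special one. The function name surveyAscii, the scheme "foo", and the host "host" are arbitrary choices for illustration. Userinfo and host are left out to keep it short.

```javascript
// Hypothetical survey helper; the results depend entirely on the URL
// implementation it runs in, so no expected outputs are listed here.
const SCHEMES = ['http', 'foo']; // 'foo' stands in for any non-special scheme

// Each builder places the code point under test between 'a' and 'b'.
const PARTS = {
  pathname: (scheme, ch) => `${scheme}://host/a${ch}b`,
  search: (scheme, ch) => `${scheme}://host/p?a${ch}b`,
  hash: (scheme, ch) => `${scheme}://host/p#a${ch}b`,
};

function surveyAscii() {
  const rows = [];
  for (const scheme of SCHEMES) {
    for (const [part, build] of Object.entries(PARTS)) {
      for (let cp = 0; cp < 0x80; cp++) {
        const ch = String.fromCharCode(cp);
        let output;
        try {
          output = new URL(build(scheme, ch))[part];
        } catch {
          output = null; // the parser rejected this input
        }
        rows.push({ scheme, part, codePoint: cp, output });
      }
    }
  }
  return rows;
}

// Show how this implementation treats ^ in each part and scheme.
console.table(surveyAscii().filter((row) => row.codePoint === 0x5e));
```

Some rows need care when reading the results: tab, LF, and CR are stripped by the parser before anything else happens, and "?" or "#" in the path position end the path rather than appearing in it.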
FWIW, I've been looking at interoperability between this standard and the URL type in Apple's Foundation framework (which I assume would also be of interest to WebKit). It is documented as conforming to RFC-1738.
The biggest difficulty in getting Foundation to parse the serialised output of this standard is the difference in percent-encode sets. This makes it harder for applications to transition to a web-compatible URL model, as converting to a Foundation URL means adding percent-encoding, so the serialised URL string changes. Anything which minimises that would be appreciated, and if it is actually a better description of how browsers behave, it seems like a no-brainer.
That said, if Safari currently does not encode it, and Chrome/Firefox conditionalise it, and neither of them "broke the web", it seems reasonable to conclude that few if any sites actually care whether it is encoded or not. In that case, the better choice IMO would be to unconditionally encode it and align with RFC-3986 as a bonus. Conditional percent-encode sets are awful.