Expose a URLHost class to JavaScript

Open annevk opened this issue 8 years ago • 20 comments

  • [ ] At least two implementers are interested (and none opposed):
  • [ ] Tests are written and can be reviewed and commented upon at:
  • [ ] Implementation bugs are filed:
    • Chromium: …
    • Gecko: …
    • WebKit: …
    • Deno: …
    • Node.js: …
  • [ ] MDN issue is filed: …

(See WHATWG Working Mode: Changes for more details.)


Preview | Diff

annevk avatar Mar 31 '17 13:03 annevk

@tabatkins it seems a little weird that just because toJSON is the same as the stringification behavior, it needs to be annotated as a stringifier rather than a method, is that really how this should work?

annevk avatar Mar 31 '17 13:03 annevk

> it seems a little weird that just because toJSON is the same as the stringification behavior, it needs to be annotated as a stringifier rather than a method, is that really how this should work?

Just depends on how you want it to link. To get stringifier to link, define "stringification behavior". To get toJSON() to link, define the toJSON() method. Or do both.

The "stringification behavior" thing is mostly to handle anonymous stringifiers. I've thought about also just making it an implicit toString() method if no explicit one exists. (Even if you say stringifier toJSON(), you still get a toString() defined by it as well.)

tabatkins avatar Mar 31 '17 20:03 tabatkins

@tabatkins I defined both, but toJSON() doesn't link and I end up with duplicate IDs somehow.

annevk avatar Apr 01 '17 14:04 annevk

@annevk - Is this something expected to land? I am working on updating documentation around stuff defined in this spec and it would be nice to be clear on what the state of things is. :)

a2sheppy avatar May 28 '19 11:05 a2sheppy

I don't expect this to land in this form. If something moves here I'll add the documentation team to make sure you all are informed.

annevk avatar May 28 '19 11:05 annevk

@tabatkins so the weird thing is that in "stringifier attribute USVString href;", href is seen as an attribute, but in "stringifier USVString toJSON();", toJSON() is seen as a stringifier instead of a method. Why is that?

annevk avatar Apr 26 '20 15:04 annevk

Hmm, I catch stringifier specially, so likely I just didn't catch it in the attribute form. I'll look into it.

tabatkins avatar Apr 26 '20 19:04 tabatkins

@tabatkins wouldn't going that way break existing specifications? I guess that's another way to go though...

annevk avatar May 04 '20 13:05 annevk

@annevk ... at this point, what is the likelihood of this moving forward?

jasnell avatar Mar 28 '22 20:03 jasnell

This still seems like something the web platform should offer, but I'd rather wait until browsers have more aligned URL parsers and IDNA handling before making another push to expose this API.

annevk avatar Apr 01 '22 05:04 annevk

@valenting @ricea is there interest from Gecko and Chromium in this API addition? Now that we're close with IDNA this seems like a nice improvement. Note that this intentionally does not expose ToUnicode. Doing that responsibly requires a separate effort.

annevk avatar Feb 17 '23 10:02 annevk

I don't feel like I could confidently write an explainer for this.

ricea avatar Feb 17 '23 10:02 ricea

I don't know if there's enough benefit to add it in its current form. If I'm reading it correctly then what it's bringing is an easy way to check whether a host is IPv4/v6/a domain. Is there much need for that?

valenting avatar Feb 17 '23 11:02 valenting

It also gives an easy way to parse a host, which can be useful if your chosen scheme always gives you an opaque host (or IPv6). It is also more ergonomic than something like new URL("https://" + host), which might also do the wrong thing for certain inputs.
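
To illustrate the "wrong thing for certain inputs" point, here is a minimal sketch of the string-gluing pattern. The helper name naiveParseHost and the sample inputs are made up for illustration; only the standard URL class is used.

```typescript
// Hypothetical helper: "parse a host" by gluing it onto a scheme.
function naiveParseHost(input: string): string {
  return new URL("https://" + input).hostname;
}

// None of these inputs is just a host, yet all of them parse, because the
// URL parser reads the extra characters as path, credentials, port, or query.
const surprising: string[] = [
  "example.com/admin",
  "user@example.com",
  "example.com:8080",
  "example.com?x=1",
];

for (const input of surprising) {
  // Each line reports the same hostname, "example.com".
  console.log(JSON.stringify(input), "->", naiveParseHost(input));
}
```

A caller that wants to reject such inputs has to inspect the other URL components itself, which is the ergonomics problem a dedicated host parser would avoid.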

annevk avatar Feb 17 '23 12:02 annevk

It seems you can get much the same functionality by abusing URL.prototype.host:

const c = new URL('https://example.com')
c.host = '😀'
c.host        // 'xn--e28h'
c.host = 564
c.host        // '0.0.2.52'

Not that I'd call that a good API, but if the functionality is only needed by a small minority of developers, it might be good enough?
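
The setter trick above could be wrapped in a small helper along these lines. This is only a sketch: parseHostViaSetter and the sentinel host name are hypothetical, and it uses the hostname setter rather than host so that a port is not accepted. It relies on the setter leaving the URL untouched when host parsing fails.

```typescript
// Hypothetical helper (not part of any standard): parse `input` through the
// `hostname` setter and return null when the setter ignored the value.
function parseHostViaSetter(input: string): string | null {
  // Assumption: callers never pass this sentinel itself as input.
  const sentinel = "sentinel.invalid";
  const url = new URL("https://" + sentinel);
  url.hostname = input; // silently does nothing if host parsing fails
  return url.hostname === sentinel ? null : url.hostname;
}

console.log(parseHostViaSetter("564"));         // "0.0.2.52"
console.log(parseHostViaSetter("EXAMPLE.com")); // "example.com"
console.log(parseHostViaSetter("a b"));         // null (space is not allowed in a host)
```

Like the setter itself, this stops reading at the first /, ? or #, so it is not a strict validator, and failure is only detectable indirectly through the sentinel.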

ricea avatar Feb 17 '23 12:02 ricea

Eww. Maybe? There definitely seems to be merit to this if we expose more of IDNA or the PSL: https://www.npmjs.com/search?q=domain%20parser and https://www.npmjs.com/search?q=idna. (And given the number of downloads of the packages there I'm not sure if it's a small minority that cares about hosts.)

annevk avatar Feb 17 '23 12:02 annevk

If it were to (safely) include ToUnicode or other new IDNA functionality it would be easy to say we should add it. But right now it seems to be more like syntactic sugar. I'm not strongly against it, but I don't think there's a strong case for it right now.

valenting avatar Feb 17 '23 12:02 valenting

To address an earlier question, there is demand for checking whether a string is an IP address: #696. And judging from https://www.npmjs.com/package/ipaddr.js this is very popular. Given how many strings can be turned into IP addresses, offering an authoritative answer to that question would be good, I think.
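
As a rough sketch of what answering that question looks like with today's URL class, the following classifies a string by letting the URL parser normalize it first. The names HostKind and classifyHost are hypothetical, and the pre-checks are an assumption about which inputs should count as "just a host"; this is not the proposed URLHost API.

```typescript
type HostKind = "ipv4" | "ipv6" | "domain" | "invalid";

// Hypothetical helper: classify `input` as it would be seen in an https: URL.
function classifyHost(input: string): HostKind {
  // Reject characters the URL parser would read as credentials, path,
  // query, or fragment delimiters rather than as part of the host.
  if (/[\/\\?#@]/.test(input)) return "invalid";
  // A colon is only acceptable inside a bracketed IPv6 literal.
  const bracketed = input.startsWith("[") && input.endsWith("]");
  if (!bracketed && input.includes(":")) return "invalid";

  let hostname: string;
  try {
    hostname = new URL(`https://${input}/`).hostname;
  } catch {
    return "invalid";
  }
  if (hostname.startsWith("[")) return "ipv6";
  // IPv4 hosts are always serialized as four dot-separated decimal numbers.
  if (/^\d{1,3}(\.\d{1,3}){3}$/.test(hostname)) return "ipv4";
  return "domain";
}

console.log(classifyHost("0x7f.1"));      // "ipv4" (normalized to 127.0.0.1)
console.log(classifyHost("[::1]"));       // "ipv6"
console.log(classifyHost("example.com")); // "domain"
console.log(classifyHost("1.2.3.4.5"));   // "invalid"
```

The "0x7f.1" case shows why a plain regular expression on the input is not enough: many spellings are turned into IPv4 addresses by the parser.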

annevk avatar Mar 06 '23 02:03 annevk

We are definitely interested in implementing this in Ada & Node.js.

anonrig avatar Jun 26 '23 16:06 anonrig

FYI I have implemented this in my Swift library: documentation.

For low-level networking applications, you'll find that they often pass the hostname through inet_pton to decide whether they have an IP address or should use getaddrinfo (or equivalent) to look up the name. That's not really ideal - the URL parser already knows this information, so we can just tell them directly what kind of host they're looking at. At the same time, we don't need to traffic in string values -- we can give them values using a rich IPv4Address type. That means things like filtering become a lot easier:

if case .ipv4Address(let address) = url.host,
   case (10, 0, 0, _) = address.octets {
  // URL has host "10.0.0.???"
}

It's quite nice. I'm very happy with it. Possibly less useful on the web, but I could imagine NodeJS could make use of something like this.

Another facet to this API that is quite useful is the ability to parse an opaque hostname in the context of a known URL scheme. In the documentation, I give the example of processing ssh: URLs -- the standard says their hostnames are opaque, but in reality, applications will want to process them as if they were part of an http: URL (so they get IDNA and IPv4 detection).

// 🚩 "http:" URLs use a special Unicode -> ASCII conversion
//    (called "IDNA"), designed for compatibility with existing
//    internet infrastructure.

let httpURL = WebURL("http://alice@أهلا.com/data")!
httpURL       // "http://alice@xn--igbi0gl.com/data"
              //               ^^^^^^^^^^^
httpURL.host  // ✅ .domain(Domain { "xn--igbi0gl.com" })

// 🚩 "ssh:" URLs have opaque hostnames, so Unicode characters
//    are just percent-encoded. The URL Standard doesn't even know
//    this is a network address, so we don't get any automatic processing.

let sshURL = WebURL("ssh://alice@أهلا.com/data")!
sshURL       // "ssh://alice@%D8%A3%D9%87%D9%84%D8%A7.com/data"
             //              ^^^^^^^^^^^^^^^^^^^^^^^^
sshURL.host  // 😐 .opaque("%D8%A3%D9%87%D9%84%D8%A7.com")

// 🚩 Using the WebURL.Host initializer, we can interpret our
//    SSH hostname as if it were in an HTTP URL.

let sshAsHttp = WebURL.Host(sshURL.hostname!, scheme: "http")
// ✅ .domain(Domain { "xn--igbi0gl.com" })

IPv4 support:

let url = WebURL("ssh://alice@192.168.15.21/data")!
url       // "ssh://alice@192.168.15.21/data"
url.host  // 😐 .opaque("192.168.15.21")
          //     ^^^^^^

WebURL.Host(url.hostname!, scheme: "http")
// ✅ .ipv4Address(IPv4Address { 192.168.15.21 })

Scenarios like that may be more broadly useful on the web.
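
For comparison, a rough JavaScript-side approximation of the reinterpretation shown above can be sketched with the existing URL class. The helper name hostAsHttp is hypothetical, and the re-parsing via string concatenation is exactly the workaround a dedicated host API would replace.

```typescript
// Hypothetical helper: reinterpret the opaque hostname of a non-special URL
// (such as ssh:) as if it belonged to an http: URL, so that it gets IDNA
// processing and IPv4 detection. Returns null if it is not a valid http host.
function hostAsHttp(url: URL): string | null {
  try {
    return new URL(`http://${url.hostname}/`).hostname;
  } catch {
    return null;
  }
}

const sshURL = new URL("ssh://alice@أهلا.com/data");
console.log(sshURL.hostname);    // opaque host: percent-encoded, no IDNA applied
console.log(hostAsHttp(sshURL)); // same value as new URL("http://أهلا.com/").hostname

const sshIP = new URL("ssh://alice@192.168.15.21/data");
console.log(hostAsHttp(sshIP));  // "192.168.15.21", now parsed as an IPv4 address
```

Unlike the Swift API, the result is still a plain string, so the caller does not learn which kind of host it got without further inspection.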

karwa avatar Sep 12 '23 15:09 karwa