tld.js

Returns null for tld.getDomain('http://github.io')

Open makecontact opened this issue 6 years ago • 5 comments

Always returns null for github.io + many others including

  • gitlab.io
  • ngrok.io
  • sandcats.io

etc...

var tld = require('tldjs');
console.log(tld.getDomain('http://github.io'));

Any takers?

I've tracked it down to any domain that is listed in tlds/rules.json

makecontact avatar Feb 28 '18 06:02 makecontact

Hi @makecontact

You are right: any domain listed in tlds/rules.json (or directly in https://publicsuffix.org/list/effective_tld_names.dat) will have a null domain, and its public suffix will be the value found in the list (e.g. gitlab.io is a valid public suffix).

This is a long-standing, known issue that is not trivial to fix. It stems from the fact that the public suffix list was originally designed to determine under which domains sub-domains can be registered and cookies can be set. As a result, it can produce surprising, unintuitive results such as the ones you encountered.

We've thought about this situation in the past, and I can see a few solutions. None of them is perfect, but one might be "good enough":

  1. One hacky fix you can use right now, without any update of tld.js, is to detect when domain is null and use the value of publicSuffix as the domain instead. This will work for a lot of domains (I will try to investigate how many domains would return the wrong result with this solution, but I expect not many).
  2. There are currently two parts in the public suffix list: ICANN and PRIVATE. I suspect that most of the surprising cases come from the PRIVATE part. We could add an option in tld.js to only take ICANN domains into account (this would fix the examples you found and many others).
  3. Combine both 1. and 2. (most of the counter-examples seem to be Japanese domains, but we need to investigate a bit more to see if there are some non-trivial cases).
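Workaround 1 could be sketched roughly as follows. The helper name `getDomainOrSuffix` is hypothetical and not part of tld.js, and `stubParser` is only a stand-in that mimics the results described in this thread so the snippet runs without the library installed.

```javascript
// Hypothetical helper (not part of tld.js): return the registrable domain,
// or fall back to the public suffix when the domain is null.
// `parser` is assumed to expose getDomain(url) and getPublicSuffix(url).
function getDomainOrSuffix(parser, url) {
  const domain = parser.getDomain(url);
  if (domain !== null) {
    return domain;
  }
  // The whole hostname matched a rule in the list, so use the suffix itself.
  return parser.getPublicSuffix(url);
}

// Stand-in results that only mimic the behaviour described in this thread.
const stubResults = {
  'http://github.io': { domain: null, publicSuffix: 'github.io' },
  'http://www.example.com': { domain: 'example.com', publicSuffix: 'com' },
};

// Stub parser used for illustration instead of the real library.
const stubParser = {
  getDomain: (url) => stubResults[url].domain,
  getPublicSuffix: (url) => stubResults[url].publicSuffix,
};

console.log(getDomainOrSuffix(stubParser, 'http://github.io'));
console.log(getDomainOrSuffix(stubParser, 'http://www.example.com'));
```

With the real library, the object returned by `require('tldjs')` should be usable in place of `stubParser`, since it exposes both `getDomain` and `getPublicSuffix`.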

None of these solutions is perfect, as there are known counter-examples. If this is an option for you, I would suggest you give 1. a try, and I will try to implement 2.
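To illustrate what 2. relies on: the public suffix list file marks its two parts with `// ===BEGIN ICANN DOMAINS===` and `// ===BEGIN PRIVATE DOMAINS===` comment lines. The sketch below separates the rules on those markers. The function name `splitSuffixList` and the tiny `sample` text are assumptions made for illustration; this is not how tld.js itself builds rules.json.

```javascript
// Hypothetical helper: split public suffix list text into ICANN and PRIVATE rules.
function splitSuffixList(text) {
  const sections = { icann: [], privateDomains: [] };
  let current = null; // section currently being read, or null when outside both

  for (const rawLine of text.split('\n')) {
    const line = rawLine.trim();
    if (line.startsWith('// ===BEGIN ICANN DOMAINS===')) {
      current = 'icann';
    } else if (line.startsWith('// ===BEGIN PRIVATE DOMAINS===')) {
      current = 'privateDomains';
    } else if (line.startsWith('// ===END')) {
      current = null;
    } else if (line !== '' && !line.startsWith('//') && current !== null) {
      // Anything that is not blank or a comment is a rule.
      sections[current].push(line);
    }
  }
  return sections;
}

// Tiny made-up excerpt in the same layout as the real list.
const sample = [
  '// ===BEGIN ICANN DOMAINS===',
  'com',
  'io',
  '// ===END ICANN DOMAINS===',
  '// ===BEGIN PRIVATE DOMAINS===',
  '// GitHub, Inc.',
  'github.io',
  '// ===END PRIVATE DOMAINS===',
].join('\n');

const sections = splitSuffixList(sample);
console.log(sections.icann);
console.log(sections.privateDomains);
```

A lookup that only consults the ICANN rules would treat github.io as an ordinary domain under io, which is the behaviour the proposed option aims for.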

Also, as far as I know, this is unfortunately a limitation of all libraries that use the public suffix list.

remusao avatar Feb 28 '18 07:02 remusao

@remusao thank you for taking the time to write a good reply. I appreciate that this is a problem that can't easily be solved, but I'm happy with the workarounds you've suggested.

makecontact avatar Feb 28 '18 12:02 makecontact

+1, also returns null for 1password.com or https://1password.com

mrlubos avatar Jul 16 '18 12:07 mrlubos

@lmenus Thanks for coming up with another breaking case. As far as I can tell, this would be fixed by #128. Hopefully we can move this work forward soon!

remusao avatar Jul 16 '18 19:07 remusao

@remusao Looks good, thank you for your work on this!

mrlubos avatar Jul 17 '18 16:07 mrlubos