Validation
Rules/Tld not working for ccSLD
After the problem with the old TLD array, which has been updated in the meantime, I ran into the next TLD problem: I can't validate country-code SLDs (https://en.wikipedia.org/wiki/Second-level_domain).
For the TLDs I found the list on IANA, but for the SLDs I couldn't find anything similar that could be used to extend the list in Rules/Tld. An overview of the many ccSLDs: https://www.whois365.com/en/listtld/north-america
Maybe the Tld rule isn't meant to do this, but maybe it is. Or we could add a separate Sld rule, and for validation one could combine the two with v::oneOf.
It may be beneficial to use the Public Suffix List, which includes ccSLDs.
On that note, perhaps the TLD validator should be extended to cache the result set in a flat file saved to the PHP temporary directory. The TLD list is updated regularly (usually a few times a week). Although it may be stepping outside the scope of the library, it could be worthwhile as it's the only way to guarantee accuracy without having to maintain the list yourselves.
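The flat-file cache idea above can be sketched roughly as follows. This is only a minimal illustration, not how the library does it: `cachedTldList`, the `$fetch` callable, the cache file name and the one-day lifetime are all assumptions made for the example.

```php
<?php
declare(strict_types=1);

/**
 * Return the TLD list, refreshing a flat-file cache in the PHP temporary
 * directory when the cached copy is missing or older than $ttl seconds.
 *
 * $fetch is a hypothetical callable returning the raw list text; in real
 * use it might download the IANA list or the Public Suffix List.
 */
function cachedTldList(callable $fetch, int $ttl = 86400, ?string $cacheFile = null): array
{
    $cacheFile = $cacheFile ?? sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'tld-list.cache';

    // PHP caches stat() results per request, so clear them before checking age.
    clearstatcache(true, $cacheFile);
    $isFresh = is_file($cacheFile) && (time() - filemtime($cacheFile)) < $ttl;

    if (!$isFresh) {
        // LOCK_EX keeps two processes from writing the file at the same time.
        file_put_contents($cacheFile, $fetch(), LOCK_EX);
    }

    $lines = file($cacheFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) ?: [];

    // Drop comment lines (starting with "#") and normalise to lower case.
    $entries = array_filter($lines, static fn (string $line): bool => $line[0] !== '#');

    return array_values(array_map('strtolower', $entries));
}
```

Passing the fetcher in as a callable keeps the download strategy (HTTP client, timeout, fallback to a bundled copy) out of the caching logic, which also makes the function easy to test without network access.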
For my needs I ended up using "layershifter/tld-extract" (which uses the Public Suffix List) in addition to Rules/Tld. First the TLD is checked with Rules/Tld, then I get the possible ccSLD from tld-extract. (tld-extract is also really helpful for working with subdomains.)
I don't think we can use the Public Suffix List to validate ccSLDs with Rules/Tld. Example: domain.or.at is valid, but domain.org.at is also valid (as a subdomain of org.at). So if there's a typo in the ccSLD, the result is still a valid subdomain.
One option might be that Rules/Domain must not support subdomains. But if we add another rule for subdomains, we end up with the same problem as described above.
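To make the or.at / org.at point concrete, here is a small sketch of longest-suffix matching. The suffix list is a hand-picked sample for illustration, not the real Public Suffix List, and `splitHost` is a hypothetical helper, not part of any library mentioned in this thread.

```php
<?php
declare(strict_types=1);

// Illustrative sample only, NOT the real Public Suffix List.
const SAMPLE_SUFFIXES = ['at', 'or.at', 'co.at'];

/**
 * Split a host name into subdomain, registrable domain and suffix, using the
 * longest matching entry of SAMPLE_SUFFIXES. Returns null when no suffix
 * matches or when the host consists of a suffix only.
 */
function splitHost(string $host): ?array
{
    $labels = explode('.', strtolower($host));
    $count = count($labels);

    // Start at index 1 so the longest candidate is tried first and at least
    // one label is always left over for the registrable name.
    for ($i = 1; $i < $count; $i++) {
        $suffix = implode('.', array_slice($labels, $i));
        if (in_array($suffix, SAMPLE_SUFFIXES, true)) {
            return [
                'subdomain' => implode('.', array_slice($labels, 0, $i - 1)),
                'domain' => $labels[$i - 1] . '.' . $suffix,
                'suffix' => $suffix,
            ];
        }
    }

    return null;
}

// Suffix "or.at", registrable domain "domain.or.at", no subdomain.
print_r(splitHost('domain.or.at'));

// Suffix "at", registrable domain "org.at", subdomain "domain".
print_r(splitHost('domain.org.at'));
```

Both inputs parse successfully, so a mistyped ccSLD simply falls back to the shorter suffix and cannot be told apart from a legitimate subdomain by the list alone.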
That makes a lot of sense, I appreciate the insight.
I agree Rules/Domain shouldn't support subdomains, on the basis that what it validates is actually a hostname, not a domain. That said, we could go a step further with domain validation and abstract it with flags or individual validators, breaking it down by TLD, SLD and hostname.
Relates to https://github.com/Respect/Validation/issues/994
It's still not possible to validate ccSLDs with the tld() validator, because the list from https://data.iana.org/TLD/tlds-alpha-by-domain.txt (see #1071) only contains TLDs.
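A quick sketch of why a lookup against that kind of list cannot succeed for a ccSLD. The sample text only imitates the file's layout (a comment header, then one upper-case TLD per line); `parseTldList` and `isListedTld` are hypothetical helpers for illustration and are not the library's implementation.

```php
<?php
declare(strict_types=1);

// Illustrative excerpt in the style of tlds-alpha-by-domain.txt.
$sample = "# sample header line\nAT\nCOM\nUK\n";

/** Parse the list text into a lookup table keyed by lower-case TLD. */
function parseTldList(string $text): array
{
    $tlds = [];
    foreach (preg_split('/\R/', $text) as $line) {
        $line = trim($line);
        if ($line === '' || $line[0] === '#') {
            continue; // skip blank lines and the comment header
        }
        $tlds[strtolower($line)] = true;
    }

    return $tlds;
}

/** True only when the whole input is a single listed TLD. */
function isListedTld(string $input, array $tlds): bool
{
    return isset($tlds[strtolower($input)]);
}

$tlds = parseTldList($sample);
var_dump(isListedTld('at', $tlds));    // bool(true)
var_dump(isListedTld('or.at', $tlds)); // bool(false)
```

Every entry in such a list is a single label, so any input containing a dot can never match, whatever the country.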
@gpredl I think we should just create a wrapper for php-domain-parser and make it an optional dependency.
You're not going to be able to create an up-to-date list of known ccSLDs; these aren't governed by each respective government. They are under no obligation to publish a list of all supported ccSLDs. Your best bet is to use the library I've linked and validate that the domain is resolvable.
There is the Public Suffix List, which doesn't include all SLDs but at least attempts to list domains "under which Internet users can (or historically could) directly register names."
The PHP libraries reported to use the list are php-domain-parser, as mentioned by @CameronHall, and TLDExtract.
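For the "is it resolvable" part, PHP's built-in `checkdnsrr()` is one option. The wrapper below is a rough sketch under assumptions: `isResolvable` is a hypothetical name, checking only A and AAAA records is a choice made for the example, and the `$lookup` parameter exists purely so the function can be exercised without real DNS queries.

```php
<?php
declare(strict_types=1);

/**
 * Report whether a host name has at least one A or AAAA record.
 *
 * $lookup defaults to PHP's checkdnsrr(); a stub with the same
 * (host, type) signature can be passed in for testing.
 */
function isResolvable(string $host, ?callable $lookup = null): bool
{
    $lookup = $lookup ?? 'checkdnsrr';

    foreach (['A', 'AAAA'] as $type) {
        if ($lookup($host, $type)) {
            return true;
        }
    }

    return false;
}
```

A DNS check depends on the network and on the resolver in use, so it answers "does this name exist right now" rather than "is this a well-formed domain", and it is probably better kept separate from purely syntactic rules.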
There was already a commit to include TLDExtract https://github.com/Respect/Validation/pull/1071
I saw TLDExtract uses a helper class... Perhaps it is TLDDatabase which is required?
This commit adds a PublicDomainSuffix rule that should match public ccSLDs.
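A minimal way to try the new rule next to the existing one might look like this. It assumes Respect/Validation 2.3 is installed through Composer and that the rule is reachable through the usual fluent builder as `v::publicDomainSuffix()`; the candidate strings are just examples taken from this thread.

```php
<?php
declare(strict_types=1);

require __DIR__ . '/vendor/autoload.php';

use Respect\Validation\Validator as v;

// Compare what the TLD rule and the public-suffix rule say about each input.
foreach (['co.uk', 'or.at', 'org.at'] as $candidate) {
    $isTld = v::tld()->validate($candidate);
    $isSuffix = v::publicDomainSuffix()->validate($candidate);

    printf(
        "%-8s tld=%s publicDomainSuffix=%s\n",
        $candidate,
        var_export($isTld, true),
        var_export($isSuffix, true)
    );
}
```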
I hope this is enough to close this almost 5-year-old issue 🐼 It would be great if you could test drive the 2.3 branch and tell me what you think of it.
I'll close this, but feel free to reopen it once again if you find anything strange with the implementation.
Oops, forgot that I can only close this once 2.3 is released 🐼 😊
Version 2.3 was released about a week ago.
This issue should be fixed now! 🐼