TLDExtract
TLDExtract not properly parsing hostname
I'm running some domain names through TLDExtract and came across one that is not parsed properly.
The domain is blogspot.com:
$url = 'blogspot.com';
$domain = tld_extract($url);
var_dump($domain);
Returns:
object(LayerShifter\TLDExtract\Result)[9]
  private 'subdomain' => null
  private 'hostname' => string 'blogspot.com' (length=12)
  private 'suffix' => null
Weirdly, the domain 'flogspot.com' works fine and returns:
object(LayerShifter\TLDExtract\Result)[9]
  private 'subdomain' => null
  private 'hostname' => string 'flogspot' (length=8)
  private 'suffix' => string 'com' (length=3)
The URL logspot.com also works and returns:
object(LayerShifter\TLDExtract\Result)[9]
  private 'subdomain' => null
  private 'hostname' => string 'logspot' (length=7)
  private 'suffix' => string 'com' (length=3)
Any idea why the TLD in 'blogspot.com' is not being returned as the suffix? Is this a bug?
I see that blogspot.com is listed in public_suffix_list.dat. What's going on here? Can't LayerShifter parse any of the domains in that list? Are there any workarounds?
https://github.com/publicsuffix/list/blob/6f2b9e75eaf65bb75da83677655a59110088ebc5/public_suffix_list.dat#L5884
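For context, here is a minimal sketch of longest-suffix matching against a suffix list, which may be related to what I'm seeing. When a name is itself listed as a suffix, there is no label left over to act as the registrable part. The two-entry list and the function name splitHost are hypothetical, made up for illustration; this is not LayerShifter\TLDExtract's actual code, and the library evidently reports this case differently (the whole name ends up in hostname).

```php
<?php
// Hypothetical, hand-picked subset of suffix rules; the real list is far larger.
$suffixes = ['com' => true, 'blogspot.com' => true];

/**
 * Split a host into [label left of the suffix, suffix] by longest-suffix match.
 * Illustrative sketch only; wildcard and exception rules are not handled.
 */
function splitHost(string $host, array $suffixes): array
{
    $labels = explode('.', strtolower($host));

    // Try candidate suffixes from longest to shortest.
    for ($i = 0; $i < count($labels); $i++) {
        $candidate = implode('.', array_slice($labels, $i));
        if (isset($suffixes[$candidate])) {
            // Null when the whole host is itself a listed suffix.
            $name = $i > 0 ? $labels[$i - 1] : null;
            return [$name, $candidate];
        }
    }

    return [null, null]; // no listed suffix matched
}

$a = splitHost('blogspot.com', $suffixes);         // [null, 'blogspot.com']
$b = splitHost('flogspot.com', $suffixes);         // ['flogspot', 'com']
$c = splitHost('example.blogspot.com', $suffixes); // ['example', 'blogspot.com']
```

In this sketch 'blogspot.com' matches the longer rule before 'com' is ever tried, so nothing remains to the left of the suffix, while 'flogspot.com' falls through to the plain 'com' rule.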