PyDomainExtractor IP addresses are parsed incorrectly

Open nsteinberg-r7 opened this issue 4 years ago • 3 comments

How to reproduce: call extract_from_url with http://127.0.0.1 as input. The result is {subdomain: 127.0.0, domain: 1}. Expected behavior: throw an Invalid Domain error.

Sep 16 '20 09:09 nsteinberg-r7

Technically, this is a valid domain. I'm not sure what to do here. Validating the domain here is weird, and ensuring the domain is not an IP is going to be hard. I think we should tolerate such cases.
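
For reference, a check for IP hosts could run before extraction, using only the Python standard library. This is a minimal sketch under that assumption: `host_is_ip` is a hypothetical helper, not part of PyDomainExtractor, and the caller decides whether to raise or skip when it returns True.

```python
import ipaddress
from urllib.parse import urlsplit


def host_is_ip(url: str) -> bool:
    """Return True if the URL's host is an IPv4 or IPv6 literal.

    Hypothetical pre-check; not part of PyDomainExtractor's API.
    """
    # .hostname drops the port, any userinfo, and IPv6 brackets.
    host = urlsplit(url).hostname
    if host is None:
        # No scheme or no network location, so there is no host to test.
        return False
    try:
        ipaddress.ip_address(host)
    except ValueError:
        # Not a valid IPv4/IPv6 literal, so treat it as a name.
        return False
    return True
```

This only recognizes canonical literals. Shorthand forms such as 127.1, which some URL parsers accept as IPv4, are reported as names.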

Sep 22 '20 12:09 wavenator

It does not parse IP addresses.

tldextract can do this for you.

Dec 28 '22 22:12 vihaanmody1

> Technically, this is a valid domain. I'm not sure what to do here. Validating the domain here is weird, and ensuring the domain is not an IP is going to be hard. I think we should tolerate such cases.

According to IETF RFC 3696, top-level domain names cannot be all-numeric. In the case of http://127.0.0.1, 1 is not a TLD, hence 127.0.0.1 cannot be a fully qualified domain name (FQDN).
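
The all-numeric rule can be sketched as a check on the last label of the host. `has_all_numeric_tld` is a hypothetical name used for illustration, not an existing PyDomainExtractor function, and the sketch assumes the host has already been separated from the URL.

```python
def has_all_numeric_tld(hostname: str) -> bool:
    """Return True if the rightmost label of hostname is made only of ASCII digits.

    Hypothetical helper illustrating the RFC 3696 rule; not a library API.
    """
    # A trailing dot marks the DNS root and is not part of the last label.
    labels = hostname.rstrip(".").split(".")
    last = labels[-1]
    # isascii() excludes non-ASCII digit characters that str.isdigit() accepts.
    return last.isascii() and last.isdigit()


print(has_all_numeric_tld("127.0.0.1"))    # True
print(has_all_numeric_tld("example.com"))  # False
```

A host that passes this check could be rejected, or returned without a domain/suffix split, before any public suffix lookup.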

Feb 12 '24 07:02 elliotwutingfeng
