Domain name validation is not correct according to RFC 2181

Open aviv1ron1 opened this issue 6 years ago • 3 comments

psl validates each label against the regular expression /^[a-z0-9-]+$/ and returns 'LABEL_INVALID_CHARS' if it does not match. This is wrong according to RFC 2181, section 11 (Name syntax):

The DNS itself places only one restriction on the particular labels that can be used to identify resource records. That one restriction relates to the length of the label and the full name. ... Implementations of the DNS protocols must not place any restrictions on the labels that can be used.

The LABEL_INVALID_CHARS validation should be removed.
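
For illustration, here is a minimal sketch of a check that enforces only the length limits RFC 2181 section 11 names (1 to 63 octets per label, 255 octets for the full name) and leaves characters unrestricted. `validateDnsName` is a hypothetical helper, not part of psl, and the returned code strings are illustrative. It assumes labels are measured as UTF-8 bytes and that the full-name limit is counted in wire format.

```javascript
// Hypothetical helper (not part of psl): checks only the length limits
// from RFC 2181 section 11 and places no restriction on characters.
function validateDnsName(name) {
  // A single trailing dot denotes the root and is not a label.
  const trimmed = name.endsWith('.') ? name.slice(0, -1) : name;
  const labels = trimmed.split('.');
  const encoder = new TextEncoder();

  // Assumed wire-format count: one length octet per label,
  // plus the zero octet that terminates the name.
  let total = 1;
  for (const label of labels) {
    const octets = encoder.encode(label).length;
    if (octets < 1) return { valid: false, code: 'LABEL_TOO_SHORT' };
    if (octets > 63) return { valid: false, code: 'LABEL_TOO_LONG' };
    total += octets + 1;
  }
  if (total > 255) return { valid: false, code: 'DOMAIN_TOO_LONG' };
  return { valid: true, code: null };
}

console.log(validateDnsName('_dmarc.example.com'));
```

Under these assumptions a name such as `_dmarc.example.com` passes, while an empty label (as in `a..b`) or a label longer than 63 octets is still rejected.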

aviv1ron1 avatar Jun 04 '18 08:06 aviv1ron1

"_" should be allowed
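
To make the underscore case concrete, this small sketch runs the pattern quoted in the issue against a few names. It only uses the regular expression as quoted above; psl's actual implementation may differ, and the sample names are illustrative.

```javascript
// The pattern quoted in the issue; psl's actual implementation may differ.
const labelPattern = /^[a-z0-9-]+$/;

// Underscore-prefixed labels are common in DNS, e.g. for DMARC and SRV records.
const names = ['_dmarc.example.com', '_sip._tcp.example.com', 'www.example.com'];

// Keep every name that has at least one label the pattern does not match.
const rejected = names.filter((name) =>
  name.split('.').some((label) => !labelPattern.test(label))
);

console.log(rejected); // [ '_dmarc.example.com', '_sip._tcp.example.com' ]
```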

DaveRingelnatz avatar Dec 13 '18 09:12 DaveRingelnatz

Yup, this thing is not working properly :T

code: "LABEL_INVALID_CHARS"
message: "Domain name label can only contain alphanumeric characters or dashes."
input: "https://theintercept.com/2019/06/02/samuelpinheiroentrevista/"

🤷🏼‍♂️

woops my bad, didn't realize I have to remove the protocol

ploissken avatar Jun 06 '19 19:06 ploissken

Hi @aviv1ron1, many thanks for reporting this and apologies for the delay in getting back to you... :see_no_evil:

I think you are right with regard to the formal definition of domain name labels as described in the RFC. I think it is my bad that I based the implementation on the description of hostnames on Wikipedia... :cold_sweat:

Hostnames impose restrictions on the characters allowed in the corresponding domain name. A valid hostname is also a valid domain name, but a valid domain name may not necessarily be valid as a hostname. Source: https://en.wikipedia.org/wiki/Domain_name

I would like to correct this, but at the same time I don't want to introduce breaking changes in the short term. I will soon start working on a rewrite of this module (v2, with breaking changes), and will probably remove the regex validation when validating domains and offer a different, more explicit mechanism for validating hostnames. I will update this issue as soon as I start work on v2.
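
As a rough idea of what a separate, explicit hostname check could look like, here is a sketch of the stricter hostname rule (letters, digits and hyphens only, no leading or trailing hyphen). `isValidHostname` is a hypothetical name and this is not the v2 API, which does not exist yet; it is only one way the split between domain names and hostnames might be expressed.

```javascript
// Hypothetical helper, not psl's v2 API: the stricter hostname rule
// (letters, digits and hyphens; no leading or trailing hyphen; 1 to 63 characters).
const HOSTNAME_LABEL = /^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$/i;

function isValidHostname(name) {
  // The text form of a name is limited to 253 characters.
  if (name.length === 0 || name.length > 253) return false;
  // Every dot-separated label must satisfy the hostname label rule.
  // Note: a trailing dot produces an empty last label and is rejected here.
  return name.split('.').every((label) => HOSTNAME_LABEL.test(label));
}

console.log(isValidHostname('theintercept.com'));   // true
console.log(isValidHostname('_dmarc.example.com')); // false
```

With a split like this, `_dmarc.example.com` could be accepted as a domain name while still being reported as an invalid hostname.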


@DaveRingelnatz: considering @aviv1ron1's point, I also think you are right. Please stay tuned for v2, which will hopefully solve these issues.


@ploissken: Please note that you are expected to pass a domain name (i.e. theintercept.com) and not a full URL (like https://theintercept.com/2019/06/02/samuelpinheiroentrevista/).
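
If the input starts out as a full URL, one way to get the domain name is the standard WHATWG `URL` API, which strips the protocol and path. A minimal sketch, assuming an environment where `URL` is a global (browsers and current Node.js releases):

```javascript
// Extract the hostname from a full URL using the WHATWG URL API.
const input = 'https://theintercept.com/2019/06/02/samuelpinheiroentrevista/';
const hostname = new URL(input).hostname;

console.log(hostname); // theintercept.com
```

The resulting string is what would then be passed to psl, for example to `psl.parse` or `psl.isValid`.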

lupomontero avatar Jun 20 '19 00:06 lupomontero