valibot
Feat: add domain() validation action (ASCII domains)
Add domain() validation action (ASCII domains) + docs
Closes https://github.com/fabian-hiller/valibot/issues/1277
Summary
- Introduces `domain()` validation action for ASCII domain names
- Adds API docs (action + types)
Rationale
/^([a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,}$/;
UPD: Zod regex with small updates https://github.com/fabian-hiller/valibot/pull/1284#issuecomment-3218342138
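To make the behaviour of this pattern concrete, here is a minimal sketch that runs it against a few inputs. The names `DOMAIN_REGEX` and `isDomain` and the sample values are illustrative assumptions, not identifiers from the PR's implementation.

```typescript
// Pattern copied from the Rationale section; all names here are illustrative only.
const DOMAIN_REGEX =
  /^([a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,}$/;

// Hypothetical helper: true when the whole input matches the pattern.
function isDomain(input: string): boolean {
  return DOMAIN_REGEX.test(input);
}

const samples: string[] = [
  'example.com', // accepted: one label plus a letters-only TLD
  'sub.example.co.uk', // accepted: several labels
  'a--b.com', // accepted: hyphens are fine inside a label
  'localhost', // rejected: no dot, so no TLD
  '-bad.com', // rejected: label starts with a hyphen
  'bad-.com', // rejected: label ends with a hyphen
  'example.c', // rejected: TLD shorter than two letters
  'example.123', // rejected: TLD must be letters only
];

for (const sample of samples) {
  console.log(sample, isDomain(sample));
}
```

The checks are purely syntactic: nothing here consults the IANA TLD list or resolves the name.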
Decisions
- `localhost` and bare intranet-like hostnames are considered invalid for `domain()` (by design)
- Regex focuses on label rules and TLD; we do not enforce total domain length (max 253 chars)
Limitations
- Total domain length is not checked
- Unicode or IDN/Punycode is not supported (ASCII TLD required)
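Both limitations can be observed directly. The sketch below reuses the pattern from the Rationale section; the constant name `ASCII_DOMAIN_REGEX` and the sample inputs are assumptions made for illustration.

```typescript
// Same pattern as in the Rationale section; the constant name is illustrative.
const ASCII_DOMAIN_REGEX =
  /^([a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,}$/;

// Four 63-character labels plus "com": 4 * 63 + 4 dots + 3 = 259 characters.
const label = 'a'.repeat(63);
const tooLong = `${label}.${label}.${label}.${label}.com`;

// Limitation 1: the total length (max 253) is not checked, so this passes.
const acceptsTooLong = ASCII_DOMAIN_REGEX.test(tooLong);

// Limitation 2: a Punycode TLD contains digits and hyphens, so it is rejected.
const acceptsPunycodeTld = ASCII_DOMAIN_REGEX.test('example.xn--p1ai');

// Raw Unicode is rejected as well, because only ASCII characters are allowed.
const acceptsUnicode = ASCII_DOMAIN_REGEX.test('münchen.de');

// A Punycode-style label in front of an ASCII TLD still satisfies the label rules.
const acceptsPunycodeLabel = ASCII_DOMAIN_REGEX.test('xn--mnchen-3ya.de');

console.log({
  length: tooLong.length,
  acceptsTooLong,
  acceptsPunycodeTld,
  acceptsUnicode,
  acceptsPunycodeLabel,
});
```

Note that only the TLD position is strict about Punycode: an `xn--` label elsewhere in the name looks like any other label to this pattern.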
Discuss
We can switch to a regex that supports Punycode TLD (and total length) if desired
/^(?=.{1,253}$)(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+(?:[a-z]{2,}|xn--[a-z0-9-]{2,})$/i
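As a rough check of how this alternative differs, the following sketch runs it against a Punycode TLD and an over-long name. The constant name `PUNYCODE_DOMAIN_REGEX` and the inputs are hypothetical and chosen only for illustration.

```typescript
// Alternative pattern from the Discuss section; the constant name is illustrative.
const PUNYCODE_DOMAIN_REGEX =
  /^(?=.{1,253}$)(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+(?:[a-z]{2,}|xn--[a-z0-9-]{2,})$/i;

// 4 * 63 + 4 dots + 3 = 259 characters, above the 253 limit.
const longLabel = 'a'.repeat(63);
const overLimit = `${longLabel}.${longLabel}.${longLabel}.${longLabel}.com`;

const results = {
  // The second TLD alternative accepts the "xn--" prefix.
  punycodeTld: PUNYCODE_DOMAIN_REGEX.test('example.xn--p1ai'),
  // The "i" flag makes the match case-insensitive.
  upperCase: PUNYCODE_DOMAIN_REGEX.test('EXAMPLE.COM'),
  // The lookahead rejects anything longer than 253 characters.
  overLimit: PUNYCODE_DOMAIN_REGEX.test(overLimit),
  // A TLD with digits but without the "xn--" prefix is still rejected.
  digitTld: PUNYCODE_DOMAIN_REGEX.test('example.x1'),
};

console.log(results);
```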
The advantage of this solution is that we can add toPunycode() and use it before domain() for unicode domains.
If we go with a solution that only checks simple cases, we will face the need to add a separate rfcDomain action, as it happens now with email + rfcEmail.
What do you think about this?
Thank you for this PR! Do "normal" domains only support a single - but Punycode TLDs require two? Are there any other differences?
> Do "normal" domains only support a single `-` but Punycode TLDs require two? Are there any other differences?
- Top-level domains can contain numbers for Punycode, but not for regular domains;
- Punycode labels and TLDs start with `xn--`;
- Normal domains can contain a double `-` within a label. The main thing is that it is not at the beginning or end of the label.
- Regular TLDs do not contain `-`.
The difficulty is that the specification does not have regular expressions. And in the literature, for example, https://www.oreilly.com/library/view/regular-expressions-cookbook/9781449327453/ch08s15.html, fairly primitive regular expressions are presented.
Determining the correct domain is more about procedures, not about patterns. We will not be able to achieve perfect validation, but we can somehow get closer to it.
And just for information, the source of truth for all TLDs is https://data.iana.org/TLD/tlds-alpha-by-domain.txt
Would your recommendation be to introduce domain for "normal" domains and add rfcDomain later on as needed like we did with email and rfcEmail?
> Would your recommendation be to introduce `domain` for "normal" domains and add `rfcDomain` later on as needed like we did with `rfcEmail`?
I think we can go ahead with only ASCII checking, which we call "regular" or "normal" domain. Then we can add rfcDomain if someone needs it.
But before that, I think we should make the current zod regular expression a bit stricter:
- Add the `i` flag, which will make the regular expression shorter: `a-zA-Z` => `a-z`
- Add a limit on the TLD length: `[a-z]{2,}$` => `[a-z]{2,63}$`, which is standard
- Limit the total string length: `(?=.{1,253}$)`, which is standard
Seems pretty safe.
The final result:
/^(?=.{1,253}$)([a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63}$/i;
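A small sketch of how the three changes show up at the boundaries. The constant name `FINAL_DOMAIN_REGEX` and the sample inputs are assumptions for illustration; the tests actually added to the PR may differ.

```typescript
// Final pattern from the comment above; the constant name is illustrative.
const FINAL_DOMAIN_REGEX =
  /^(?=.{1,253}$)([a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63}$/i;

const l63 = 'a'.repeat(63);

const checks = {
  // "i" flag: upper-case input is accepted.
  upperCase: FINAL_DOMAIN_REGEX.test('Example.COM'),
  // TLD limit: 63 letters pass, 64 do not.
  tld63: FINAL_DOMAIN_REGEX.test(`example.${'a'.repeat(63)}`),
  tld64: FINAL_DOMAIN_REGEX.test(`example.${'a'.repeat(64)}`),
  // Total length: 3 * 63 + 3 dots + 61 = 253 passes, one more character fails.
  total253: FINAL_DOMAIN_REGEX.test(`${l63}.${l63}.${l63}.${'a'.repeat(61)}`),
  total254: FINAL_DOMAIN_REGEX.test(`${l63}.${l63}.${l63}.${'a'.repeat(62)}`),
  // Punycode TLDs remain out of scope for this ASCII-only pattern.
  punycodeTld: FINAL_DOMAIN_REGEX.test('example.xn--p1ai'),
};

console.log(checks);
```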
After that, I will add new tests and the documentation description.
UPD: Done