valid-url
is_iri should allow UTF-8 characters
Currently is_iri rejects any input containing a character outside the set allowed by this regex:
https://github.com/ogt/valid-url/blob/8d1fc52b21ceab99b68f415838035859b7237949/index.js#L28
Example:
t.ok(is_uri('http://localhost/ä'), 'http://localhost/ä');
As far as I understand, the main difference between a URI and an IRI is:
IRIs extend URIs by using the Universal Character Set, where URIs were limited to ASCII, with far fewer characters.
https://en.wikipedia.org/wiki/Internationalized_Resource_Identifier
Therefore is_iri should allow all UTF-8 characters, and these exports aren't correct:
module.exports.is_uri = is_iri;
module.exports.isUri = is_iri;
Instead, there should be two separate exports: is_iri and is_uri.
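To illustrate the split, here is a minimal sketch of two distinct checks. The names isUriSketch, isIriSketch, and hasScheme, as well as both regexes, are hypothetical and not taken from valid-url. The ASCII set follows the characters RFC 3986 permits, and the IRI variant is a simplification that accepts every code unit from U+00A0 upward, which is broader than what RFC 3987 actually allows.

```javascript
'use strict';

// Matches any character NOT permitted in a URI by RFC 3986
// (unreserved characters, reserved characters, and "%").
const URI_ILLEGAL = /[^a-z0-9:\/?#[\]@!$&'()*+,;=._~%-]/i;

// Assumption: a simplified IRI check that additionally permits everything
// from U+00A0 upward. RFC 3987 excludes some of these ranges.
const IRI_ILLEGAL = /[^a-z0-9:\/?#[\]@!$&'()*+,;=._~%\u00A0-\uFFFF-]/i;

// Hypothetical helper: the value must start with a scheme such as "http:".
function hasScheme(value) {
  return /^[a-z][a-z0-9+.-]*:/i.test(value);
}

// ASCII-only check: non-ASCII characters must already be percent-encoded.
function isUriSketch(value) {
  return typeof value === 'string' && hasScheme(value) && !URI_ILLEGAL.test(value);
}

// Same check, but non-ASCII characters are accepted as-is.
function isIriSketch(value) {
  return typeof value === 'string' && hasScheme(value) && !IRI_ILLEGAL.test(value);
}

console.log(isUriSketch('http://localhost/\u00E4')); // false ("ä" is not ASCII)
console.log(isIriSketch('http://localhost/\u00E4')); // true
console.log(isUriSketch('http://localhost/%C3%A4')); // true
```

With two functions like these, is_uri and isUri could point at the strict check while is_iri and isIri point at the permissive one. This only covers the character-set question and leaves out the structural validation a real implementation would need.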
I saw there hasn't been any commit to this repo since 2015, but since this affects a Gatsby issue, I was wondering what your plans for the module are. Would you consider handing it over, @ogt?
The module is used heavily according to npm's download numbers, so it might be in the interest of the community to give it some 💌.
Hi @gustavpursche,
I was facing the same issues with valid-url, and also with validator, so I decided to build a module that is as reliable as possible and strictly based on RFC 3986: https://github.com/adrienv1520/node-uri
The main features of this project are:
- parse any URI (URNs, URLs, URIs with IDN support, etc.);
- get the safe Punycode ASCII or Unicode serialization of a domain;
- check that a URI, HTTP/HTTPS/Sitemap URL, IP, or domain is valid, with clear checking errors;
- encode/decode a URI or HTTP/HTTPS/Sitemap URL.
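For readers unfamiliar with the terms in this list, the following sketch shows what Punycode serialization and URI encoding look like. It deliberately uses only Node's built-in url module and the global encodeURI/decodeURI functions, not node-uri itself, whose API is not shown in this thread.

```javascript
const { domainToASCII, domainToUnicode } = require('url');

// Punycode (ASCII) and Unicode serializations of an internationalized domain.
const ascii = domainToASCII('español.com');
const unicode = domainToUnicode('xn--espaol-zwa.com');

console.log(ascii);   // xn--espaol-zwa.com
console.log(unicode); // español.com

// Percent-encoding turns an IRI-style string into an ASCII-only URI.
// '\u00E4' is the character "ä".
const encoded = encodeURI('http://localhost/\u00E4');
const decoded = decodeURI(encoded);

console.log(encoded); // http://localhost/%C3%A4
console.log(decoded === 'http://localhost/\u00E4'); // true
```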
I hope it helps.