
Create a standard email field verification Regular Expression (or find and verify one)

Open coreyshuman opened this issue 6 years ago • 15 comments

https://en.wikipedia.org/wiki/Email_address

There are some crazy email addresses allowed in RFC 5321 and RFC 5322. Here is the above article's set of rules, along with examples of valid and invalid addresses.

Local-part

The local-part of the email address may use any of these ASCII characters:

  • uppercase and lowercase Latin letters A to Z and a to z;

  • digits 0 to 9;

  • special characters !#$%&'*+-/=?^_`{|}~;

  • dot ., provided that it is not the first or last character unless quoted, and provided also that it does not appear consecutively unless quoted (e.g. [email protected] is not allowed but "John..Doe"@example.com is allowed);

Note that some mail servers wildcard local parts, typically the characters following a plus and less often the characters following a minus, so fred+bah@domain and fred+foo@domain might end up in the same inbox as fred+@domain or even as fred@domain. This can be useful for tagging emails for sorting, see below, and for spam control. Braces { and } are also used in that fashion, although less often.

  • space and "(),:;<>@[\] characters are allowed with restrictions (they are only allowed inside a quoted string, as described in the paragraph below, and in addition, a backslash or double-quote must be preceded by a backslash);
  • comments are allowed with parentheses at either end of the local-part; e.g. john.smith(comment)@example.com and (comment)[email protected] are both equivalent to [email protected].

In addition to the above ASCII characters, international characters above U+007F, encoded as UTF-8, are permitted by RFC 6531, though even mail systems that support SMTPUTF8 and 8BITMIME may restrict which characters to use when assigning local-parts.
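As a rough illustration of the local-part rules above, here is a minimal Python sketch. The names `ATEXT`, `DOT_ATOM`, `QUOTED`, and `is_valid_local_part` are made up for this example. It assumes a local-part is either entirely unquoted or a single quoted string, so it does not handle comments, mixed quoted and unquoted segments, or the RFC 6531 international characters. The 64-character limit is taken from the invalid examples further down.

```python
import re

# Characters allowed in an unquoted local-part (letters, digits, and the
# special characters listed above). The hyphen is escaped for the class.
ATEXT = r"A-Za-z0-9!#$%&'*+\-/=?^_`{|}~"

# One or more runs of allowed characters separated by single dots, which
# rules out a leading dot, a trailing dot, and consecutive dots.
DOT_ATOM = re.compile(rf"[{ATEXT}]+(?:\.[{ATEXT}]+)*")

# A quoted string: printable ASCII other than double-quote and backslash,
# or a backslash followed by any printable ASCII character.
QUOTED = re.compile(r'"(?:[\x20\x21\x23-\x5b\x5d-\x7e]|\\[\x20-\x7e])*"')


def is_valid_local_part(local: str) -> bool:
    """Return True if `local` passes this simplified local-part check."""
    if not 1 <= len(local) <= 64:
        return False
    return bool(DOT_ATOM.fullmatch(local) or QUOTED.fullmatch(local))
```

With this sketch, `John..Doe` is rejected while `"John..Doe"` is accepted, matching the dot rule above.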

Domain

The domain name part of an email address has to conform to strict guidelines: it must match the requirements for a hostname, a list of dot-separated DNS labels, each label being limited to a length of 63 characters and consisting of:

  • uppercase and lowercase Latin letters A to Z and a to z;
  • digits 0 to 9, provided that top-level domain names are not all-numeric;
  • hyphen -, provided that it is not the first or last character.

This rule is known as the LDH rule (letters, digits, hyphen). In addition, the domain may be an IP address literal, surrounded by square brackets [], such as jsmith@[192.168.2.1] or jsmith@[IPv6:2001:db8::1], although this is rarely seen except in email spam. Internationalized domain names (which are encoded to comply with the requirements for a hostname) allow for presentation of non-ASCII domains. In mail systems compliant with RFC 6531 and RFC 6532, an email address may be encoded as UTF-8, in both the local-part and the domain name.
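The domain rules above could be checked along these lines. This is only a sketch: `LABEL` and `is_valid_domain` are hypothetical names, IP literals are delegated to Python's standard `ipaddress` module (which may be more or less lenient than the RFCs in edge cases), and internationalized domain names and comments are not handled.

```python
import ipaddress
import re

# One DNS label under the LDH rule: letters, digits, and hyphens,
# 1 to 63 characters, with no hyphen at either end.
LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_valid_domain(domain: str) -> bool:
    """Return True if `domain` is an LDH hostname or a bracketed IP literal."""
    if domain.startswith("[") and domain.endswith("]"):
        literal = domain[1:-1]
        try:
            if literal.startswith("IPv6:"):
                ipaddress.IPv6Address(literal[len("IPv6:"):])
            else:
                ipaddress.IPv4Address(literal)
        except ValueError:
            return False
        return True

    labels = domain.split(".")
    if not all(LABEL.fullmatch(label) for label in labels):
        return False
    # The top-level label must not be all-numeric.
    return not labels[-1].isdigit()
```

Splitting on dots means an empty label, as in `example..com`, fails the per-label check without any special casing.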

Comments are allowed in the domain as well as in the local-part; for example, john.smith@(comment)example.com and [email protected](comment) are equivalent to [email protected].

Examples

Valid email addresses

Invalid email addresses

  • Abc.example.com (no @ character)
  • A@b@[email protected] (only one @ is allowed outside quotation marks)
  • a"b(c)d,e:f;gi[j\k][email protected] (none of the special characters in this local-part are allowed outside quotation marks)
  • just"not"[email protected] (quoted strings must be dot separated or the only element making up the local-part)
  • this is"not\[email protected] (spaces, quotes, and backslashes may only exist when within quoted strings and preceded by a backslash)
  • this\ still"not\[email protected] (even if escaped (preceded by a backslash), spaces, quotes, and backslashes must still be contained by quotes)
  • 1234567890123456789012345678901234567890123456789012345678901234+x@example.com (local part is longer than 64 characters)
  • [email protected] (double dot before @)
  • [email protected] (double dot after @)
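For a form field, one possible starting point is a deliberately strict check that accepts only the common unquoted subset of the rules above. The sketch below is an assumption about what such a "standard" check might look like, not an RFC-complete validator: `is_plausible_email` and the pattern names are invented for illustration, and it intentionally rejects quoted local-parts, comments, IP address literals, and non-ASCII addresses even though the rules allow them.

```python
import re

# Unquoted local-part characters, with the hyphen escaped for the class.
ATEXT = r"A-Za-z0-9!#$%&'*+\-/=?^_`{|}~"
LOCAL = re.compile(rf"[{ATEXT}]+(?:\.[{ATEXT}]+)*")

# Dot-separated LDH labels of 1 to 63 characters each.
LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?"
DOMAIN = re.compile(rf"{LABEL}(?:\.{LABEL})*")


def is_plausible_email(address: str) -> bool:
    """Strict-subset check: unquoted local-part @ LDH hostname."""
    # Split on the last @; any earlier @ ends up in the local-part,
    # where it is rejected because @ is not an allowed character.
    local, at, domain = address.rpartition("@")
    if not at:
        return False  # no @ character at all
    if len(local) > 64:
        return False  # local-part longer than 64 characters
    return bool(LOCAL.fullmatch(local) and DOMAIN.fullmatch(domain))
```

Against the list above, this rejects `Abc.example.com` (no @) and the 66-character local-part example, and it also rejects addresses with a doubled dot on either side of the @.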

coreyshuman, Jul 25 '18 22:07