python-string-utils

Email validation: constraints on domain labels and Unicode input are not handled

Open · devikasondhi opened this issue 5 years ago · 1 comment

Hello,

Here are some scenarios where is_email fails:

  1. Domains such as localhost or an IP address literal are not accepted: email@localhost and email@[127.0.0.1] are valid, but is_email returns False for both.
  2. Unicode is not handled. This address should be valid, but is_email returns False: [email protected].उदाहरण.परीक्षा
  3. Domain labels cannot begin or end with a hyphen ('-'). These addresses should be invalid, but is_email returns True: [email protected] and [email protected]
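
To make the three cases concrete, here is a minimal sketch of a domain check that would cover them. It is not the library's implementation: `is_valid_domain` is a hypothetical helper, the sample domains are illustrative, only IPv4 address literals are handled, and Unicode labels are converted with Python's built-in `idna` codec (IDNA 2003).

```python
import ipaddress
import re

# One DNS label: letters, digits and hyphens, 1 to 63 characters,
# with no hyphen in the first or last position.
_LABEL_RE = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_valid_domain(domain: str) -> bool:
    """Hypothetical helper: validate the part of an address after the '@'."""
    # Case 1: an address literal such as [127.0.0.1] (IPv4 only in this sketch).
    if domain.startswith("[") and domain.endswith("]"):
        try:
            ipaddress.IPv4Address(domain[1:-1])
            return True
        except ValueError:
            return False
    # Case 2: convert Unicode labels to their ASCII (punycode) form.
    try:
        ascii_domain = domain.encode("idna").decode("ascii")
    except UnicodeError:
        return False
    # Cases 1 and 3: every label must be well formed; a single label
    # such as "localhost" is accepted, and hyphens at the edges are rejected.
    return all(_LABEL_RE.fullmatch(label) for label in ascii_domain.split("."))


print(is_valid_domain("localhost"))
print(is_valid_domain("[127.0.0.1]"))
print(is_valid_domain("उदाहरण.परीक्षा"))
print(is_valid_domain("-example.com"))
print(is_valid_domain("example-.com"))
```

Converting to ASCII first means the same label rule applies to both plain and internationalized domains.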

devikasondhi · Apr 12 '20 11:04

Also, the local part can contain ASCII characters such as !, ' and / (see https://en.wikipedia.org/wiki/Email_address). is_email does not handle this well for the input joe!/[email protected]. Further, is_email returns False for the input [email protected]. There seems to be a limit on the length of the last domain label: anything longer than 4 characters is rejected.
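
As a rough illustration of both points, the sketch below checks an unquoted local part against the `atext` character set from RFC 5322 and accepts a last domain label of up to 63 characters. The helper names `is_valid_local_part` and `is_plausible_tld` are hypothetical and the sample inputs are illustrative; quoted local parts are out of scope.

```python
import re

# "atext" characters allowed in an unquoted local part (RFC 5322, section 3.2.3).
_ATEXT = r"A-Za-z0-9!#$%&'*+/=?^_`{|}~-"
# Dot-atom form: runs of atext separated by single dots.
_LOCAL_RE = re.compile(rf"[{_ATEXT}]+(?:\.[{_ATEXT}]+)*")
# Last domain label: starts with a letter, no trailing hyphen, at most 63 characters.
_TLD_RE = re.compile(r"[A-Za-z](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_valid_local_part(local: str) -> bool:
    """Hypothetical helper: validate the part of an address before the '@'."""
    # RFC 5321 limits the local part to 64 octets.
    return len(local) <= 64 and _LOCAL_RE.fullmatch(local) is not None


def is_plausible_tld(label: str) -> bool:
    """Hypothetical helper: no 4-character cap, only the 63-character DNS limit."""
    return _TLD_RE.fullmatch(label) is not None


print(is_valid_local_part("joe!/smith"))
print(is_valid_local_part("joe..smith"))
print(is_plausible_tld("museum"))
print(is_plausible_tld("photography"))
```

The length limit here comes from the DNS label rules rather than from any list of known top-level domains, so long labels are not rejected by default.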

devikasondhi · Apr 12 '20 12:04