pydantic-core icon indicating copy to clipboard operation
pydantic-core copied to clipboard

`EmailValidator` type

Open samuelcolvin opened this issue 2 years ago • 13 comments

Using mail_parser I guess.

Rationale:

  • remove the email-validator optional dependency from pydantic, and its descendant dependencies - dnspython and idna
  • should be faster
  • the NameEmail "My Name <[email protected]>" parsing can be improved, see https://github.com/pydantic/pydantic/issues/3173 - rust is much better for this kind of parsing than python (actually looks like mail_parser supports this out of the box)

We just need to make sure that adding mail_parser dependency doesn't significantly increase binary size - any one working on this. I encourage you to do a very simple POC implementation, create a PR and ask us to add the "Full Build" lable to see how much binary size has increased before completing the PR, in case we have to abandon the idea due to this.

To be clear we need two validators EmailValidator and NameEmailValidator which will be represented by EmailStr and NameEmail in pydantic.

samuelcolvin avatar Feb 06 '23 10:02 samuelcolvin

What do you think of using or having something like: https://github.com/Keats/validator for pydantic? It includes email and credit card and some others

woile avatar Feb 06 '23 10:02 woile

@samuelcolvin I think we are trying to solve two issues into one.

I think it is better to do it as a combination of

validation = validate(quotedform)
quotedform = validate(quotedpart) + SPACE + validate(email)

Like the Email Validator validates an email. The quoted form is not an email.

We can add another QuotedEmailValidator

Abdur-rahmaanJ avatar Feb 06 '23 11:02 Abdur-rahmaanJ

What do you think of using or having something like: https://github.com/Keats/validator for pydantic? It includes email and credit card and some others

Not really required, we have our own validator system which is designed to be used from python code - remember we don't know the exact schema for validation until runtime.


@Abdur-rahmaanJ I agree that NameEmailValidator should use the same logic internally to validate the actual email address.

samuelcolvin avatar Feb 06 '23 16:02 samuelcolvin

I'll be trying to create something basic, will update

alonme avatar Feb 11 '23 20:02 alonme

I took a look at smtp-proto - and it seems that we may use it for addr-spec, But AFAICT, the whole validation code is around 100 LOC, and it requires a \n at the end of the string.

So if this is all the code that is needed (I expected it to be more complex) it may be worth implementing this on our own

[package]
name = "mail-testing"
version = "0.1.0"
edition = "2021"

[dependencies]
smtp-proto = "0.1.1"
use smtp_proto::request::parser::Rfc5321Parser;

fn main() {
    let mut input = "[email protected]\n".as_bytes().into_iter();

    let mut parser = Rfc5321Parser::new(&mut input);
    println!("{:?}", parser.address());
}

alonme avatar Feb 16 '23 22:02 alonme

If it's only 100 LOC I would be in favor of just vendoring it, which I take to be your suggestion. If it appears to be thorough/correct, I don't expect it will be much of a maintenance burden on ourselves, it's not like the spec is changing. I imagine the small amount of effort it would take to integrate a fix if a bug was found would probably be worth the elimination of the dependency. Open to disagreement.

dmontagu avatar Feb 17 '23 17:02 dmontagu

Just getting stuck into rust, curious why something like https://docs.rs/email_address/latest/email_address/ wouldnt suit?

ghandic avatar Feb 27 '23 06:02 ghandic

I think it looks great. Not many stars, but that doesn't matter much necessarily.

The first two things to check are:

  1. Good much do the binaries increase in size.
  2. How many of email-validator's (the python package) tests pass with this library

If it's good on those two, it's golden.

samuelcolvin avatar Feb 27 '23 09:02 samuelcolvin

Moving into here as more relevant for our discussion...

use smtp_proto::request::parser::Rfc5321Parser;

fn main() {
    let email: &str = "Art Vandelay <[email protected]>\n";
    let email2: &str = "[email protected]\n";

    let addr = Rfc5321Parser::new(&mut email.as_bytes().iter()).address();
    let addr2 = Rfc5321Parser::new(&mut email2.as_bytes().iter()).address();

    println!("{:?}", addr);
    println!("{:?}", addr2);
}

Outputs

Ok(None)
Ok(Some("[email protected]"))

Looks like we might need a blended approach as neither lib has exactly what we want, assuming we want to:

  • validate the email
  • Extract sections from the email
    • Display Name: Art Vandelay
    • email: [email protected]
    • local_part: art
    • domain: vandelay.com

ghandic avatar Feb 28 '23 06:02 ghandic

Added a PR to the email_address repo, see if its accepted, then we could have the above working https://github.com/johnstonskj/rust-email_address/pull/15

ghandic avatar Feb 28 '23 08:02 ghandic

Using the above...

use email_address::EmailAddress;

fn main() {
    let email = "[email protected]";
    let addr = EmailAddress::from_str(email).unwrap();
    println!("Existing example");
    println!("-----------------");
    println!("{}", addr);
    println!("");

    let email2 = "Art Vandelay <[email protected]>";
    let addr2 = EmailAddress::from_str(email2).unwrap();
    println!("New example");
    println!("-----------------");
    println!("{}", addr2);
    println!("{}", addr2.email());
    println!("{} <{}@{}>", addr2.display_part(), addr2.local_part(), addr2.domain());
}

Outputs

Existing example
-----------------
[email protected]

New example
-----------------
Art Vandelay <[email protected]>
[email protected]

ghandic avatar Feb 28 '23 08:02 ghandic

Per https://github.com/pydantic/pydantic-core/pull/413#issuecomment-1450065258 it looks like you may be sticking with the Python library for email validation?

zanieb avatar Apr 23 '23 19:04 zanieb

Indeed, given the above, seems like we'll probably stick with email-validator. I'll leave this open as a potential performance enhancement, though it's certainly not high priority

sydney-runkle avatar Aug 17 '24 01:08 sydney-runkle