python-email-validator
python-email-validator copied to clipboard
DNS resolver can run for longer than the given timeout
It's very hard to reproduce, we found the problem because we have a 60s timeout for our web server that we kept busting, even with email-validator's default 15s timeout.
Disabling check_availability solved our problem.
The problem
The problem lies within:
response = dns_resolver.resolve(domain, "MX")
( https://github.com/JoshData/python-email-validator/blob/main/email_validator/deliverability.py#L40 )
While debugging, what I can see is that the resolver goes through a loop, and only checks the timeout between calls. It also does some time.sleep()
That means that if a call or a sleep is longer than the timeout, it doesn't get interrupted and can thus run for longer
The solution
Using signal, we could interrupt the process. See https://stackoverflow.com/a/494273
This package does it: https://github.com/pnpnpn/timeout-decorator
I'll try implementing the stackoverflow suggestion on my end and write back with news
This sounds like it would be a suggestion better made for the dnspython library rather than here but I would be curious to hear what you find out.
Yeah agreed.
Here's the issue in dnspython: https://github.com/rthalley/dnspython/issues/913
I'm testing the signal solution today. Some caveats: it only works on single-thread programs, and could interfere with other code also using signal.SIGALRM
I closed the issue in dnspython. The problem seems to be that in deliverability.validate_email_deliverability, Resolver.resolve is called potentially 4 times with the same timeout (due to except dns.resolver.NoAnswer).
The solutions I see on your side could be:
- Divide timeout in 4 (so each of the 4 potential
resolvecall gets 1/4 of the total timeout) - Measure time left after every
resolvecall, and allocate the remaining time (if any) to the following call a. e.g. first call takes 5s, next call gets a timeout of 15-5 = 10s
For my solution using signal to work, I'll have to raise a BaseException because dnspython is catching Exception
Or a hacky fix would be to give email_validator a timeout of 1/4 of what I actually want
Ah sorry for suggesting you head toward dnspython.
Since the except-block for dns.exception.Timeout is around all of the DNS queries, the first query to time out should end the deliverability checks. So to get a total time that's four times the timeout, each query would have to complete in just under the timeout without timing out.
Your option 2 would be OK for me, but it might be a little tricky to implement because the resolver might be provided by the caller and reducing the timeout in future calls would mangle the original value.