email_verifier
Allow using proxy to prevent greylist / blacklist / block list?
First, thank you for the library!
I recently wrote a program to verify quite a large list of addresses (400k+) using this gem. The program runs multiple threads to get through the list more quickly. However, some email servers (AOL/AIM and ALL Microsoft domains: Hotmail, MSN, Live.com) seem to apply some sort of undocumented timeout/throttling.
I've managed to get myself blocked from the Microsoft email family of servers, as EmailVerifier is giving me this repeatedly:
550 SC-002 (COL004-MC5F35) Unfortunately, messages from [MY IP] weren't sent.
Please contact your Internet service provider since part of their network is on our block list.
You can also refer your provider to http://mail.live.com/mail/troubleshooting.aspx#errors.
I think a possible way around this would be to round-robin each request through a list of SOCKS proxy IPs, but see no possible way to do that with Net::SMTP.
Anyone else of the mind that this should be a possibility, using a service like ProxyBonanza?
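For illustration, the round-robin part of that idea could look roughly like the sketch below. It only covers picking the next proxy in a thread-safe way, since the program is multi-threaded. `ProxyRotator` and the proxy addresses are made-up names and placeholders, not part of email_verifier or Net::SMTP, and actually routing the SMTP connection through the chosen SOCKS proxy is still the unsolved part.

```ruby
# Hypothetical helper, not part of email_verifier or Net::SMTP.
# Hands out proxies in round-robin order and is safe to share across threads.
class ProxyRotator
  def initialize(proxies)
    raise ArgumentError, "proxy list is empty" if proxies.empty?

    @proxies = proxies.dup.freeze
    @index = 0
    @lock = Mutex.new
  end

  # Returns the next proxy in the list, wrapping around at the end.
  def next_proxy
    @lock.synchronize do
      proxy = @proxies[@index]
      @index = (@index + 1) % @proxies.size
      proxy
    end
  end
end

# Placeholder addresses from the documentation range (RFC 5737), not real proxies.
rotator = ProxyRotator.new(["192.0.2.10:1080", "192.0.2.11:1080", "192.0.2.12:1080"])
4.times { puts rotator.next_proxy }
```

Each worker thread would call `rotator.next_proxy` before a verification, so consecutive requests leave from different IPs.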
FWIW, this is Outlook.com's reason why they're blocking the program...
There are indications that the above IP(s) are engaged in namespace mining. Outlook.com is blocking all email sent from this IP.
Namespace mining is a method commonly used by malicious senders to generate lists of email addresses. This approach uses automation to sift through possible email names seeking to identify valid email addresses, e.g., [email protected], [email protected], and [email protected].
You must correct the problem and you must stop the behavior described above. We recommend that you work with your email or network administrator to review the logs of your email servers. Check the sending log files and concentrate on logs that are sending to nonexistent @hotmail.com, @live.com, @msn.com, @outlook.com with multiple tries and failures.
How much can you do that before they start blocking you?
I never found a definitive metric. Seems time-based and probably they keep a history of which IPs are doing the verification. I solved it by using more proxies and increasing wait timeouts.
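A rough sketch of the "increasing wait timeouts" part, assuming a fixed minimum gap between checks against the same domain. `DomainThrottle` is a hypothetical helper and the 2-second interval is an arbitrary guess, since no definitive limit was ever found; this is not necessarily how the original program did it.

```ruby
# Hypothetical helper: enforces a minimum gap between checks against one domain.
# Each caller reserves the next free time slot for the domain, then sleeps
# outside the lock so threads working on other domains are not held up.
class DomainThrottle
  def initialize(min_interval_seconds)
    @min_interval = min_interval_seconds
    @next_allowed = Hash.new(0.0)
    @lock = Mutex.new
  end

  # Blocks until the domain may be contacted again; returns the seconds waited.
  def wait_for(domain)
    delay = @lock.synchronize do
      now = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      start = [now, @next_allowed[domain]].max
      @next_allowed[domain] = start + @min_interval
      start - now
    end
    sleep(delay) if delay > 0
    delay
  end
end

throttle = DomainThrottle.new(2.0) # 2 seconds is an assumption, tune as needed
domain = "someone@example.com".split("@").last
throttle.wait_for(domain)
# ...run the verification for this address here...
```

Sharing one throttle across all worker threads keeps the per-domain rate bounded no matter how many threads are running.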
Thanks!
I never found a definitive metric. Seems time-based and probably they keep a history of which IPs are doing the verification. I solved it by using more proxies and increasing wait timeouts.
Could you expand on how you did this? I need to verify a similar sized database and some tips/example on how to avoid this problem would be much appreciated.
Hi @ctrlventure! I wrote a blog article on the subject years back with links to github repos. It's been quite some time since I've been working on this problem, and things might have changed. YMMV. http://subimage.com/blog/2016/10/16/verify-huge-email-lists-for-free-with-ruby/