Rcrawler Extract emails from domain

Extract emails from domain

Open MislavSag opened this issue 5 years ago • 0 comments

I would like to extract e-mail addresses from several domains. Let's say I have only two:

domains <- c("http://www.aldautomotive.hr", "http://www.bks-leasing.hr")

Now, I want to extract only the e-mail addresses from these domains. I know I can save the HTML files and then extract the e-mails from them, but is it possible to do it in one step? Suppose this is the e-mail regex:

emailRegex <- "^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$"

How can I extract only those e-mail addresses and nothing else? It seems the KeyWords argument doesn't accept a regex.
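To make the goal concrete, here is a minimal base-R sketch of the extraction step on its own, applied to page content that is already in memory. It does not use any Rcrawler argument. The helper name `extract_emails`, the sample HTML, and the simplified, unanchored pattern are all assumptions made for illustration; a pattern anchored with `^` and `$` would only match a string that consists of nothing but an address, so it cannot find addresses embedded in a page.

```r
# Hypothetical helper (not part of Rcrawler): return the unique e-mail-like
# strings found in a character vector of raw HTML or text.
# The pattern is a simplified, unanchored variant chosen for illustration.
extract_emails <- function(text) {
  pattern <- "[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\\.[A-Za-z0-9-]+)+"
  # gregexpr() locates every match per element; regmatches() pulls them out
  hits <- regmatches(text, gregexpr(pattern, text, perl = TRUE))
  # Flatten, normalise case, and drop duplicates
  unique(tolower(unlist(hits)))
}

# Made-up sample content standing in for a downloaded page
html <- c(
  "<p>Contact: <a href='mailto:Info@example.hr'>info@example.hr</a></p>",
  "<p>Sales: sales@example.co.uk, no address here</p>"
)

emails <- extract_emails(html)
print(emails)
```

For a real page, the character vector could come from something like `readLines(url)` for each domain, which would still mean fetching the content first and filtering afterwards.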

Jul 29 '19 10:07 MislavSag
