
Implement capture (custom scraping)

Open benjaminestes opened this issue 7 years ago • 1 comment

benjaminestes avatar Aug 22 '18 14:08 benjaminestes

I think CSS selectors are the way to go. The crawler already has to parse each page once for its own internal scraping. If we use CSS selectors that take the net/html tree representation as input, we won't have to parse the body of each page a second time.

benjaminestes avatar Oct 15 '18 15:10 benjaminestes