How do we name the generated index the same as the domain

Open • wecodelaravel opened this issue 4 years ago • 1 comment

For instance, scraping www.google.com would produce google.html, not PART01.html.

Also do you have an example of how to use the attributes command?

wecodelaravel, Mar 12 '20 20:03

There is currently no option to do what you're describing. The current behavior is to generate a directory bearing the domain name, which is then populated with PART.html files scraped from that domain.

I'm open to suggestions and pull requests which may alter this mechanism.
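Until such an option exists, one possible workaround is to copy the first generated file to a domain-based name after the scrape finishes. The following is only a minimal sketch, not part of scrape itself: the helper names domain_label and copy_first_part are hypothetical, and it assumes the output directory contains files matching PART*.html as described above.

```python
import shutil
from pathlib import Path
from urllib.parse import urlparse


def domain_label(url: str) -> str:
    """Return a short name for a site, e.g. 'www.google.com' -> 'google'.

    Naive on purpose: it takes the first label after an optional 'www.',
    so a host such as 'news.bbc.co.uk' would yield 'news'.
    """
    # urlparse only fills in netloc when the URL has a scheme or starts with '//'.
    host = urlparse(url if "://" in url else "//" + url).netloc
    host = host.split(":")[0]  # drop a port number if one is present
    if host.startswith("www."):
        host = host[4:]
    return host.split(".")[0]


def copy_first_part(output_dir: Path, url: str) -> Path:
    """Copy the first PART*.html file in output_dir to '<domain>.html'.

    Assumes the PART*.html naming described in this thread (an assumption,
    not a documented guarantee). The original file is left in place.
    """
    parts = sorted(output_dir.glob("PART*.html"))
    if not parts:
        raise FileNotFoundError(f"no PART*.html files in {output_dir}")
    target = output_dir / f"{domain_label(url)}.html"
    shutil.copyfile(parts[0], target)
    return target
```

Copying rather than renaming keeps the original PART files intact, so anything else that expects them still works.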

The --attributes option is essentially a reduced version of the --xpath option, per the README:

  • If you only want to specify specific tag attributes to extract rather than an entire XPath, use --attributes. The default choice is to extract only text attributes, but you can specify one or many different attributes (such as href, src, title, or any attribute available).

Specifying --attributes href, for example, would retrieve only the contents of all the href HTML attributes. However, this flag cannot be used in conjunction with storing HTML output. You can test it using any of the other output options (e.g. print, test, csv, pdf).
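To make concrete what "the contents of all the href HTML attributes" means, here is a small illustrative sketch using only Python's standard library. It is not how scrape implements --attributes; the class AttributeCollector and the function extract_attributes are hypothetical names chosen for this example.

```python
from html.parser import HTMLParser


class AttributeCollector(HTMLParser):
    """Collect the values of the requested attributes from every tag."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.values = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes.
        for name, value in attrs:
            if name in self.wanted and value is not None:
                self.values.append(value)


def extract_attributes(html, wanted):
    """Return attribute values in document order for the names in wanted."""
    collector = AttributeCollector(wanted)
    collector.feed(html)
    collector.close()
    return collector.values


sample = (
    '<a href="/about" title="About">About</a>'
    '<img src="logo.png">'
    '<a href="https://example.com">Example</a>'
)
print(extract_attributes(sample, ["href"]))
```

Passing several names, such as ["src", "title"], collects the values of each of those attributes in the order they appear in the document.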

huntrar, Mar 12 '20 21:03