node-sitemap-stream-parser

[IMP]: Respect robots.txt

Open · YarnSeemannsgarn opened this issue 6 years ago • 1 comment

I found an example, https://booking.com/robots.txt, where a sitemap is listed but also marked as disallowed for a specific user agent:

Sitemap: https://www.booking.com/sitembk-index-https.xml

User-agent: Baiduspider
Disallow: /sitembk-index-https.xml

I suggest adding a respectRobotsTxt option to the parser, set to true by default.

YarnSeemannsgarn, Jul 04 '18 06:07

That seems fair. I will also make the user-agent configurable at the same time. Thanks for the suggestion!

evanderkoogh, Jul 05 '18 01:07
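
For illustration, here is a minimal sketch of the kind of check a respectRobotsTxt option with a configurable user agent might perform. It is not the library's implementation: parseRobotsTxt and isDisallowed are hypothetical helpers, the matching is plain prefix matching, and Allow rules and wildcards are deliberately ignored to keep the example short.

```javascript
// Hypothetical helpers, not part of sitemap-stream-parser.
// Assumption: simple prefix matching only; Allow rules and wildcards are ignored.

// Parse robots.txt text into groups of { agents: [...], disallow: [...] }.
function parseRobotsTxt(text) {
  const groups = [];
  let current = null;
  let lastWasAgent = false;
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.replace(/#.*$/, "").trim(); // drop comments
    const idx = line.indexOf(":");
    if (idx === -1) continue; // blank or malformed line
    const field = line.slice(0, idx).trim().toLowerCase();
    const value = line.slice(idx + 1).trim();
    if (field === "user-agent") {
      // Consecutive User-agent lines share one group of rules.
      if (!lastWasAgent) {
        current = { agents: [], disallow: [] };
        groups.push(current);
      }
      current.agents.push(value.toLowerCase());
      lastWasAgent = true;
    } else {
      // An empty Disallow value means "nothing is disallowed", so skip it.
      if (field === "disallow" && current && value !== "") {
        current.disallow.push(value);
      }
      lastWasAgent = false;
    }
  }
  return groups;
}

// Return true if `url` is disallowed for `userAgent` according to `robotsTxt`.
function isDisallowed(robotsTxt, url, userAgent) {
  const groups = parseRobotsTxt(robotsTxt);
  const ua = userAgent.toLowerCase();
  // Prefer groups that name this agent; otherwise fall back to "*".
  let matched = groups.filter((g) =>
    g.agents.some((a) => a !== "*" && a !== "" && ua.includes(a))
  );
  if (matched.length === 0) {
    matched = groups.filter((g) => g.agents.includes("*"));
  }
  const { pathname, search } = new URL(url);
  const path = pathname + search;
  return matched.some((g) => g.disallow.some((prefix) => path.startsWith(prefix)));
}

// Usage with the rules quoted in this issue.
const robots = [
  "Sitemap: https://www.booking.com/sitembk-index-https.xml",
  "",
  "User-agent: Baiduspider",
  "Disallow: /sitembk-index-https.xml",
].join("\n");
const sitemapUrl = "https://www.booking.com/sitembk-index-https.xml";

console.log(isDisallowed(robots, sitemapUrl, "Baiduspider")); // true
console.log(isDisallowed(robots, sitemapUrl, "SomeOtherBot")); // false
```

With a check like this, a parser could skip any sitemap for which isDisallowed returns true when the option is enabled. Because the rules in the example apply only to Baiduspider, the outcome depends on which user agent is configured.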