node-sitemap-stream-parser
[IMP]: Respect robots.txt
I found an example at https://booking.com/robots.txt where a listed sitemap is also marked as disallowed:
```
Sitemap: https://www.booking.com/sitembk-index-https.xml
User-agent: Baiduspider
Disallow: /sitembk-index-https.xml
```
I suggest adding a `respectRobotsTxt` option to the parser, enabled by default.
That seems fair. I'll also make the user-agent configurable at the same time. Thanks for the suggestion!