scrapy-zyte-smartproxy

Blacklist domains

Open whalebot-helmsman opened this issue 3 years ago • 1 comment

I was setting up AutoExtract in Scrapy Cloud on a project with the Crawlera addon, and the AutoExtract queries were routed through Crawlera. The idea is to blacklist the AutoExtract domain by default. It may make sense for other services too, e.g. Splash.

It is possible to implement this without adding new options, e.g. by adding something to https://github.com/scrapy-plugins/scrapy-crawlera/blob/019987f68345079db176405c9f9fbb155ee26f20/scrapy_crawlera/middleware.py#L32
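
A rough sketch of what such a default blacklist check could look like, independent of the actual middleware code. The names `DEFAULT_BYPASS_DOMAINS` and `should_bypass_proxy` are hypothetical, and the hostnames are placeholders rather than the real AutoExtract or Splash endpoints:

```python
from urllib.parse import urlparse

# Placeholder hostnames (assumption): the real service domains would go here.
DEFAULT_BYPASS_DOMAINS = frozenset({
    "autoextract.example.com",
    "splash.example.com",
})


def should_bypass_proxy(url, bypass_domains=DEFAULT_BYPASS_DOMAINS):
    """Return True if the URL's host is a blacklisted domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(
        host == domain or host.endswith("." + domain)
        for domain in bypass_domains
    )
```

The middleware could call a check like this before deciding to route a request through the proxy, and skip the proxy when it returns True. Matching on the parsed hostname, rather than on a substring of the URL, avoids false positives such as a blacklisted name appearing in a query string.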

whalebot-helmsman avatar Feb 10 '21 09:02 whalebot-helmsman

I would also log a warning the first time it happens during a crawl.
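
A minimal sketch of the warn-once behaviour, assuming one helper instance is created per crawl. `BypassWarner` and the message wording are hypothetical, not part of the existing middleware:

```python
import logging
from urllib.parse import urlparse

logger = logging.getLogger(__name__)


class BypassWarner:
    """Logs a warning only for the first bypassed request it is told about."""

    def __init__(self):
        self._warned = False

    def notify(self, url):
        """Log once; return True if a warning was emitted by this call."""
        if self._warned:
            return False
        self._warned = True
        logger.warning(
            "Request to %s is not routed through the proxy because its "
            "domain is blacklisted by default; further occurrences will "
            "not be logged.",
            urlparse(url).hostname,
        )
        return True
```

Keeping the flag on an instance rather than at module level means each crawl gets its own first-time warning.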

Gallaecio avatar Feb 10 '21 10:02 Gallaecio