
Explain how to use scrapy-splash with AutoThrottle

Open kmike opened this issue 7 years ago • 3 comments

The AutoThrottle extension doesn't play nicely with scrapy-splash: the latency it measures for a Splash request includes the page rendering time, so it thinks requests take a very long time and lowers the request rate accordingly.
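To illustrate the effect, here is a rough sketch of the delay update that Scrapy's AutoThrottle documentation describes, written as a standalone function. The function name and the sample latencies are made up for illustration, and the real extension also skips delay decreases for non-200 responses, which is left out here.

```python
def next_delay(current_delay, latency, target_concurrency=1.0,
               min_delay=0.0, max_delay=60.0):
    """Simplified AutoThrottle-style update (illustrative, not Scrapy's code)."""
    # Delay that would give the target number of parallel requests.
    target_delay = latency / target_concurrency
    # Move halfway from the current delay towards the target delay.
    new_delay = (current_delay + target_delay) / 2.0
    # Never go below the target delay, so slow responses raise the delay at once.
    new_delay = max(target_delay, new_delay)
    # Clamp to the configured bounds.
    return min(max(min_delay, new_delay), max_delay)


# Hypothetical latencies: a plain HTTP response vs. a Splash-rendered one.
plain = next_delay(current_delay=1.0, latency=0.2)
rendered = next_delay(current_delay=1.0, latency=8.0)
print(plain, rendered)
```

With a latency that includes several seconds of rendering, the computed delay jumps to the full latency, even though the target website itself may have answered quickly.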

kmike • Jul 13 '16 22:07

Should we simply state that it should be disabled as part of the configuration instructions?
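If the instructions went that way, the relevant line would probably look like the following in a project's `settings.py`. `AUTOTHROTTLE_ENABLED` is the standard Scrapy setting; setting it explicitly only matters where something else has turned it on.

```python
# settings.py
# Turn off Scrapy's built-in AutoThrottle extension while using scrapy-splash.
AUTOTHROTTLE_ENABLED = False
```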

Gallaecio • Nov 26 '19 11:11

There are ways to make it work with AutoThrottle in a more reasonable way, e.g. https://github.com/TeamHG-Memex/undercrawler/blob/master/undercrawler/middleware/throttle.py.

As a first step, yes, it makes sense to at least document this problem. For example, as I recall, AutoThrottle is enabled by default on Scrapy Cloud (is it still on by default?).

kmike • Nov 26 '19 18:11

What if we add something like https://github.com/TeamHG-Memex/undercrawler/blob/master/undercrawler/middleware/throttle.py to scrapy-splash itself?

In addition to documenting its (optional) usage, we could log a warning if Scrapy’s built-in AutoThrottle is used along with scrapy-splash.
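A minimal sketch of what such a warning check could look like. The function name, logger name and message are hypothetical and not part of scrapy-splash; the only thing assumed about `settings` is a `getbool()` method like the one on Scrapy's `Settings` object.

```python
import logging

# Hypothetical logger name, chosen for this example only.
logger = logging.getLogger("scrapy_splash.autothrottle_check")


def warn_if_autothrottle_enabled(settings):
    """Log a warning when AutoThrottle is on; return True if a warning was logged.

    `settings` is expected to provide getbool(name), as Scrapy's Settings does.
    """
    if settings.getbool("AUTOTHROTTLE_ENABLED"):
        logger.warning(
            "AUTOTHROTTLE_ENABLED is set while scrapy-splash is in use; "
            "Splash rendering time counts as latency, so AutoThrottle "
            "may slow the crawl down more than intended."
        )
        return True
    return False
```

This could be called once at startup, for example from a middleware's `from_crawler()` with `crawler.settings`.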

Gallaecio • Dec 13 '19 11:12