Unify timeouts throughout our classes
PuppeteerCrawler
- `gotoFunction` has a constant timeout inside, which can only be changed by overriding the function.
- `handlePageFunction` has its own timeout.
CheerioCrawler
- `prepareRequestFunction` does not have a timeout.
- `handlePageFunction` has its own timeout.
BasicCrawler
- `handleRequestFunction` has its own timeout. When using `Puppeteer` or `Cheerio`, the timeout is set to a multiple of their `handlePageFunction` timeout.
- `handleFailedRequestFunction` does not have a timeout.
AutoscaledPool
- has no timeouts.
PuppeteerPool
- has `puppeteerOperationsTimeoutSecs` for Puppeteer-related operations.
It's a mess.
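
One possible direction, as a rough sketch only: a single shared wrapper that every class uses for its user-supplied functions, so each hook gets a timeout the same way. The `withTimeout` helper, its parameters, and the error message format are all hypothetical and not part of the current library.

```javascript
// Hypothetical helper (not an existing library function): wraps any
// user-supplied function so that it rejects after `timeoutSecs` seconds.
function withTimeout(fn, timeoutSecs, label) {
    return async (...args) => {
        let timer;
        const timeout = new Promise((_, reject) => {
            timer = setTimeout(
                () => reject(new Error(`${label} timed out after ${timeoutSecs} seconds.`)),
                timeoutSecs * 1000,
            );
        });
        try {
            // Whichever settles first wins. The wrapped function itself is
            // not cancelled; only its result is ignored after the timeout.
            return await Promise.race([fn(...args), timeout]);
        } finally {
            // Always clear the timer so it does not keep the process alive.
            clearTimeout(timer);
        }
    };
}
```

With something like this, `gotoFunction`, `prepareRequestFunction`, `handlePageFunction` and `handleFailedRequestFunction` could each be wrapped once, for example `withTimeout(handlePageFunction, 60, 'handlePageFunction')`, with the number of seconds coming from a consistently named option.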