crawlee
Fail handler timeout is treated as a user error and crashes the crawler
Which package is this bug report for? If unsure which one to select, leave blank
@crawlee/basic (BasicCrawler)
Issue description
Crawlee's CheerioCrawler crashes with an uncaught exception. It looks like the slow platform caused the error handler to get stuck, so it timed out and killed the process. This happened twice in a row.
The code is pretty basic:
https://github.com/apify-projects/metadialog/blob/main/get-arabic-domains/src/main.ts#L80
https://console.apify.com/admin/users/5njhzC6bd87wpC39v/actors/QIhv3v1Ax45QvxV1M/runs/j4s57aqm5e4DBi5mr#log
2023-04-24T18:01:20.854Z node:internal/process/esm_loader:97
2023-04-24T18:01:20.856Z internalBinding('errors').triggerUncaughtException(
2023-04-24T18:01:20.858Z ^
2023-04-24T18:01:20.860Z
2023-04-24T18:01:20.861Z TimeoutError: Handling request failure of https://www.bonanzascout.com/index.php/2021/11/11/tips-on-easy-methods-to-play-poker/ (TwPn8VlqqjU0to2) timed out after 300 seconds.
2023-04-24T18:01:20.863Z at Timeout._onTimeout (/usr/src/app/node_modules/@apify/timeout/index.js:62:68)
2023-04-24T18:01:20.865Z at listOnTimeout (node:internal/timers:559:17)
2023-04-24T18:01:20.867Z at processTimers (node:internal/timers:502:7)
https://apifier.slack.com/archives/C0L33UM7Z/p1682412275806569
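To illustrate the behavior I would expect instead, here is a minimal sketch of a guard around the failure-handling step: a timeout there gets reported and swallowed rather than rethrown as an uncaught exception. This is not Crawlee's internal code. `TimeoutError`, `withTimeout`, and `handleFailureSafely` are hypothetical names made up for this example, and the timeout value is whatever the caller passes in.

```typescript
// Hypothetical helper names; a sketch only, not Crawlee's implementation.
class TimeoutError extends Error {
    constructor(message: string) {
        super(message);
        this.name = 'TimeoutError';
    }
}

// Rejects with a TimeoutError if `task` does not settle within `ms` milliseconds.
function withTimeout<T>(task: () => Promise<T>, ms: number, message: string): Promise<T> {
    return new Promise<T>((resolve, reject) => {
        const timer = setTimeout(() => reject(new TimeoutError(message)), ms);
        task().then(
            (value) => { clearTimeout(timer); resolve(value); },
            (error) => { clearTimeout(timer); reject(error); },
        );
    });
}

// Runs the failure-handling step. A timeout is reported and turned into
// `false`; any other error is rethrown unchanged.
async function handleFailureSafely(
    step: () => Promise<void>,
    ms: number,
    report: (message: string) => void = console.warn,
): Promise<boolean> {
    try {
        await withTimeout(step, ms, `Handling request failure timed out after ${ms} ms.`);
        return true;
    } catch (error) {
        if (error instanceof Error && error.name === 'TimeoutError') {
            report(error.message);
            return false;
        }
        throw error;
    }
}
```

In user code, something like this could wrap the body of a `failedRequestHandler`. If the hang is inside the crawler's own bookkeeping rather than in the user handler, the equivalent guard would have to live in the library itself.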
Code sample
No response
Package version
latest
Node.js version
16
Operating system
No response
Apify platform
- [X] Tick me if you encountered this issue on the Apify platform
I have tested this on the next release
No response
Other context
No response