
Consider auto-failing pages that have 5xx HTTP status code

Open jancurn opened this issue 5 years ago • 1 comments

Currently, pages that return a 5xx HTTP status code are not considered failed and are therefore not retried. Cheerio Scraper, on the other hand, does retry them. We should probably treat 5xx responses as failures and retry them.
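To make the proposal concrete, here is a minimal sketch of the check being asked for. Both `isServerError` and `assertNotServerError` are hypothetical names used for illustration, not existing crawler APIs, and the sketch assumes the surrounding crawler retries a page whenever its handler throws.

```typescript
// Hypothetical helper (not part of any crawler API): returns true for
// any status code in the 5xx range.
function isServerError(statusCode: number): boolean {
    return statusCode >= 500 && statusCode <= 599;
}

// Hypothetical guard: throwing marks the page as failed, so retry logic
// that reacts to thrown errors (assumed to exist) can pick it up again.
function assertNotServerError(statusCode: number, url: string): void {
    if (isServerError(statusCode)) {
        throw new Error(`Request to ${url} failed with status ${statusCode}`);
    }
}
```

Calling `assertNotServerError(response.status, url)` at the top of a page handler would turn a 5xx response into an ordinary failure, where `response.status` and `url` stand in for whatever the handler actually receives.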

jancurn, Mar 06 '20 10:03

Hello @jancurn, how about setting an array of status codes to treat as errors in sessionPoolOptions?

Like this:

sessionPoolOptions: {
    maxPoolSize: 100,
    errorStatusCode: [401, 403, 429, 500], // optional
}

Thank you!

fazio91, Jul 27 '20 12:07

You can already set the blocked status codes via sessionPoolOptions in crawlee:

sessionPoolOptions: { blockedStatusCodes: [401, 403, 429, 500] }
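Since the issue is about the whole 5xx range rather than just 500, the list could be generated instead of typed out. This is only a sketch: `SessionPoolOptionsSketch` is a local stand-in type that mirrors just the `blockedStatusCodes` option quoted here, not the library's own type, and whether blocking every 5xx code is desirable depends on the target site.

```typescript
// Assumption: a local stand-in type covering only the one option
// discussed in this thread, so the sketch runs without the library.
interface SessionPoolOptionsSketch {
    blockedStatusCodes: number[];
}

// The client-error codes used in the snippets in this thread.
const clientBlockedCodes: number[] = [401, 403, 429];

// All server-error codes, 500 through 599 inclusive.
const serverErrorCodes: number[] = Array.from({ length: 100 }, (_, i) => 500 + i);

const sessionPoolOptions: SessionPoolOptionsSketch = {
    blockedStatusCodes: [...clientBlockedCodes, ...serverErrorCodes],
};
```

The resulting object would then be passed as the crawler's `sessionPoolOptions`, in the same position as the one-line version.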

B4nan, Sep 14 '22 11:09