spidermon
Action to restart spider in Scrapy Cloud
Discussed in chat a bit - the idea is that if a job meets some conditions (the monitor detects certain website responses, the job stalls, etc.), this action could restart the job.
Ideas for how to count restarts to prevent infinite restarting included:
- job metadata
- spider parameter
- tag the job and its restarts with a UUID (or similar) and look up the previous job(s) with that tag to get the restart count
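As a rough illustration of the spider-parameter approach, the restart budget could be tracked in the job arguments. This is only a sketch: the function name, the `restart_count` argument, and `MAX_RESTARTS` are all hypothetical, not part of Spidermon's or Scrapy Cloud's API.

```python
# Hypothetical bookkeeping for the "spider parameter" restart-counting idea.
# The names here (next_restart_args, restart_count, MAX_RESTARTS) are
# illustrative assumptions, not an existing Spidermon interface.

MAX_RESTARTS = 3


def next_restart_args(current_args, max_restarts=MAX_RESTARTS):
    """Return job args for a restarted job, or None when the restart
    budget is exhausted (to prevent infinite restarting)."""
    count = int(current_args.get("restart_count", 0))
    if count >= max_restarts:
        return None
    new_args = dict(current_args)
    new_args["restart_count"] = str(count + 1)
    return new_args


# The restart itself would presumably go through python-scrapinghub,
# roughly like this (untested sketch):
#
#   from scrapinghub import ScrapinghubClient
#   client = ScrapinghubClient(apikey)
#   project = client.get_project(project_id)
#   args = next_restart_args(current_job_args)
#   if args is not None:
#       project.jobs.run(spider_name, job_args=args)
```

A custom Spidermon action could wrap this in its `run_action` method, reading the current count from the running spider's arguments and scheduling the new job only when `next_restart_args` returns something.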
Should we create this coupling of Spidermon and Scrapy cloud?
@rennerocha said he didn't see a problem, so I made this ticket, but I may have misread his response. I have no strong opinion either way. Since it's a common use case at the moment, I think it would be more convenient, but I can understand wanting to limit the number of dependencies.
This could work out well, but what exactly is the use case here? Something like restarting after a particular amount of time under specific circumstances?
I'm not sure there's a single specific use case, but this comes up every once in a while. Sometimes jobs get stuck for some reason and restarting them gets them going again, or the upstream website starts having issues that corrupt the data, and it's not worth the effort of trying to recover.