
Per-project job limit

Open luc4smoreira opened this issue 8 years ago • 6 comments

I developed this new feature to allow limiting the maximum number of jobs per project. Please check whether it is of interest.
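
For context, the idea boils down to a check like the following. This is a minimal sketch only, not the code from this PR, and the names (`can_schedule`, `running_projects`, `max_jobs_per_project`) are hypothetical:

```python
def can_schedule(project, running_projects, max_jobs_per_project):
    """Return True if `project` may start another job under a per-project cap.

    `running_projects` lists the project name once per job currently running.
    A cap of 0 (or less) is treated as "no limit". All names here are
    illustrative, not Scrapyd's actual API.
    """
    if max_jobs_per_project <= 0:
        return True
    return running_projects.count(project) < max_jobs_per_project


# Example with a cap of 2 jobs per project:
assert can_schedule("monthly_crawl", ["monthly_crawl", "other"], 2)
assert not can_schedule("monthly_crawl", ["monthly_crawl", "monthly_crawl"], 2)
```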

luc4smoreira avatar Apr 06 '16 19:04 luc4smoreira

Hi,

What is your use-case for this feature? Do you have problems with projects monopolizing resources? If so, how did this happen? What kind of projects do you run in the same scrapyd instance?

Btw, try running the tests locally before opening a PR instead of waiting for TravisCI.

Digenis avatar Apr 07 '16 09:04 Digenis

Hello Digenis.

I want to use Scrapyd in a production environment with many spider projects. Some of them run only occasionally (monthly) but take about 3 days to complete all of their jobs, around 500 jobs in total. So I don't want the other projects' jobs to be blocked when such a project starts.

I found other users that need this kind of feature too, like this one: https://groups.google.com/forum/#!topic/scrapy-users/FME7PVpD2k8

I will work on fixing the tests today, if I have time, and push the code to this branch.

luc4smoreira avatar Apr 07 '16 12:04 luc4smoreira

I will need someone who's been more involved in the poller/launcher to review this when ready.

Digenis avatar Apr 07 '16 13:04 Digenis

Sorry about the mess in Travis history.

I fixed the unit test using mock, with this module: https://pypi.python.org/pypi/mock

I am looking into how to add this package to the Travis build.
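
For illustration, the general shape of such a test with mock is below. This is a generic sketch with hypothetical names (`running_count` is not a real Scrapyd method), not the actual test from this branch:

```python
import unittest

try:
    from unittest import mock  # Python 3 standard library
except ImportError:
    import mock  # the standalone `mock` package on Python 2


class PerProjectLimitTest(unittest.TestCase):
    def test_project_at_limit_is_not_scheduled(self):
        # Stand-in for the launcher: pretend the project already runs 2 jobs.
        launcher = mock.Mock()
        launcher.running_count.return_value = 2  # hypothetical method

        max_jobs_per_project = 2
        allowed = launcher.running_count("someproject") < max_jobs_per_project

        self.assertFalse(allowed)
        launcher.running_count.assert_called_once_with("someproject")


if __name__ == "__main__":
    unittest.main()
```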

luc4smoreira avatar Apr 07 '16 14:04 luc4smoreira

This PR has severe conflicts. Would any of the contributors be able to resolve them? If not, I will close the PR and create an issue instead (or defer to #197 as suggested in #389).

jpmckinney avatar Sep 24 '21 00:09 jpmckinney

This introduces PostgreSQL and RabbitMQ as dependencies, which will increase technical debt. Also, some of the things added here were already done in a simpler way using SQLite in https://github.com/scrapy/scrapyd/pull/359, which has been merged.

So this PR cannot be merged in its current form.

Ideally we should just allow configurable pollers: right now the QueuePoller class is loaded by default, but we could make it possible for people to write any sort of complex poller themselves. The same goes for the scheduler. I think ScrapyD should stay basic and simple, but it should provide building blocks for extending it with the functionality you need. The functionality proposed in this PR could be added as a custom extension of a specific ScrapyD project, and ScrapyD should just make it easy to integrate such extensions by making all core components configurable.
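
As a rough illustration of that direction, a project-specific poller might look something like the sketch below. This is not working Scrapyd code: it assumes a poll()/next() surface similar to scrapyd.poller.QueuePoller, and the `running_count(project)` callable is a hypothetical hook supplied by the deployment:

```python
from twisted.internet.defer import (DeferredQueue, inlineCallbacks,
                                    maybeDeferred, returnValue)


class MaxJobsPoller(object):
    """Poll spider queues, skipping projects already at their job cap (sketch)."""

    def __init__(self, queues, running_count, max_jobs_per_project):
        self.queues = queues                # mapping: project name -> spider queue
        self.running_count = running_count  # hypothetical: project name -> int
        self.max_jobs_per_project = max_jobs_per_project
        self.dq = DeferredQueue()

    @inlineCallbacks
    def poll(self):
        if not self.dq.waiting:
            return  # nobody is waiting for a job right now
        for project, queue in self.queues.items():
            if self.running_count(project) >= self.max_jobs_per_project:
                continue  # this project is at its cap; leave its queue alone
            count = yield maybeDeferred(queue.count)
            if count:
                message = yield maybeDeferred(queue.pop)
                if message is not None:
                    message["_project"] = project
                    returnValue(self.dq.put(message))

    def next(self):
        # The launcher waits on this Deferred for the next job to run.
        return self.dq.get()
```

If the poller class were selectable from scrapyd.conf (the kind of configurability proposed here), a class like this could live in a project-specific package instead of in ScrapyD core.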

pawelmhm avatar Nov 23 '21 06:11 pawelmhm