crawly
[Feat] Support auto-closing the spider when all requests are finished
It seems that even when every worker's request list is empty, the crawler still cannot stop automatically.
Although closespider_timeout handles some scenarios, it introduces a new problem: the spider can end too early when the network environment is poor.
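For reference, a minimal sketch of the idea, not the merged implementation: periodically inspect the requests storage and stop the spider once the queue is drained. `Crawly.Engine.stop_spider/1` is an existing Crawly API; the `AutoClose` module, the `maybe_close_spider/1` helper, and the assumption that `Crawly.RequestsStorage.stats/1` returns `{:stored_requests, n}` are illustrative.

```elixir
defmodule AutoClose do
  @moduledoc """
  Hypothetical helper, not part of Crawly: stop a spider once its
  request queue is drained.
  """

  def maybe_close_spider(spider_name) do
    # Assumes Crawly.RequestsStorage.stats/1 reports the number of
    # requests still queued for this spider.
    {:stored_requests, pending} = Crawly.RequestsStorage.stats(spider_name)

    if pending == 0 do
      # Queue drained: shut the spider down via the public Engine API.
      Crawly.Engine.stop_spider(spider_name)
    end
  end
end
```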
Hey @EdmondFrank.
It's not quite clear whether this approach will solve the issue. But still, what is the problem with having just the closespider_timeout?
In the process of using Crawly, I have encountered two problems.
First, I need to develop some slow crawlers, where the request frequency is about 1 request every 60-90 s.
Second, some of the websites I crawl are not very stable, sometimes denying service for a few minutes before returning to normal.
In both of these scenarios the closespider_timeout check can observe 0 items/min even though not all requests have been crawled yet, so the spider is closed prematurely.
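As a concrete illustration of the first case: `closespider_timeout` and `concurrent_requests_per_domain` are real Crawly settings, but the values below are assumptions chosen to show how a slow crawl trips the check.

```elixir
# config/config.exs
import Config

config :crawly,
  # Minimum number of items that must be scraped within a minute;
  # below this threshold the engine stops the spider.
  closespider_timeout: 1,
  # A deliberately slow crawl: one request at a time.
  concurrent_requests_per_domain: 1

# At ~1 request every 60-90 s, many one-minute windows contain zero
# scraped items, so the spider is closed while its request queue is
# still non-empty.
```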
This was merged into master on September 14th, and Crawly 0.14 was released the same day. But it seems this feature is not part of the 0.14 release. Was that intentional?