pyspider

A Powerful Spider(Web Crawler) System in Python.

Results: 113 pyspider issues

I'm trying to use the puppeteer fetcher with this script from the examples:

```
from pyspider.libs.base_handler import *

class Handler(BaseHandler):
    def on_start(self):
        self.crawl('http://www.twitch.tv/directory/game/Dota%202', fetch_type='chrome', callback=self.index_page)

    def index_page(self, response):
        return {...
```

* pyspider version:
* Operating system:
* Start up command:

### Expected behavior
### Actual behavior
### How to reproduce

However, the top of the web console page does not show the scheduler, fetcher, or processor as blocked. So does this count as a bug?


* pyspider version: 0.3.10
* Operating system: Win10 64Bit
* Start up command: pyspider all

Dear all, when I crawl a website, I get the following error: ` [E 191205 16:21:21 base_handler:203] netloc '|file|中英双字.rmvb|' contains...

Can pyspider crawl sites built with the Vue and React frameworks? If so, how?

I'm trying to replicate the deployment demo setup from here: [http://docs.pyspider.org/en/latest/Deployment-demo.pyspider.org/](http://docs.pyspider.org/en/latest/Deployment-demo.pyspider.org/), but I'm getting these errors at the nginx volumes lines:

```
Starting pyspider_nginx_1 ... error
ERROR: for pyspider_nginx_1 Cannot...
```

Hi there, I'm wondering whether this project is still alive, since the latest release was back in April 2018 and there were some issues related to...

* pyspider version: 0.3.10
* Operating system: Ubuntu 18.04.2 LTS
* Start up command: pyspider all

For example:

```
def on_start(self):
    ...
    val = 890984766742986795
    self.crawl(some_url, callback=self.topic_list_page, save={'val': val})

def topic_list_page(self,...
```