pyspider

A Powerful Spider (Web Crawler) System in Python.

Results 113 pyspider issues

* pyspider version: 0.3.10 * Operating system: win10 * Start up command: pyspider all Why do the results have a column named '...'? What is it used for? Can I delete it? Why is there an empty column named ".." at the end of the results output, and what does it do? Can I delete it?...

* pyspider version: 0.3.10 * Operating system: win10 * Start up command: ### Expected behavior Command line executes successfully ### Actual behavior (base) C:\ProgramData\Anaconda3\Scripts>pyspider [all] Traceback (most recent call last):...

## 🐛 Bug Report Script content needs a security check; without one it could allow RCE ## To Reproduce 1. Start up a pyspider server. 2. Access the task webpage. 3. Upload a task...

    from pyspider.libs.base_handler import *

    class Handler(BaseHandler):
        crawl_config = {
        }

        @every(minutes=24 * 60)
        def on_start(self):
            self.crawl('https://movie.douban.com/explore#!type=movie&tag=%E7%83%AD%E9%97%A8&sort=recommend&page_limit=20&page_start=0',
                       fetch_type='js',
                       js_script="""
                       function() {
                           setTimeout("$('.more').click();console.log('finish');", 2000);
                       }""",
                       callback=self.phantomjs_parser)

        @config(age=10 * 24 * 60 *...

* pyspider version: latest * Operating system: CentOS 7 * Start up command: sudo docker run --network=pyspider --name scheduler -d -p 23333:23333 --restart=always binux/pyspider --taskdb "mysql+taskdb://root:[email protected]:3306/taskdb" --resultdb "mysql+projectdb://root:[email protected]:3306/resultdb" --projectdb "mysql+projectdb://root:[email protected]:3306/projectdb" --message-queue...

I have an error: "ERROR: After October 2020 you may experience errors when installing or updating packages. This is because pip will change the way that it resolves dependency conflicts....

* pyspider version: 0.3.10 * Operating system: macOS Big Sur * Start up command: env LC_ALL=en_US.utf-8 LANG=en_US.utf-8 pyspider --version ### Expected behavior I'm quite new to this framework. How can I...

### Expected behavior I want to add a header with an empty value. For example, my request requires the header 'Authorization: ', that is, an 'Authorization' header whose value is ''. `url...