A Powerful Spider(Web Crawler) System in Python.

Results 113 pyspider issues

Hi, as shown in the following full dependency graph of **_pyspider_**, pyspider requires **_chardet_** (the latest version), while the installed version of **_requests_** (2.22.0) requires **_chardet>=3.0.2,_**

Dear all, I want to mount the scripts to a local folder. I've installed davfs2, but when I run the mount command, it asks for a username and password. I didn't set up any...

When I installed pyspider, the following error occurred. Complete output from command python setup.py egg_info: Using curl-config (libcurl 7.54.0) Traceback (most recent call last): File "", line 1, in File "/private/tmp/pip-install-Ei7TZj/pycurl/setup.py",...

{ "taskdb": "elasticsearch+taskdb://host:9200/?index=taskdb", "projectdb": "elasticsearch+projectdb://host:9200/?index=projectdb", "resultdb": "elasticsearch+resultdb://host:9200/?index=resultdb", "webui": { "username": "admin", "password": "111111", "need-auth": true } }

How do I configure an Elasticsearch cluster?

* pyspider version: `pyspider --version` `pyspider, version 0.3.10`
* Operating system: `lsb_release -v` `LSB Version: core-9.20160110ubuntu0.2-amd64:core-9.20160110ubuntu0.2-noarch:security-9.20160110ubuntu0.2-amd64:security-9.20160110ubuntu0.2-noarch`
* Start up command: `pyspider -c config.json`

### Expected behavior

Show a completely web...

The task runs normally, but no results are returned: ![image](https://user-images.githubusercontent.com/3387095/57977346-33a1cb80-7a29-11e9-9841-d60b98628154.png) ![image](https://user-images.githubusercontent.com/3387095/57977347-3bfa0680-7a29-11e9-92ea-64d8a816ae82.png) ![image](https://user-images.githubusercontent.com/3387095/57977349-546a2100-7a29-11e9-9df2-8a415c8fb7fe.png) It seems that detail_page is never run. `#!/usr/bin/env python # -*- encoding: utf-8 -*- # Created on 2019-05-19 11:20:28 # Project: test from pyspider.libs.base_handler import * class Handler(BaseHandler): crawl_config = {...

* pyspider version:
* Operating system:
* Start up command:

### Expected behavior

### Actual behavior

### How to reproduce

Dear all, could you please advise how to avoid the error 'Timeout before first response.'? I've already set 'connect_timeout': 60 in crawl_config, but this error occurs in less than 60 seconds. Thanks...
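
For context on the question above: pyspider's `self.crawl` documents two separate limits, `connect_timeout` (initial connection, default 20 seconds) and `timeout` (maximum time to fetch the page, default 120 seconds). Raising only `connect_timeout` leaves the overall `timeout` unchanged. The sketch below is a plain dict shaped like a `crawl_config`. The values and the `effective_limits` helper are hypothetical, added only for illustration, and the report does not confirm that either setting resolves this particular error.

```python
# Sketch only: a plain dict mirroring the shape of a pyspider `crawl_config`.
# The values are illustrative assumptions, not recommendations.
crawl_config = {
    "connect_timeout": 60,  # seconds allowed for the initial connection
    "timeout": 180,         # seconds allowed for fetching the whole page
}


def effective_limits(config, default_connect=20, default_total=120):
    """Hypothetical helper: return (connect, total) limits in seconds,
    falling back to the documented defaults when a key is missing."""
    connect = config.get("connect_timeout", default_connect)
    total = config.get("timeout", default_total)
    return connect, total


if __name__ == "__main__":
    print(effective_limits(crawl_config))
    print(effective_limits({"connect_timeout": 60}))
```

With only `connect_timeout` set, as in the report, the helper shows the overall limit staying at its default, which is one setting to check when a fetch fails early.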
