pyspider
A Powerful Spider(Web Crawler) System in Python.
Hi, as shown in the following full dependency graph of **_pyspider_**, pyspider requires **_chardet_** (the latest version), while the installed version of **_requests_** (2.22.0) requires chardet>=3.0.2,
Dear all, I want to mount the scripts to a local folder. I've installed davfs2, but when I run the mount command, it asks for a username and password. I didn't set up any...
When I installed pyspider, the following error occurred. Complete output from command python setup.py egg_info: Using curl-config (libcurl 7.54.0) Traceback (most recent call last): File "", line 1, in File "/private/tmp/pip-install-Ei7TZj/pycurl/setup.py",...
{ "taskdb": "elasticsearch+taskdb://host:9200/?index=taskdb", "projectdb": "elasticsearch+projectdb://host:9200/?index=projectdb", "resultdb": "elasticsearch+resultdb://host:9200/?index=resultdb", "webui": { "username": "admin", "password": "111111", "need-auth": true } } How do I configure an Elasticsearch cluster with this setup?
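One way to sanity-check a config like the one in this post before starting pyspider is to parse each database URL and confirm which host, port, and index it points at. The sketch below is only an illustration using Python's standard library: the helper name `describe_db_urls` is hypothetical, `host` is the placeholder from the post, and it does not show how pyspider handles a multi-node Elasticsearch cluster, which the post leaves open.

```python
import json
from urllib.parse import urlsplit, parse_qs

# Config text copied from the post; "host" is the poster's placeholder.
CONFIG_TEXT = """
{
  "taskdb": "elasticsearch+taskdb://host:9200/?index=taskdb",
  "projectdb": "elasticsearch+projectdb://host:9200/?index=projectdb",
  "resultdb": "elasticsearch+resultdb://host:9200/?index=resultdb",
  "webui": {"username": "admin", "password": "111111", "need-auth": true}
}
"""


def describe_db_urls(config):
    """Return {name: (hostname, port, index)} for every string-valued entry.

    Hypothetical helper for inspection only; it is not part of pyspider.
    """
    summary = {}
    for name, value in config.items():
        if not isinstance(value, str):
            continue  # skip nested sections such as "webui"
        parts = urlsplit(value)
        index = parse_qs(parts.query).get("index", [None])[0]
        summary[name] = (parts.hostname, parts.port, index)
    return summary


config = json.loads(CONFIG_TEXT)
for name, info in describe_db_urls(config).items():
    print(name, info)
```

Each database entry here names a single host, so any cluster-specific syntax would have to come from pyspider's own documentation rather than from this sketch.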
* pyspider version: `pyspider --version` `pyspider, version 0.3.10`
* Operating system: `lsb_release -v` `LSB Version: core-9.20160110ubuntu0.2-amd64:core-9.20160110ubuntu0.2-noarch:security-9.20160110ubuntu0.2-amd64:security-9.20160110ubuntu0.2-noarch`
* Start up command: `pyspider -c config.json`

### Expected behavior

Show a completely web...
The task runs normally but returns no results: it seems that detail_page is never run. `#!/usr/bin/env python # -*- encoding: utf-8 -*- # Created on 2019-05-19 11:20:28 # Project: test from pyspider.libs.base_handler import * class Handler(BaseHandler): crawl_config = {...
* pyspider version: * Operating system: * Start up command: ### Expected behavior ### Actual behavior ### How to reproduce
Dear all, could you please advise how to avoid the error 'Timeout before first response.'? I've already set 'connect_timeout' : 60 in crawl_config, but this error occurs in less than 60 seconds. Thanks...
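For context, pyspider's `self.crawl` documentation lists two separate timeout options, `connect_timeout` (the initial connection) and `timeout` (the whole fetch), and both can be placed in `crawl_config`. The fragment below is a sketch with assumed values: the post does not establish whether changing either option removes the 'Timeout before first response.' error.

```python
# Sketch of a crawl_config dict as it would appear inside a pyspider Handler class.
# The 180-second value is an assumption for illustration, not a recommendation.
crawl_config = {
    "connect_timeout": 60,  # seconds allowed for the initial connection (value from the post)
    "timeout": 180,         # seconds allowed for the whole fetch (assumed value)
}

# Sanity check: an overall limit shorter than the connection limit
# would make the connection limit unreachable.
assert crawl_config["timeout"] >= crawl_config["connect_timeout"]
```

Because the error is reported to appear in under 60 seconds, a different limit than `connect_timeout` may be the one being hit. That is a guess rather than something the post confirms.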