I'm running into several errors that I don't quite understand. Could you help me take a look?
1.
Traceback (most recent call last):
File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/scrapy/utils/defer.py", line 102, in iter_errback
yield next(it)
File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/scrapy/spidermiddlewares/offsite.py", line 30, in process_spider_output
for x in result:
File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/scrapy/spidermiddlewares/referer.py", line 339, in &lt;genexpr&gt;
return (_set_referer(r) for r in result or ())
File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/scrapy/spidermiddlewares/urllength.py", line 37, in &lt;genexpr&gt;
return (r for r in result or () if _filter(r))
File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/scrapy/spidermiddlewares/depth.py", line 58, in &lt;genexpr&gt;
return (r for r in result or () if _filter(r))
File "/Users/FiveMeter/Desktop/kaoputou-project/wenshu_monitor/Wenshu/spiders/wenshu.py", line 118, in get_docid
result = eval(json.loads(html))
File "&lt;string&gt;", line 1
["RunEval":"w61aSW7Cg0AQfAvClg/Cg8KIw7IBw6TCk8KfwpDDowhZFnZiDjHDkcKYwpwsw789QCwCw4wCwoTCrRlKQmUxa3V3dQ8gby/DkcOpfAtFw7TClcOsw54SEV0/XsOfRcO8wrnCvxzDhT4+wp3CmcOneDwAwpDChhc4AcKwDsKFw5hgw4hKw5IVVQnCsQXDgMOvw7AsAMORwoPDhxBcw7gTwo7CgcOBwoDDlUcAwrrCgyvDoUXDmA9AaBA4AMKiAj/DgTjCmATCgBBswrYGADkBwqApAAA6wrPDnC4AVXB9w7/CsMObwoTDscO1wpbCiMOvMMKJw4XDhj/DsEPCkF7CjHnDt0c2wqzCuMOuD8KXw7/DjsOkKQTCgcOHwpzCvMODw6XDlTXCs8KOw6bCnsO8dsO7w7fCl2vDjsKyw6Z0BHfCsh4ew7DDtMKnwrZhdcK4w7PCnMKgw5/CrMOGWznDlsOIwqnCvMKawrrCvcOYKmctwrtIwqLCoMK0w4HCsHZBwrLCnUdLwrfCjcK0W8KOw6gywqzDs8OYw79Nw6gxwqvDr8OUQcOmD8K3SFUecsKmw6YXwqslw71wOw/DqcKNUsKuwpjDtMOdTcOeM0VEYVfCpcKiwqXCrUV5NcOWHcKSNk3CtXrCjy3CmkrDryhUJ8OdR2PCpsOqwqwcwpnDhMO0ZmscUG7DlcKdwpdTwolAOsKIw5U8Ryk4w7XCnU3DjwTDj8OHwq7CpcOJZSbCuTXDrz0nwrHCgQnDkD5dF3DDtsOTYGEsCFQMb2nCvcKqwq7CqsOrbMO9ccKrw5paXsKQw7d4RSPDpDzDsMORw6teLyLCg8K8wqzDukBBSxbCusOUwpVywrvDvTfCqMK+dlp0wpzCkMKMNjVmw6ZEwpYrwocpwo8nPhtQw6TCu0Z6w4zCj0ZrdMOpw6wcwpfDoMKfZsKiFsK9K8KOwrswwrnCisOnMsOXw78B",]
^
SyntaxError: invalid syntax
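For the first error: `json.loads(html)` returns a string, and `eval()` then fails because the site sometimes sends back a malformed literal such as `["RunEval":"...",]` (a colon inside a list is neither valid JSON nor valid Python). A defensive wrapper along these lines turns the crash into a retryable failure. This is a sketch, and `parse_docid_payload` is an illustrative name, not part of the original spider:

```python
import json

def parse_docid_payload(html):
    """Parse the DocID list response defensively.

    The outer layer is a JSON-encoded string; the inner payload is
    normally a Python-style literal, but the site occasionally returns
    a broken one like ["RunEval":"...",], which makes eval() raise
    SyntaxError.  Return None in that case so the caller can retry.
    """
    text = json.loads(html)           # outer layer: JSON-encoded string
    try:
        return eval(text)             # mirrors the original spider's approach
    except SyntaxError:
        return None                   # anti-crawl garbage; caller should re-request
```

Returning `None` lets `get_docid` re-request the page instead of killing the whole response callback.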
2.
Traceback (most recent call last):
File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/scrapy/utils/defer.py", line 102, in iter_errback
yield next(it)
File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/scrapy/spidermiddlewares/offsite.py", line 30, in process_spider_output
for x in result:
File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/scrapy/spidermiddlewares/referer.py", line 339, in &lt;genexpr&gt;
return (_set_referer(r) for r in result or ())
File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/scrapy/spidermiddlewares/urllength.py", line 37, in &lt;genexpr&gt;
return (r for r in result or () if _filter(r))
File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/scrapy/spidermiddlewares/depth.py", line 58, in &lt;genexpr&gt;
return (r for r in result or () if _filter(r))
File "/Users/FiveMeter/Desktop/kaoputou-project/wenshu_monitor/Wenshu/spiders/wenshu.py", line 123, in get_docid
docid = self.js_2.call('getdocid', runeval, casewenshuid)
File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/execjs/_abstract_runtime_context.py", line 37, in call
return self._call(name, *args)
File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/execjs/_external_runtime.py", line 92, in _call
return self._eval("{identifier}.apply(this, {args})".format(identifier=identifier, args=args))
File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/execjs/_external_runtime.py", line 78, in _eval
return self.exec_(code)
File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/execjs/_abstract_runtime_context.py", line 18, in exec_
return self._exec_(source)
File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/execjs/_external_runtime.py", line 88, in _exec_
return self._extract_result(output)
File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/execjs/_external_runtime.py", line 167, in _extract_result
raise ProgramError(value)
execjs._exceptions.ProgramError: Error: Malformed UTF-8 data
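The "Malformed UTF-8 data" error is raised inside the JavaScript `getdocid` routine: when the `RunEval` value captured earlier is garbage (an anti-crawl response from the site), the JS side cannot decrypt it, and execjs surfaces the failure as a `ProgramError`. One way to cope is a small retry helper; the helper name and retry count below are illustrative, not part of the original spider:

```python
def call_with_retry(fn, *args, retries=3, exceptions=(Exception,)):
    """Call fn(*args), swallowing up to `retries` failures.

    Intended for wrapping js_ctx.call('getdocid', runeval, doc_id),
    where a bad RunEval value makes execjs raise ProgramError
    ("Malformed UTF-8 data").  Returns None if every attempt fails,
    so the caller can fetch a fresh RunEval and try again.
    """
    for _ in range(retries):
        try:
            return fn(*args)
        except exceptions:
            continue                  # bad key material; try once more
    return None
```

In the spider this might look like `call_with_retry(self.js_2.call, 'getdocid', runeval, casewenshuid, exceptions=(execjs.ProgramError,))`; a `None` result means the `RunEval` itself needs to be re-fetched rather than retried.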
3.
Traceback (most recent call last):
File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/scrapy/utils/defer.py", line 102, in iter_errback
yield next(it)
File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/scrapy/spidermiddlewares/offsite.py", line 30, in process_spider_output
for x in result:
File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/scrapy/spidermiddlewares/referer.py", line 339, in &lt;genexpr&gt;
return (_set_referer(r) for r in result or ())
File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/scrapy/spidermiddlewares/urllength.py", line 37, in &lt;genexpr&gt;
return (r for r in result or () if _filter(r))
File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/scrapy/spidermiddlewares/depth.py", line 58, in &lt;genexpr&gt;
return (r for r in result or () if _filter(r))
File "/Users/FiveMeter/Desktop/kaoputou-project/wenshu_monitor/Wenshu/spiders/wenshu.py", line 150, in get_detail
content_1 = json.loads(re.search(r'JSON.stringify((.*?));', html).group(1))
AttributeError: 'NoneType' object has no attribute 'group'
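The third traceback is simply `re.search` returning `None` when a detail page no longer contains the expected `JSON.stringify(...)` fragment (for example, after the site changes its markup). Checking the match before calling `.group()` turns this into a handled condition. `extract_stringify` is an illustrative name; note also that the dots and parentheses are escaped here, whereas the original pattern left them as regex metacharacters:

```python
import json
import re

def extract_stringify(html):
    """Pull the argument of JSON.stringify(...) out of a detail page.

    Returns the parsed object, or None when the pattern is absent
    (e.g. the site changed its HTML), instead of raising
    AttributeError by calling .group() on a None match.
    """
    m = re.search(r'JSON\.stringify\((.*?)\);', html)
    if m is None:
        return None
    return json.loads(m.group(1))
```

A `None` return can then be logged or turned into a retry in `get_detail`, rather than crashing the callback.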
One more thing: under my filter conditions, the numbers I scrape, such as the result counts, are wrong; they look as if they were generated at random.
Could you explain? Thanks!
It's not an encryption problem. The site just changed its HTML data structure, so the fields need to be re-extracted. I've already updated the code; you only need to sync the wenshu.py file :smile: @FiveMeter
Thanks for the support~!
Hi, besides the errors above, every time I've been crawling for less than half an hour, I also run into this:
2018-12-25 14:17:36 [scrapy.extensions.logstats] INFO: Crawled 1748 pages (at 0 pages/min), scraped 928 items (at 0 items/min)
2018-12-25 14:18:36 [scrapy.extensions.logstats] INFO: Crawled 1748 pages (at 0 pages/min), scraped 928 items (at 0 items/min)
2018-12-25 14:19:36 [scrapy.extensions.logstats] INFO: Crawled 1748 pages (at 0 pages/min), scraped 928 items (at 0 items/min)
2018-12-25 14:20:36 [scrapy.extensions.logstats] INFO: Crawled 1748 pages (at 0 pages/min), scraped 928 items (at 0 items/min)
2018-12-25 14:21:36 [scrapy.extensions.logstats] INFO: Crawled 1748 pages (at 0 pages/min), scraped 928 items (at 0 items/min)
2018-12-25 14:22:36 [scrapy.extensions.logstats] INFO: Crawled 1748 pages (at 0 pages/min), scraped 928 items (at 0 items/min)
The log keeps printing the lines above. How should I deal with this?
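When Scrapy reports 0 pages/min for many minutes in a row, the usual cause is that the site has started hanging or silently throttling connections, so the pending requests never complete and never time out. A few Scrapy settings make the crawler fail fast and back off instead of stalling; the values below are illustrative guesses, not taken from this project's settings.py:

```python
# settings.py -- illustrative anti-stall settings, not the project's originals
DOWNLOAD_TIMEOUT = 30        # drop requests that hang instead of waiting forever
RETRY_ENABLED = True
RETRY_TIMES = 3              # re-issue timed-out / failed requests a few times
DOWNLOAD_DELAY = 2           # slow down to look less like a bot
AUTOTHROTTLE_ENABLED = True  # back off automatically when the site slows down
```

With a timeout and retries in place, a stalled crawl either recovers or produces visible timeout errors in the log, which is much easier to diagnose than a silent 0 pages/min.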