Wenshu_Spider

The encryption seems to have changed; docid can no longer be obtained

Open FiveMeter opened this issue 5 years ago • 3 comments

I'm getting several different errors that I don't quite understand. Could you take a look?

1.
Traceback (most recent call last):
  File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/scrapy/utils/defer.py", line 102, in iter_errback
    yield next(it)
  File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/scrapy/spidermiddlewares/offsite.py", line 30, in process_spider_output
    for x in result:
  File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/scrapy/spidermiddlewares/referer.py", line 339, in <genexpr>
    return (_set_referer(r) for r in result or ())
  File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/scrapy/spidermiddlewares/urllength.py", line 37, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/scrapy/spidermiddlewares/depth.py", line 58, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "/Users/FiveMeter/Desktop/kaoputou-project/wenshu_monitor/Wenshu/spiders/wenshu.py", line 118, in get_docid
    result = eval(json.loads(html))
  File "<string>", line 1
["RunEval":"w61aSW7Cg0AQfAvClg/Cg8KIw7IBw6TCk8KfwpDDowhZFnZiDjHDkcKYwpwsw789QCwCw4wCwoTCrRlKQmUxa3V3dQ8gby/DkcOpfAtFw7TClcOsw54SEV0/XsOfRcO8wrnCvxzDhT4+wp3CmcOneDwAwpDChhc4AcKwDsKFw5hgw4hKw5IVVQnCsQXDgMOvw7AsAMORwoPDhxBcw7gTwo7CgcOBwoDDlUcAwrrCgyvDoUXDmA9AaBA4AMKiAj/DgTjCmATCgBBswrYGADkBwqApAAA6wrPDnC4AVXB9w7/CsMObwoTDscO1wpbCiMOvMMKJw4XDhj/DsEPCkF7CjHnDt0c2wqzCuMOuD8KXw7/DjsOkKQTCgcOHwpzCvMODw6XDlTXCs8KOw6bCnsO8dsO7w7fCl2vDjsKyw6Z0BHfCsh4ew7DDtMKnwrZhdcK4w7PCnMKgw5/CrMOGWznDlsOIwqnCvMKawrrCvcOYKmctwrtIwqLCoMK0w4HCsHZBwrLCnUdLwrfCjcK0W8KOw6gywqzDs8OYw79Nw6gxwqvDr8OUQcOmD8K3SFUecsKmw6YXwqslw71wOw/DqcKNUsKuwpjDtMOdTcOeM0VEYVfCpcKiwqXCrUV5NcOWHcKSNk3CtXrCjy3CmkrDryhUJ8OdR2PCpsOqwqwcwpnDhMO0ZmscUG7DlcKdwpdTwolAOsKIw5U8Ryk4w7XCnU3DjwTDj8OHwq7CpcOJZSbCuTXDrz0nwrHCgQnDkD5dF3DDtsOTYGEsCFQMb2nCvcKqwq7CqsOrbMO9ccKrw5paXsKQw7d4RSPDpDzDsMORw6teLyLCg8K8wqzDukBBSxbCusOUwpVywrvDvTfCqMK+dlp0wpzCkMKMNjVmw6ZEwpYrwocpwo8nPhtQw6TCu0Z6w4zCj0ZrdMOpw6wcwpfDoMKfZsKiFsK9K8KOwrswwrnCisOnMsOXw78B",] ^ SyntaxError: invalid syntax

2.
Traceback (most recent call last):
  File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/scrapy/utils/defer.py", line 102, in iter_errback
    yield next(it)
  File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/scrapy/spidermiddlewares/offsite.py", line 30, in process_spider_output
    for x in result:
  File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/scrapy/spidermiddlewares/referer.py", line 339, in <genexpr>
    return (_set_referer(r) for r in result or ())
  File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/scrapy/spidermiddlewares/urllength.py", line 37, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/scrapy/spidermiddlewares/depth.py", line 58, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "/Users/FiveMeter/Desktop/kaoputou-project/wenshu_monitor/Wenshu/spiders/wenshu.py", line 123, in get_docid
    docid = self.js_2.call('getdocid', runeval, casewenshuid)
  File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/execjs/_abstract_runtime_context.py", line 37, in call
    return self._call(name, *args)
  File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/execjs/_external_runtime.py", line 92, in _call
    return self._eval("{identifier}.apply(this, {args})".format(identifier=identifier, args=args))
  File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/execjs/_external_runtime.py", line 78, in _eval
    return self.exec_(code)
  File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/execjs/_abstract_runtime_context.py", line 18, in exec_
    return self._exec_(source)
  File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/execjs/_external_runtime.py", line 88, in _exec_
    return self._extract_result(output)
  File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/execjs/_external_runtime.py", line 167, in _extract_result
    raise ProgramError(value)
execjs._exceptions.ProgramError: Error: Malformed UTF-8 data

3.
Traceback (most recent call last):
  File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/scrapy/utils/defer.py", line 102, in iter_errback
    yield next(it)
  File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/scrapy/spidermiddlewares/offsite.py", line 30, in process_spider_output
    for x in result:
  File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/scrapy/spidermiddlewares/referer.py", line 339, in <genexpr>
    return (_set_referer(r) for r in result or ())
  File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/scrapy/spidermiddlewares/urllength.py", line 37, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "/Users/FiveMeter/Desktop/kaoputou-project/venv/wenshu-venv/lib/python3.6/site-packages/scrapy/spidermiddlewares/depth.py", line 58, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "/Users/FiveMeter/Desktop/kaoputou-project/wenshu_monitor/Wenshu/spiders/wenshu.py", line 150, in get_detail
    content_1 = json.loads(re.search(r'JSON.stringify((.*?));', html).group(1))
AttributeError: 'NoneType' object has no attribute 'group'

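Also, for the filter conditions I crawl with, values such as the record count are wrong; they look like randomly generated numbers. Could you explain this? Thanks.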

FiveMeter, Dec 24 '18 04:12

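It's not an encryption problem. The site only changed the structure of its HTML data, so the fields have to be extracted again. I've already updated the code; you only need to sync the wenshu.py file :smile: @FiveMeter Thanks for the support!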

Henryhaohao, Dec 24 '18 06:12
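For readers who hit tracebacks 1 and 3 above: both come from parsing code that assumes a fixed response layout and crashes when the layout changes. Below is a minimal sketch of a more defensive approach, not the code in the updated wenshu.py. The helper names `extract_run_eval` and `extract_detail` and both regular expressions are assumptions made for illustration; the real field names and page layout may differ.

```python
import json
import re

# Hypothetical pattern: assumes the detail page embeds its data as
# JSON.stringify({...}); somewhere in the HTML.
STRINGIFY_RE = re.compile(r'JSON\.stringify\((.*?)\);', re.S)

# Hypothetical pattern: pulls the quoted value that follows "RunEval":
# without passing the whole payload to eval().
RUNEVAL_RE = re.compile(r'"RunEval"\s*:\s*"([^"]+)"')


def extract_run_eval(payload):
    """Return the RunEval token, or None if the payload has no such field."""
    match = RUNEVAL_RE.search(payload)
    return match.group(1) if match else None


def extract_detail(html):
    """Return the object passed to JSON.stringify(...), or None if it is
    missing or is not valid JSON."""
    match = STRINGIFY_RE.search(html)
    if match is None:
        # re.search found nothing: calling .group(1) here is what raises
        # AttributeError in traceback 3.
        return None
    try:
        return json.loads(match.group(1))
    except json.JSONDecodeError:
        return None
```

A spider callback could check for `None` and log the raw response instead of raising, which makes a future layout change easier to spot in the logs.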

Hi, besides the errors above, every time I crawl for less than half an hour I also run into this problem:

2018-12-25 14:17:36 [scrapy.extensions.logstats] INFO: Crawled 1748 pages (at 0 pages/min), scraped 928 items (at 0 items/min)
2018-12-25 14:18:36 [scrapy.extensions.logstats] INFO: Crawled 1748 pages (at 0 pages/min), scraped 928 items (at 0 items/min)
2018-12-25 14:19:36 [scrapy.extensions.logstats] INFO: Crawled 1748 pages (at 0 pages/min), scraped 928 items (at 0 items/min)
2018-12-25 14:20:36 [scrapy.extensions.logstats] INFO: Crawled 1748 pages (at 0 pages/min), scraped 928 items (at 0 items/min)
2018-12-25 14:21:36 [scrapy.extensions.logstats] INFO: Crawled 1748 pages (at 0 pages/min), scraped 928 items (at 0 items/min)
2018-12-25 14:22:36 [scrapy.extensions.logstats] INFO: Crawled 1748 pages (at 0 pages/min), scraped 928 items (at 0 items/min)

The log keeps showing the lines above. How should I deal with this?

kingshrimp, Dec 25 '18 06:12
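The thread does not identify why the crawl stalls at 0 pages/min. As a starting point for diagnosis only, here is a sketch of a `settings.py` fragment that makes hung requests fail sooner and makes the log more informative. The setting names are standard Scrapy settings; the values are illustrative assumptions and are not taken from this project.

```python
# settings.py fragment (sketch; values are illustrative, not tuned for this site)

# Give up on a request that has not completed after 30 seconds.
# Scrapy's default is 180 seconds.
DOWNLOAD_TIMEOUT = 30

# Retry failed or timed-out requests a few times before dropping them.
# Scrapy's default for RETRY_TIMES is 2.
RETRY_ENABLED = True
RETRY_TIMES = 3

# Log at DEBUG level so each request, retry, and timeout shows up between
# the once-a-minute logstats lines.
LOG_LEVEL = "DEBUG"
```

If the DEBUG log then shows repeated timeouts or retries against the same URLs, the stall is more likely on the network or server side than in the spider's parsing code.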

Hello, has this been resolved? Hoping for a reply.

wjhwangjinhui, Jan 11 '19 09:01