weibo-search

What causes this kind of error?

Open cjfw opened this issue 1 year ago • 5 comments

2025-02-14 13:45:58 [scrapy.core.scraper] ERROR: Spider error processing <GET https://s.weibo.com/weibo?q=%E6%96%B0%E8%83%BD%E6%BA%90%E6%B1%BD%E8%BD%A6%E4%BA%A7%E4%B8%9A%E5%8F%91%E5%B1%95&typeall=1&suball=1&timescope=custom:2014-03-13-0:2014-03-14-0&page=3> (referer: https://s.weibo.com/weibo?q=%E6%96%B0%E8%83%BD%E6%BA%90%E6%B1%BD%E8%BD%A6%E4%BA%A7%E4%B8%9A%E5%8F%91%E5%B1%95&typeall=1&suball=1&timescope=custom:2014-03-13-0:2014-03-14-0&page=2)
urllib3.exceptions.SSLError: [SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1000)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "I:\pycharm社区免费版\venv\Lib\site-packages\requests\adapters.py", line 667, in send
    resp = conn.urlopen(
           ^^^^^^^^^^^^^
  File "I:\pycharm社区免费版\venv\Lib\site-packages\urllib3\connectionpool.py", line 841, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "I:\pycharm社区免费版\venv\Lib\site-packages\urllib3\util\retry.py", line 519, in increment
    raise MaxRetryError(_pool, url, reason) from reason  # type: ignore[arg-type]
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='weibo.com', port=443): Max retries exceeded with url: /ajax/statuses/show?id=AAJ8QqtnF&locale=zh-CN (Caused by SSLError(SSLEOFError(8, '[SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1000)')))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "I:\pycharm社区免费版\venv\Lib\site-packages\scrapy\utils\defer.py", line 327, in iter_errback
    yield next(it)
          ^^^^^^^^
  File "I:\pycharm社区免费版\venv\Lib\site-packages\scrapy\utils\python.py", line 368, in __next__
    return next(self.data)
           ^^^^^^^^^^^^^^^
  File "I:\pycharm社区免费版\venv\Lib\site-packages\scrapy\utils\python.py", line 368, in __next__
    return next(self.data)
           ^^^^^^^^^^^^^^^
  File "I:\pycharm社区免费版\venv\Lib\site-packages\scrapy\core\spidermw.py", line 106, in process_sync
    yield from iterable
  File "I:\pycharm社区免费版\venv\Lib\site-packages\scrapy\spidermiddlewares\referer.py", line 379, in <genexpr>
    return (self._set_referer(r, response) for r in result)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "I:\pycharm社区免费版\venv\Lib\site-packages\scrapy\core\spidermw.py", line 106, in process_sync
    yield from iterable
  File "I:\pycharm社区免费版\venv\Lib\site-packages\scrapy\spidermiddlewares\urllength.py", line 57, in <genexpr>
    return (r for r in result if self._filter(r, spider))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "I:\pycharm社区免费版\venv\Lib\site-packages\scrapy\core\spidermw.py", line 106, in process_sync
    yield from iterable
  File "I:\pycharm社区免费版\venv\Lib\site-packages\scrapy\spidermiddlewares\depth.py", line 54, in <genexpr>
    return (r for r in result if self._filter(r, response, spider))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "I:\pycharm社区免费版\venv\Lib\site-packages\scrapy\core\spidermw.py", line 106, in process_sync
    yield from iterable
  File "E:\Study\weibo-search-master\weibo-search-master\weibo\spiders\search.py", line 278, in parse_page
    for weibo in self.parse_weibo(response):
  File "E:\Study\weibo-search-master\weibo-search-master\weibo\spiders\search.py", line 537, in parse_weibo
    weibo["ip"] = self.get_ip(bid)
                  ^^^^^^^^^^^^^^^^
  File "E:\Study\weibo-search-master\weibo-search-master\weibo\spiders\search.py", line 291, in get_ip
    response = requests.get(url, headers=self.settings.get('DEFAULT_REQUEST_HEADERS'))
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "I:\pycharm社区免费版\venv\Lib\site-packages\requests\api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "I:\pycharm社区免费版\venv\Lib\site-packages\requests\api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "I:\pycharm社区免费版\venv\Lib\site-packages\requests\sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "I:\pycharm社区免费版\venv\Lib\site-packages\requests\sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "I:\pycharm社区免费版\venv\Lib\site-packages\requests\adapters.py", line 698, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='weibo.com', port=443): Max retries exceeded with url: /ajax/statuses/show?id=AAJ8QqtnF&locale=zh-CN (Caused by SSLError(SSLEOFError(8, '[SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1000)')))

cjfw avatar Feb 14 '25 05:02 cjfw

It may be an SSL problem; see "HTTPSConnectionPool: Max retries exceeded with url (Caused by SSLError)".

Some posts trigger this error while others are crawled normally. Why is that?

cjfw avatar Feb 14 '25 07:02 cjfw

It may be a network issue; I'm not sure.

dataabc avatar Feb 14 '25 11:02 dataabc

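It may be a network issue; I'm not sure.

I see the same thing: the error appears whenever I turn on a proxy/VPN and goes away when I turn it off. Is there a fix?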

Inexa1 avatar Dec 18 '25 05:12 Inexa1

@Inexa1 Try not to use a proxy/VPN while running the spider.

dataabc avatar Dec 18 '25 08:12 dataabc
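
The traceback shows the `SSLError` escaping from the bare `requests.get` call in `get_ip` (`search.py`, line 291), which is why one failed lookup aborts parsing of the whole results page. Below is a minimal sketch of a more tolerant lookup; it is not the project's code. The helper name `get_ip_safe`, the injectable `get` argument, the retry count, the timeout, and the `region_name` field are all assumptions made for illustration; only the URL shape is taken from the traceback.

```python
import requests


def get_ip_safe(bid, headers=None, get=requests.get, retries=2, timeout=10):
    """Return the region string for a post, or "" if the lookup fails.

    Hypothetical helper: the name, the `get` hook, and the `region_name`
    field are assumptions, not part of weibo-search.
    """
    # URL shape copied from the traceback in this thread.
    url = f"https://weibo.com/ajax/statuses/show?id={bid}&locale=zh-CN"
    for _ in range(retries + 1):
        try:
            response = get(url, headers=headers, timeout=timeout)
        except requests.exceptions.RequestException:
            # Covers SSLError, ConnectionError, Timeout, etc.; try again.
            continue
        if response.status_code != 200:
            return ""
        try:
            data = response.json()
        except ValueError:
            # Body was not valid JSON.
            return ""
        if not isinstance(data, dict):
            return ""
        return str(data.get("region_name", ""))
    # Every attempt raised a network error: give up quietly.
    return ""
```

With something like this, a post whose lookup keeps failing would get an empty `ip` value instead of stopping the page. If the failures only appear while a system proxy is active, one thing that may be worth trying is passing `session.get` from a `requests.Session` whose `trust_env` attribute is set to `False`, so that requests ignores proxy settings from the environment; whether that helps depends on how the proxy or VPN intercepts traffic.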