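If I set both the start date and the end date to 7.17, the crawl returns posts from 7.18 as well as 7.17. It looks like not many people have run into this, so may I ask what the cause is?
Got it, thanks to the author! One more question: after running for a while, the error below shows up. What causes it? I couldn't find a similar issue!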
2024-08-01 07:45:51 [scrapy.core.scraper] ERROR: Spider error processing <GET https://s.weibo.com/weibo?q=%E6%9A%B4%E9%9B%A8&typeall=1&suball=1&timescope=custom:2024-07-18-6:2024-07-18-7&page=1> (referer: https://s.weibo.com/weibo?q=%E6%9A%B4%E9%9B%A8&typeall=1&suball=1&timescope=custom:2024-07-18-0:2024-07-19-0&page=1)
urllib3.exceptions.SSLError: [SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1006)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Users\lmy\anaconda3\Lib\site-packages\requests\adapters.py", line 486, in send
resp = conn.urlopen(
^^^^^^^^^^^^^
File "C:\Users\lmy\anaconda3\Lib\site-packages\urllib3\connectionpool.py", line 845, in urlopen
retries = retries.increment(
^^^^^^^^^^^^^^^^^^
File "C:\Users\lmy\anaconda3\Lib\site-packages\urllib3\util\retry.py", line 515, in increment
raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='weibo.com', port=443): Max retries exceeded with url: /ajax/statuses/show?id=Oo4KYzuaR&locale=zh-CN (Caused by SSLError(SSLEOFError(8, '[SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1006)')))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\lmy\anaconda3\Lib\site-packages\scrapy\utils\defer.py", line 279, in iter_errback
yield next(it)
^^^^^^^^
File "C:\Users\lmy\anaconda3\Lib\site-packages\scrapy\utils\python.py", line 350, in __next__
return next(self.data)
^^^^^^^^^^^^^^^
File "C:\Users\lmy\anaconda3\Lib\site-packages\scrapy\utils\python.py", line 350, in __next__
return next(self.data)
^^^^^^^^^^^^^^^
File "C:\Users\lmy\anaconda3\Lib\site-packages\scrapy\core\spidermw.py", line 106, in process_sync
for r in iterable:
File "C:\Users\lmy\anaconda3\Lib\site-packages\scrapy\spidermiddlewares\referer.py", line 352, in <genexpr>
return (self._set_referer(r, response) for r in result or ())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\lmy\anaconda3\Lib\site-packages\scrapy\core\spidermw.py", line 106, in process_sync
for r in iterable:
File "C:\Users\lmy\anaconda3\Lib\site-packages\scrapy\spidermiddlewares\urllength.py", line 27, in <genexpr>
return (r for r in result or () if self._filter(r, spider))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\lmy\anaconda3\Lib\site-packages\scrapy\core\spidermw.py", line 106, in process_sync
for r in iterable:
File "C:\Users\lmy\anaconda3\Lib\site-packages\scrapy\spidermiddlewares\depth.py", line 31, in <genexpr>
return (r for r in result or () if self._filter(r, response, spider))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\lmy\anaconda3\Lib\site-packages\scrapy\core\spidermw.py", line 106, in process_sync
for r in iterable:
File "E:\flood\weibo-search-master-修改\weibo-search-master - 副本\weibo\spiders\search.py", line 197, in parse_by_hour
for weibo in self.parse_weibo(response):
File "E:\flood\weibo-search-master-修改\weibo-search-master - 副本\weibo\spiders\search.py", line 517, in parse_weibo
weibo["ip"] = self.get_ip(bid)
^^^^^^^^^^^^^^^^
File "E:\flood\weibo-search-master-修改\weibo-search-master - 副本\weibo\spiders\search.py", line 271, in get_ip
response = requests.get(url, headers=self.settings.get('DEFAULT_REQUEST_HEADERS'))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\lmy\anaconda3\Lib\site-packages\requests\api.py", line 73, in get
return request("get", url, params=params, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\lmy\anaconda3\Lib\site-packages\requests\api.py", line 59, in request
return session.request(method=method, url=url, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\lmy\anaconda3\Lib\site-packages\requests\sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\lmy\anaconda3\Lib\site-packages\requests\sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\lmy\anaconda3\Lib\site-packages\requests\adapters.py", line 517, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='weibo.com', port=443): Max retries exceeded with url: /ajax/statuses/show?id=Oo4KYzuaR&locale=zh-CN (Caused by SSLError(SSLEOFError(8, '[SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1006)')))
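The traceback ends in `get_ip` (search.py, line 271), where a bare `requests.get` to `weibo.com/ajax/statuses/show` raises `SSLError` and the exception propagates up through `parse_weibo`, so the whole page callback fails. An `UNEXPECTED_EOF_WHILE_READING` error usually means the remote side (or a proxy in between) closed the TLS connection early, which can happen intermittently during long runs. Below is a minimal sketch of one way to make that lookup tolerant of such failures. The helper names `build_session` and `safe_get_json`, the retry counts, and the timeout are all assumptions for illustration and not part of the project.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def build_session(total_retries=3, backoff=1.0):
    """Return a requests.Session that retries failed requests with backoff.

    Hypothetical helper: the retry count and backoff factor are guesses.
    """
    retry = Retry(
        total=total_retries,
        connect=total_retries,
        read=total_retries,
        backoff_factor=backoff,
        status_forcelist=(429, 500, 502, 503, 504),
    )
    session = requests.Session()
    session.mount("https://", HTTPAdapter(max_retries=retry))
    return session


def safe_get_json(session, url, headers=None, timeout=10):
    """Fetch url and return the decoded JSON body, or None on any failure."""
    try:
        response = session.get(url, headers=headers, timeout=timeout)
        response.raise_for_status()
        return response.json()
    except (requests.exceptions.RequestException, ValueError):
        # SSLError, ConnectionError, Timeout and HTTPError are all
        # RequestException subclasses; ValueError covers a non-JSON body.
        return None
```

With something like this, `get_ip` could return an empty string whenever `safe_get_json` returns `None`, so a single failed IP lookup would no longer abort the rest of the page. If the error shows up on every request rather than occasionally, checking any proxy or VPN on the machine is probably worth doing first.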
The Weibo API may simply return results that way.
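If posts from outside the requested days are unwanted, one workaround is to filter them out after scraping. This is only a sketch: the function name `within_date_range` is hypothetical, and it assumes each post carries a creation timestamp string in the form `YYYY-MM-DD HH:MM`, which should be checked against the actual output.

```python
from datetime import date, datetime


def within_date_range(created_at, start, end, fmt="%Y-%m-%d %H:%M"):
    """Return True if created_at falls on a day between start and end, inclusive.

    created_at is a timestamp string; fmt is an assumed format.
    """
    day = datetime.strptime(created_at, fmt).date()
    return start <= day <= end


# Keep only posts from 17 July when start and end are both 17 July.
start = end = date(2024, 7, 17)
posts = ["2024-07-17 23:59", "2024-07-18 00:05"]
kept = [p for p in posts if within_date_range(p, start, end)]
```

Here `kept` contains only the first timestamp. The same check could be applied in an item pipeline or to the exported CSV afterwards.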