weibo-crawler
Unable to install dependencies
I'm on macOS High Sierra 10.13.6, but I can't install the dependencies. It shows:
chbdeMBP:weibo-crawler-master Mark$ pip install -r requirements.txt
Traceback (most recent call last):
File "/usr/local/bin/pip", line 9, in
I tried running weibo.py directly, but then it shows:
chbdeMBP:weibo-crawler-master Mark$ python weibo.py
Traceback (most recent call last):
File "weibo.py", line 19, in
How can I fix this?
Use this instead:
$ pip3 install -r requirements.txt
Thanks a lot! That command did install everything successfully:
Successfully installed chardet-3.0.4 idna-2.8 lxml-4.6.3 requests-2.22.0 tqdm-4.32.2 urllib3-1.25.11
But when I run weibo.py, it still shows:
chbdeMBP:weibo-crawler-master Mark$ python weibo.py
Traceback (most recent call last):
File "weibo.py", line 19, in
That's because you are running Python 2 while the dependencies were installed for Python 3. Use this instead:
$ python3 weibo.py
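Thanks! But I ran into another problem when running it and need to ask for help again. I changed config.json as follows (the cookie is hidden here, but it was filled in correctly when I ran it):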
{
    "user_id_list": ["1663072851"],
    "filter": 1,
    "since_date": "2020-03-11",
    "query_list": ["新冠"],
    "start_page": 1,
    "write_mode": ["csv"],
    "original_pic_download": 0,
    "retweet_pic_download": 0,
    "original_video_download": 0,
    "retweet_video_download": 0,
    "result_dir_name": 0,
    "cookie": "",
    "mysql_config": {
        "host": "localhost",
        "port": 3306,
        "user": "root",
        "password": "123456",
        "charset": "utf8mb4"
    }
}
With this it did crawl the posts I wanted, but although I set it to crawl back to 2020-3-11, it stopped at 2021-3-12. When I started it again, it showed:
chbdeMBP:weibo-crawler-master Mark$ python3 weibo.py
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
用户信息
'id'
Traceback (most recent call last):
  File "/Users/Mark/Desktop/weibo-crawler-master/weibo.py", line 1079, in get_pages
    self.print_user_info()
  File "/Users/Mark/Desktop/weibo-crawler-master/weibo.py", line 590, in print_user_info
    logger.info(u'用户id:%s', self.user['id'])
KeyError: 'id'
信息抓取完毕
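What is going on here?
Most likely you were crawling too fast and got temporarily rate limited. The limit lifts automatically after a while. Try to slow down by increasing the sleep values so you don't get limited again.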
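The KeyError in the traceback suggests the user-info dict came back without an 'id' key, which would fit a rate-limited response. A minimal sketch of a guard for that case follows. The helpers `has_user_info` and `describe_user` are hypothetical and not part of weibo.py; they only illustrate checking for the key before using it.

```python
def has_user_info(user):
    """Return True if the user dict contains the 'id' key.

    Hypothetical helper, not part of weibo.py. It only checks for the
    key that the traceback shows is missing.
    """
    return isinstance(user, dict) and "id" in user


def describe_user(user):
    """Return a log line for the user, or a hint when the data is missing."""
    if not has_user_info(user):
        return "user info missing; possibly rate limited, wait and retry later"
    return "user id: %s" % user["id"]
```

With a check like this, an empty response produces a readable hint instead of a KeyError.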
How do I increase the sleep value?
Modify the get_pages method:
def get_pages(self):
    """Fetch all weibo posts"""
    ...
    if (page - page1) % random_pages == 0 and page < page_count:
        sleep(random.randint(6, 10))
        page1 = page
        random_pages = random.randint(1, 5)
    ...
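Increase the numbers passed to sleep, which set the pause duration in seconds. Alternatively, change the 5 in random.randint(1, 5) to a smaller positive integer, which sets how many pages are crawled between pauses.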
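To make the two knobs easier to see, here is a standalone sketch of the same pause logic with the numbers pulled out as parameters. The function `crawl_pages` and its parameter names are assumptions for illustration and do not exist in weibo.py. The starting value `page1 = 0` is also an assumption.

```python
import random
from time import sleep


def crawl_pages(page_count, fetch_page, min_pause=6, max_pause=10,
                max_pages_per_batch=5, sleep_fn=sleep):
    """Call fetch_page(page) for pages 1..page_count, pausing between batches.

    All names here are illustrative assumptions, not weibo.py's own.
    Returns the list of pause durations (seconds) that were used.
    """
    page1 = 0  # last page after which we paused (assumed starting value)
    random_pages = random.randint(1, max_pages_per_batch)
    pauses = []
    for page in range(1, page_count + 1):
        fetch_page(page)
        # Pause after every random_pages pages, except after the last page.
        if (page - page1) % random_pages == 0 and page < page_count:
            pause = random.randint(min_pause, max_pause)
            sleep_fn(pause)
            pauses.append(pause)
            page1 = page
            random_pages = random.randint(1, max_pages_per_batch)
    return pauses
```

For example, `crawl_pages(50, fetch, min_pause=20, max_pause=30, max_pages_per_batch=2)` would pause 20 to 30 seconds after every one or two pages, which is slower than the values shown above.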
yangfan@FanFanMacBookPro weibo-crawler % python weibo.py
Traceback (most recent call last):
File "weibo.py", line 22, in
Traceback (most recent call last):
File "weibo.py", line 22, in
On macOS, launching weibo.py with either python or python3 fails with: No module named 'requests'
@akafanfan You need to install the requests package.
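Since both problems in this thread come from packages being installed for a different interpreter than the one running weibo.py, a quick check like the following may help. It is only a sketch that needs Python 3. The helper names `missing_modules` and `install_hint` are made up for illustration, and the module list is taken from the pip output earlier in the thread.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the names that the current interpreter cannot import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def install_hint(names):
    """Build a pip command bound to the interpreter running this script."""
    return "%s -m pip install %s" % (sys.executable, " ".join(names))


if __name__ == "__main__":
    print("Python", sys.version.split()[0], "at", sys.executable)
    missing = missing_modules(["requests", "lxml", "tqdm"])
    if missing:
        print("Missing:", ", ".join(missing))
        print("Try:", install_hint(missing))
    else:
        print("All listed modules are importable.")
```

Run it with the same command you use for weibo.py (for example python3), so it reports on that interpreter rather than another one on the system.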