weibo-crawler issues

How do I customize the download path when running on a VPS?

2

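I have a machine with a small disk and have mounted Aliyun Drive on it. I want to change the download path to the mounted drive. If the path cannot be changed, is there any workaround?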
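One possible workaround, if the output location cannot be configured, is to turn the crawler's output directory into a symbolic link that points at the mounted drive. The sketch below is not taken from the project: the function name redirect_output_dir is hypothetical, and the directory names in the usage note are assumptions.

```python
import os
import shutil
from pathlib import Path


def redirect_output_dir(output_dir, target_dir):
    """Make output_dir a symlink to target_dir (hypothetical helper).

    Files already present in output_dir are moved to target_dir first.
    """
    output_dir, target_dir = Path(output_dir), Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)

    if output_dir.is_symlink():
        # Already redirected; nothing to do.
        return output_dir
    if output_dir.exists():
        # Move anything already downloaded onto the mounted drive.
        for item in output_dir.iterdir():
            shutil.move(str(item), str(target_dir / item.name))
        output_dir.rmdir()

    output_dir.parent.mkdir(parents=True, exist_ok=True)
    os.symlink(target_dir, output_dir, target_is_directory=True)
    return output_dir
```

For example, redirect_output_dir("weibo", "/mnt/aliyun/weibo") would apply if the crawler writes into a directory named weibo and the drive is mounted under /mnt/aliyun; both paths are placeholders to adjust.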

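Problem while installing dependencies

The dependency installation did not finish and I can't tell what went wrong. I have spent a long time on it without figuring it out. What should I do? This is the console output: C:\Users\delic\Documents\GitHub\weibo-crawler>pip install -r requirements.txt Collecting lxml==4.9.1 (from -r requirements.txt (line 1)) Using cached lxml-4.9.1.tar.gz (3.4 MB) Preparing metadata (setup.py) ... done Requirement already satisfied: pymongo==3.7.2 in c:\python311\lib\site-packages (from...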

oliverance

The random.shuffle(user_id_list) change raises an error when user_id_list is a file path

4

As the title says.

chjuheng
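random.shuffle reorders a mutable sequence in place, so passing it a string (such as a file path) fails with a TypeError. A minimal sketch of a guard follows; the function name is hypothetical and the file format (one ID per line, optionally followed by other fields) is an assumption.

```python
import random


def load_and_shuffle_user_ids(user_id_list):
    """Return a shuffled list of user IDs (hypothetical helper).

    user_id_list may be a list of IDs or a path to a text file.
    Assumed file format: one ID per line, optionally followed by other fields.
    """
    if isinstance(user_id_list, str):
        # A path was given: read the IDs before shuffling.
        with open(user_id_list, encoding="utf-8") as f:
            ids = [line.split()[0] for line in f if line.strip()]
    else:
        ids = list(user_id_list)  # copy so the caller's list is untouched
    random.shuffle(ids)
    return ids
```

The shuffle is applied to the loaded IDs rather than to the path itself, so both configuration styles go through the same code path.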

Dependency installation fails: [Errno 2] No such file or directory: 'requirements.txt'

3

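The weibo-crawler folder has been created and it contains requirements.txt, but running the pip install command gives ERROR: Could not open requirements file: [Errno 2] No such file or directory: 'requirements.txt'. I understand the problem is probably that the command is not being run from that directory, but I don't know how to fix it. Could someone point me in the right direction? I am using Jupyter Notebook.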

kadima221
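pip resolves requirements.txt relative to the current working directory, which in a notebook is usually the notebook's own folder rather than the cloned repository. One way around this, sketched below, is to pass pip an absolute path; the helper name build_pip_command is hypothetical.

```python
import sys
from pathlib import Path


def build_pip_command(repo_dir):
    """Build a pip command that points at repo_dir/requirements.txt.

    Hypothetical helper; repo_dir is wherever the repository was cloned.
    """
    req = Path(repo_dir).resolve() / "requirements.txt"
    if not req.is_file():
        raise FileNotFoundError(f"requirements.txt not found in {req.parent}")
    # Using sys.executable installs into the interpreter running the notebook.
    return [sys.executable, "-m", "pip", "install", "-r", str(req)]
```

In a notebook cell the command could then be run with subprocess.run(build_pip_command("weibo-crawler"), check=True). Printing Path.cwd() first shows which directory the notebook is actually running in.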

ERROR: Could not build wheels for lxml, which is required to install pyproject.toml-based projects

1

note: This error originates from a subprocess, and is likely not a problem with pip. ERROR: Failed building wheel for lxml Failed to build lxml ERROR: Could not build wheels...

Lixiaoxing666

Timing question for scheduled crawling

1

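Some posts are deleted within a few hours of being published. If I set the crawler to run every ten minutes, will the program detect new posts by their time automatically, or does it still go by since_date? For example: the user posts at 8:00 a.m. on June 20, I start crawling at 8:30, and then run the program every ten minutes. If the user posts again at 9:00, will that post be picked up, or will it only be recognized and crawled after midnight (0:00) on June 21?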

Arukassss
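How since_date is evaluated is not stated in this listing. Independently of that, a frequent-polling setup can avoid depending on date granularity by remembering which post IDs it has already seen. The sketch below assumes each post is a dict with an "id" key; the function name and the state file are hypothetical.

```python
import json
from pathlib import Path


def filter_new_posts(posts, state_file):
    """Return only posts whose "id" was not seen in earlier runs.

    Seen IDs are persisted to state_file as a JSON list (hypothetical format).
    """
    state_file = Path(state_file)
    seen = set(json.loads(state_file.read_text())) if state_file.exists() else set()

    new_posts = []
    for post in posts:
        if post["id"] not in seen:
            seen.add(post["id"])
            new_posts.append(post)

    # Save the updated set so the next run can skip these posts.
    state_file.write_text(json.dumps(sorted(seen)))
    return new_posts
```

With this in place, a run every ten minutes would keep the 9:00 post the first time it appears, whatever date window was used to fetch the pages.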

Crawling users whose posts are set to be visible for only the last six months

1

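The program computes a user's page count as the number of posts divided by 10. Suppose a user has posted 2,000 posts a year for 5 years and has set posts to be visible for only the last six months. The algorithm gives page number = 10000/10 = 1000 pages, but because only half a year is visible, there are actually only 100 pages. After reaching page 100 the program still tries the remaining 900 pages, which wastes time. Suggestion: check the content of each crawled page and, if several consecutive pages are empty, move on to the next user. A smaller issue: sometimes the crawl simply stalls, with no error and no progress. Is that normal behavior caused by Weibo's rate limiting? Thanks.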

HongzhangXie
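The suggestion above can be sketched as a loop that counts consecutive empty pages and gives up once a threshold is reached. This is an illustration, not the project's code: fetch_page is a hypothetical callable returning the list of posts on a page, and the threshold of 3 is an arbitrary choice.

```python
import math


def crawl_until_empty(fetch_page, weibo_count, page_size=10, max_empty_pages=3):
    """Crawl pages 1..N, stopping early after consecutive empty pages.

    fetch_page(page) is a hypothetical callable returning a list of posts.
    Returns (posts, last_page_requested).
    """
    page_count = math.ceil(weibo_count / page_size)
    posts, empty_streak, last_page = [], 0, 0
    for page in range(1, page_count + 1):
        last_page = page
        items = fetch_page(page)
        if items:
            empty_streak = 0
            posts.extend(items)
        else:
            empty_streak += 1
            if empty_streak >= max_empty_pages:
                break  # probably past the visible range; go to the next user
    return posts, last_page
```

For the example in the issue (10000 posts, 100 visible pages), this requests 103 pages instead of 1000. Requiring several empty pages in a row, rather than one, leaves some tolerance for an occasional page that comes back empty.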

Error: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd3 in position 0: invalid continuation byte

5

twilight1024
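The issue does not say which file triggered the error. This message typically means a file read as UTF-8 was saved in a different encoding, and GBK is a common candidate on Chinese-language Windows systems. Under that assumption, a fallback reader could look like the sketch below; the function name and the encoding list are hypothetical.

```python
def read_text_with_fallback(path, encodings=("utf-8", "gbk")):
    """Read a text file, trying each encoding in order (hypothetical helper)."""
    last_error = None
    for encoding in encodings:
        try:
            with open(path, encoding=encoding) as f:
                return f.read()
        except UnicodeDecodeError as exc:
            # Remember the failure and try the next candidate encoding.
            last_error = exc
    raise last_error
```

Re-saving the file as UTF-8 in an editor would address the same cause without any code change.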

string indices must be integers, not 'str'

5

The following error occurs when downloading images from the user 蛇界猛女: string indices must be integers, not 'str' Traceback (most recent call last): File "E:\weibo-crawler-master\weibo.py", line 854, in get_one_weibo weibo = self.parse_weibo(weibo_info) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "E:\weibo-crawler-master\weibo.py", line 753, in parse_weibo weibo["pics"]...

JiangYue2003
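The traceback is cut off, so the exact cause cannot be confirmed from it. This TypeError generally appears when code indexes a string with a string key, for example when a field expected to be a list of dicts turns out to be a dict and the loop iterates over its string keys. A defensive sketch under that assumption follows; the function name and the "large" and "url" keys are assumptions made for illustration.

```python
def extract_pic_urls(pics):
    """Collect image URLs from a pics field that may be a list or a dict.

    Hypothetical helper; the "large"/"url" keys are assumed, not confirmed.
    """
    if isinstance(pics, dict):
        # Iterating a dict yields its string keys; indexing such a key with
        # another string raises "string indices must be integers".
        pics = list(pics.values())
    urls = []
    for pic in pics or []:
        if isinstance(pic, dict) and isinstance(pic.get("large"), dict):
            url = pic["large"].get("url")
            if url:
                urls.append(url)
    return urls
```

Entries that do not match the expected shape are skipped rather than raising, so one malformed post does not stop the crawl.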

The Python 3 version throws an error after crawling to page 20

1

TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType

Lirsakura
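This message is what os.stat produces when it is given None instead of a path, which also happens through os.path.exists(None). The issue does not show which path was None. A small guard such as the hypothetical helper below would turn the crash into a plain "does not exist" result.

```python
import os


def path_exists(path):
    """Like os.path.exists, but returns False when path is None."""
    return path is not None and os.path.exists(path)
```

The guard only hides the symptom; finding out why the path was never set would still be needed to fix the underlying problem.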
