
2025.6.19: stopped working? Request failed, error message: 432 Client Error:

Open Charrine opened this issue 6 months ago • 22 comments

Request failed, error message: 432 Client Error: for url: https://m.weibo.cn/api/container/getIndex?containerid=1005057340229276. Waiting 10 seconds before retrying...

Charrine avatar Jun 19 '25 09:06 Charrine

+1 Same problem here

ronin-storm avatar Jun 19 '25 10:06 ronin-storm

+1 Same problem here

yonglinBai avatar Jun 19 '25 11:06 yonglinBai

I can't use it either, but my error message is:

Traceback (most recent call last):
  File "D:\Programs\Python\Python311\Lib\site-packages\requests\models.py", line 974, in json
    return complexjson.loads(self.text, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Programs\Python\Python311\Lib\json\__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Programs\Python\Python311\Lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Programs\Python\Python311\Lib\json\decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last): File "K:\weibo-crawler\weibo.py", line 1965, in get_pages if self.get_user_info() != 0: ^^^^^^^^^^^^^^^^^^^^ File "K:\weibo-crawler\weibo.py", line 369, in get_user_info js, status_code = self.get_json(params) ^^^^^^^^^^^^^^^^^^^^^ File "K:\weibo-crawler\weibo.py", line 231, in get_json return r.json(), r.status_code ^^^^^^^^ File "D:\Programs\Python\Python311\Lib\site-packages\requests\models.py", line 978, in json raise RequestsJSONDecodeError(e.msg, e.doc, e.pos) requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0) 信息抓取完毕

CCMonkeyss avatar Jun 19 '25 12:06 CCMonkeyss

Update: after a fresh git clone I'm now getting the 432 error as well.

CCMonkeyss avatar Jun 19 '25 13:06 CCMonkeyss

+1, I just tried it and got 432 as well.

wwwhiskey avatar Jun 19 '25 17:06 wwwhiskey

I thought it was just me.

fenglinmeng avatar Jun 20 '25 01:06 fenglinmeng

Adding User-Agent to the headers fixed it for me.

wongz avatar Jun 20 '25 02:06 wongz

> Adding User-Agent to the headers fixed it for me.

I see that line 101 of weibo.py already adds a User-Agent parameter; swapping in the one from my own Edge browser didn't help either.

fenglinmeng avatar Jun 20 '25 03:06 fenglinmeng

Has anyone figured out how to fix this yet?

XMWell avatar Jun 20 '25 04:06 XMWell

I'm running into this too, with 432 Client Error. The full output is:

Request failed, error message: 432 Client Error: for url: https://m.weibo.cn/api/container/getIndex?containerid=1005053032210184. Waiting 10 seconds before retrying...
Request failed, error message: 432 Client Error: for url: https://m.weibo.cn/api/container/getIndex?containerid=1005053032210184. Waiting 20 seconds before retrying...

alloevil avatar Jun 20 '25 06:06 alloevil

Someone has opened a PR for this; if you can't wait, you can apply the changes from that PR yourself and it will run.

wyslmt avatar Jun 20 '25 06:06 wyslmt

Just update the user_agent on line 101 of weibo.py. The original code is:

user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.111 Safari/537.36"

Change it to the following and it works:

user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
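
For anyone who wants to verify the change before re-running the crawler, a minimal sanity check along these lines should do (a sketch assuming only the requests package; the containerid is the one from the log above, and the Cookie may or may not be required depending on Weibo's current checks):

import requests

headers = {
    # Assumed modern desktop UA; any reasonably current Chrome UA string should work.
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                  "AppleWebKit/537.36 (KHTML, like Gecko) "
                  "Chrome/120.0.0.0 Safari/537.36",
    "Referer": "https://m.weibo.cn/",
    # "Cookie": "SUB=...",  # add your own cookie here if requests are still rejected
}

url = "https://m.weibo.cn/api/container/getIndex"
r = requests.get(url, params={"containerid": "1005053032210184"}, headers=headers)
print(r.status_code)  # 432 means still blocked; 200 means the new User-Agent is accepted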

alloevil avatar Jun 20 '25 06:06 alloevil

> > Adding User-Agent to the headers fixed it for me.
>
> I see that line 101 of weibo.py already adds a User-Agent parameter; swapping in the one from my own Edge browser didn't help either.

In the code, the header key is written as User_Agent instead of User-Agent.
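
(HTTP servers only recognize the hyphenated header name, so a key spelled User_Agent is effectively ignored. A minimal sketch of what the headers dict needs to look like, with a placeholder cookie value:)

user_agent = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)
cookie = "your_cookie_here"  # placeholder

# Correct: the key must be "User-Agent" (hyphen); "User_Agent" is not treated
# as the User-Agent header by the server.
headers = {
    "User-Agent": user_agent,
    "Cookie": cookie,
}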

wongz avatar Jun 20 '25 06:06 wongz

> > > Adding User-Agent to the headers fixed it for me.
> >
> > I see that line 101 of weibo.py already adds a User-Agent parameter; swapping in the one from my own Edge browser didn't help either.
>
> In the code, the header key is written as User_Agent instead of User-Agent.

I changed it back to User-Agent, but downloading full-size images still returns 403.

CCMonkeyss avatar Jun 20 '25 08:06 CCMonkeyss

I made the change, but I still seem to get 432.

wwwhiskey avatar Jun 20 '25 08:06 wwwhiskey

Thanks for the feedback; this has been fixed.

dataabc avatar Jun 20 '25 09:06 dataabc

> Thanks for the feedback; this has been fixed.

Hi, I updated weibo.py (re-cloned the project) and updated the cookie. It seems able to fetch the weibo titles now, but downloading the full-size images fails with 403:

About to download images from original weibo posts
Download progress:   0%|          | 0/1 [00:00<?, ?it/s]
[ERROR] Request failed, error message: 403 Client Error: Forbidden for url: https://wx4.sinaimg.cn/large/005Aea0egy1i2kpyl1jc1j33b04eonpj.jpg. Attempt: 1/3
[ERROR] Request failed, error message: 403 Client Error: Forbidden for url: https://wx4.sinaimg.cn/large/005Aea0egy1i2kpyl1jc1j33b04eonpj.jpg. Attempt: 2/3
[ERROR] Request failed, error message: 403 Client Error: Forbidden for url: https://wx4.sinaimg.cn/large/005Aea0egy1i2kpyl1jc1j33b04eonpj.jpg. Attempt: 3/3

CCMonkeyss avatar Jun 20 '25 09:06 CCMonkeyss

> Thanks for the feedback; this has been fixed.

I updated weibo.py (re-cloned the project) and updated the cookie, but downloading the full-size images still fails with 403.

yonglinBai avatar Jun 20 '25 09:06 yonglinBai

For the failing image downloads, please refer to https://github.com/dataabc/weibo-search/issues/473 and adjust the code as described there.
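
(For reference, the core of that workaround, which the snippet posted further below also uses, is to fetch wx*.sinaimg.cn images through Baidu's image download endpoint instead of requesting Sina's CDN directly. A minimal sketch, with an illustrative helper name:)

def to_downloadable_url(img_url: str) -> str:
    """Illustrative helper: Sina's image CDN answers direct requests with 403,
    so route wx*.sinaimg.cn images through Baidu's download endpoint instead."""
    if img_url.startswith("https://wx"):
        return f"https://image.baidu.com/search/down?url={img_url}"
    return img_url

# Example: to_downloadable_url("https://wx4.sinaimg.cn/large/005Aea0egy1i2kpyl1jc1j33b04eonpj.jpg")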

dataabc avatar Jun 20 '25 11:06 dataabc

Thanks 🙏 I modified download_one_file following the approach in that link; images and live photos download now, but videos still don't. Anyone who needs it can simply replace their def download_one_file with the version below.

def download_one_file(self, url, file_path, type, weibo_id):
    """Download a single file (image or video)."""
    try:
        file_exist = os.path.isfile(file_path)
        need_download = (not file_exist)
        sqlite_exist = False
        if "sqlite" in self.write_mode:
            sqlite_exist = self.sqlite_exist_file(file_path)

        if not need_download:
            return

        s = requests.Session()
        s.mount('http://', HTTPAdapter(max_retries=5))
        s.mount('https://', HTTPAdapter(max_retries=5))
        try_count = 0
        success = False
        MAX_TRY_COUNT = 3
        detected_extension = None

        # For image links, go through Baidu's image download endpoint
        if type == "img" and url.startswith("https://wx"):
            url = f"https://image.baidu.com/search/down?url={url}"
            logger.debug(f"[DEBUG] using Baidu download link: {url}")

        while try_count < MAX_TRY_COUNT:
            try:
                response = s.get(
                    url, headers=self.headers, timeout=(5, 10), verify=False
                )
                response.raise_for_status()
                downloaded = response.content
                try_count += 1

                # Infer the file extension from the URL (query parameters stripped)
                url_path = url.split('?')[0]
                inferred_extension = os.path.splitext(url_path)[1].lower().strip('.')

                # Detect the actual file type via magic numbers
                if downloaded.startswith(b'\xFF\xD8\xFF'):
                    # JPEG file
                    if not downloaded.endswith(b'\xff\xd9'):
                        logger.debug(f"[DEBUG] incomplete JPEG file: {url} ({try_count}/{MAX_TRY_COUNT})")
                        continue  # file incomplete, retry
                    detected_extension = '.jpg'
                elif downloaded.startswith(b'\x89PNG\r\n\x1A\n'):
                    # PNG file
                    if not downloaded.endswith(b'IEND\xaeB`\x82'):
                        logger.debug(f"[DEBUG] incomplete PNG file: {url} ({try_count}/{MAX_TRY_COUNT})")
                        continue  # file incomplete, retry
                    detected_extension = '.png'
                else:
                    # Other types: keep the original handling
                    if inferred_extension in ['mp4', 'mov', 'webm', 'gif', 'bmp', 'tiff']:
                        detected_extension = '.' + inferred_extension
                    else:
                        # Try to derive the extension from Content-Type
                        content_type = response.headers.get('Content-Type', '').lower()
                        if 'image/jpeg' in content_type:
                            detected_extension = '.jpg'
                        elif 'image/png' in content_type:
                            detected_extension = '.png'
                        elif 'video/mp4' in content_type:
                            detected_extension = '.mp4'
                        elif 'video/quicktime' in content_type:
                            detected_extension = '.mov'
                        elif 'video/webm' in content_type:
                            detected_extension = '.webm'
                        elif 'image/gif' in content_type:
                            detected_extension = '.gif'
                        else:
                            # Fall back to the inferred extension if nothing matches
                            detected_extension = '.' + inferred_extension if inferred_extension else ''

                # Adjust the file path's extension to the detected one
                if detected_extension:
                    file_path = re.sub(r'\.\w+$', detected_extension, file_path)

                # Save the file
                if not os.path.isfile(file_path):
                    with open(file_path, "wb") as f:
                        f.write(downloaded)
                        logger.debug("[DEBUG] save " + file_path)

                success = True
                logger.debug("[DEBUG] success " + url + "  " + str(try_count))
                break  # download succeeded, leave the retry loop

            except RequestException as e:
                try_count += 1
                logger.error(f"[ERROR] Request failed, error message: {e}. Attempt: {try_count}/{MAX_TRY_COUNT}")
                sleep_time = 2 ** try_count  # exponential backoff
                sleep(sleep_time)
            except Exception as e:
                logger.exception(f"[ERROR] error during download: {e}")
                break  # any other exception: stop retrying

        if success:
            if "sqlite" in self.write_mode and not sqlite_exist:
                self.insert_file_sqlite(
                    file_path, weibo_id, url, downloaded
                )
        else:
            logger.debug("[DEBUG] failed " + url + " TOTALLY")
            error_file = self.get_filepath(type) + os.sep + "not_downloaded.txt"
            with open(error_file, "ab") as f:
                error_entry = f"{weibo_id}:{file_path}:{url}\n"
                f.write(error_entry.encode(sys.stdout.encoding))
    except Exception as e:
        # Build the original weibo URL for the error log
        original_url = f"https://m.weibo.cn/detail/{weibo_id}"  # added
        error_file = self.get_filepath(type) + os.sep + "not_downloaded.txt"
        with open(error_file, "ab") as f:
            # Error entry format extended with the original weibo URL
            error_entry = f"{weibo_id}:{file_path}:{url}:{original_url}\n"  # changed
            f.write(error_entry.encode(sys.stdout.encoding))
        logger.exception(e)
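
(The function above relies on names that weibo.py already imports at module level; if you drop it into a different file, you would need roughly the following, with logger standing in for whatever logger the script defines:)

import os
import re
import sys
import logging
from time import sleep

import requests
from requests.adapters import HTTPAdapter
from requests.exceptions import RequestException

logger = logging.getLogger(__name__)  # stand-in for weibo.py's own logger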

CCMonkeyss avatar Jun 20 '25 13:06 CCMonkeyss

> Someone has opened a PR for this; if you can't wait, you can apply the changes from that PR yourself and it will run.

What does this mean exactly? I changed the user_agent and it still doesn't work.

TY-teo avatar Jun 21 '25 17:06 TY-teo

Starting at line 101 of weibo.py:

user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36" self.headers = {"User-Agent": user_agent,"Referer": "https://m.weibo.cn/","Cookie": cookie} 我这样改一下,然后加了cookies就可以了

TY-teo avatar Jun 21 '25 18:06 TY-teo