
Error when pulling images from the 5090.yml file

Open xiaotang-12-ops opened this issue 7 months ago • 8 comments

The error is as follows: Error response from daemon: Get "https://registry-1.docker.io/v2/guiji2025/fish-speech-5090/manifests/sha256:ec9aabf14419d10f3823f8a73bc6ded71cb6d112833018965dd618a88a3c9f85": EOF

I pulled this image once before without any problem. Later I noticed the voice was very hoarse, so I deleted all of the HeyGem images, uninstalled the client, and re-cloned the code with git clone. When pulling the images from the 5090.yml file again, the error above appeared. Does anyone know what is going on?

[screenshot of the error]

xiaotang-12-ops avatar May 15 '25 01:05 xiaotang-12-ops

The last error is EOF, right? An EOF when Docker pulls an image is usually a network issue. Try again; that should resolve it.
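Since a transient network EOF usually clears on retry, re-running the pull in a loop can automate this. A minimal sketch; the `retry` helper and its attempt/delay defaults are illustrative, not part of the project:

```shell
# Retry a flaky command a few times before giving up.
# Hypothetical helper; attempt count and delay are arbitrary defaults.
retry() {
  n=0; max=3; delay=2
  until "$@"; do
    n=$((n + 1))
    if [ "$n" -ge "$max" ]; then
      echo "failed after $max attempts: $*" >&2
      return 1
    fi
    echo "attempt $n failed, retrying in ${delay}s..." >&2
    sleep "$delay"
  done
}

# Example (the exact compose invocation depends on your setup):
# retry docker compose -f 5090.yml pull
```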

LegendaryM avatar May 15 '25 06:05 LegendaryM

Thank you, that was indeed the problem. However, the digital human videos trained on my 5090 machine still have a very hoarse voice... I noticed the 5090 config file only defines two services; could that be related?

xiaotang-12-ops avatar May 15 '25 07:05 xiaotang-12-ops


Could you please provide the problematic audio and video so that our developers can investigate in detail?

LegendaryM avatar May 15 '25 08:05 LegendaryM


It seems there is no way to upload long videos on GitHub...

xiaotang-12-ops avatar May 15 '25 08:05 xiaotang-12-ops


Can you download the video by visiting this link? https://raw.githubusercontent.com/xiaotang-12-ops/my-videos/main/cbdd8806433f9f3bc87e25cb3524bb6c.mp4

xiaotang-12-ops avatar May 15 '25 09:05 xiaotang-12-ops


OK, I can download it normally here. Thank you for providing the materials. The developers have been informed, and I believe it will be fixed soon.

LegendaryM avatar May 16 '25 02:05 LegendaryM

After switching today from a 2080 GPU to a 5090D, the digital human no longer works; it reports a CUDA kernel error. How do I fix this? How did you get it running?

```
==========
== CUDA ==
==========

CUDA Version 12.1.1

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

Matplotlib is building the font cache; this may take a moment.
taskset: bad usage
Try 'taskset --help' for more information.
INFO:gjtts_server:加载自定义 姓名多音字 [tools/text_norm/front_end/utils/name_polyphone.json]
INFO:     Started server process [1]
INFO:     Waiting for application startup.
DEBUG:gjtts_server:语言类型 CN_EN
DEBUG:gjtts_server:加载自定义 单位 [/code/tools/text_norm/front_end/normalize/config/units.json]
DEBUG:gjtts_server:加载自定义 单位 [/code/tools/text_norm/front_end/normalize/config/units.json]
DEBUG:gjtts_server:加载自定义 单位 [/code/tools/text_norm/front_end/normalize/config/units.json]
2025-05-21 10:20:12.909 | INFO | tools.llama.generate:load_model:682 - Restored model from checkpoint
2025-05-21 10:20:12.910 | INFO | tools.llama.generate:load_model:688 - Using DualARTransformer
Exception in thread Thread-2 (worker):
Traceback (most recent call last):
  File "/opt/conda/envs/python310/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/opt/conda/envs/python310/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/code/tools/llama/generate.py", line 916, in worker
    model.setup_caches(
  File "/code/fish_speech/models/text2semantic/llama.py", line 575, in setup_caches
    super().setup_caches(max_batch_size, max_seq_len, dtype)
  File "/code/fish_speech/models/text2semantic/llama.py", line 241, in setup_caches
    b.attention.kv_cache = KVCache(
  File "/code/fish_speech/models/text2semantic/llama.py", line 139, in __init__
    self.register_buffer("k_cache", torch.zeros(cache_shape, dtype=dtype))
  File "/opt/conda/envs/python310/lib/python3.10/site-packages/torch/utils/_device.py", line 78, in __torch_function__
    return func(*args, **kwargs)
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
```
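For context (a hedged note, not stated in this thread): the RTX 5090 is a Blackwell-generation GPU (compute capability 12.0, i.e. `sm_120`), while PyTorch wheels built against CUDA 12.1 ship kernels only up to `sm_90`, which is exactly what "no kernel image is available for execution on the device" means. Inside the container you could compare the device capability against the compiled architectures with `python -c "import torch; print(torch.cuda.get_device_capability(0), torch.cuda.get_arch_list())"`. The helper below only sketches that comparison in plain shell; the arch list shown is illustrative:

```shell
# Sketch: check whether a GPU architecture appears in the list of
# architectures a PyTorch wheel was compiled for. Illustrative helper only.
arch_supported() {
  want="$1"; shift
  for a in "$@"; do
    [ "$a" = "$want" ] && return 0
  done
  return 1
}

# An RTX 5090 (sm_120) vs. a typical CUDA 12.1 wheel's arch list:
if arch_supported sm_120 sm_50 sm_60 sm_70 sm_75 sm_80 sm_86 sm_90; then
  echo "kernel image available"
else
  echo "no kernel image for this GPU"
fi
```

If the device architecture really is missing from the compiled list, the usual remedy is a PyTorch build that includes `sm_120` kernels (for example the cu128 wheels) or an image rebuilt against a newer CUDA toolkit; which option applies here depends on how the project's images are built.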

simplify123 avatar May 21 '25 10:05 simplify123


That's strange. I remember replying to you, but I can't see the record here, even though I did see your reply earlier.

xiaotang-12-ops avatar May 22 '25 06:05 xiaotang-12-ops