
CUDA 12.3: project won't run after installation

Open chaorenai opened this issue 1 year ago • 5 comments

I have CUDA 12.3, and the project won't run after installation. I installed the matching PyTorch build from https://pytorch.org/ with

    conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia

but the project still won't run. Do I need to upgrade Python? I installed everything by following the project's introduction:

    conda create -n dreamtalk python=3.7.0
    conda activate dreamtalk
    pip install -r requirements.txt
    conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=11.1 -c pytorch -c conda-forge
    conda update ffmpeg

    pip install urllib3==1.26.6
    pip install transformers==4.28.1
    pip install dlib

But it still won't run.

(dreamtalk) C:\Users\sunny\Documents\dreamtalk>python inference_for_demo_video.py ^
More? --wav_path data/audio/acknowledgement_english.m4a ^
More? --style_clip_path data/style_clip/3DMM/M030_front_neutral_level1_001.mat ^
More? --pose_path data/pose/RichardShelby_front_neutral_level1_001.mat ^
More? --image_path data/src_img/uncropped/male_face.png ^
More? --cfg_scale 1.0 ^
More? --max_gen_len 30 ^
More? --output_name acknowledgement_english@M030_front_neutral_level1_001@male_face
Traceback (most recent call last):
  File "inference_for_demo_video.py", line 20, in <module>
    from generators.utils import get_netG, render_video
  File "C:\Users\sunny\Documents\dreamtalk\generators\utils.py", line 8, in <module>
    import torchvision
  File "C:\Users\sunny\.conda\envs\dreamtalk\lib\site-packages\torchvision\__init__.py", line 5, in <module>
    from torchvision import datasets, io, models, ops, transforms, utils
  File "C:\Users\sunny\.conda\envs\dreamtalk\lib\site-packages\torchvision\models\__init__.py", line 16, in <module>
    from .maxvit import *
  File "C:\Users\sunny\.conda\envs\dreamtalk\lib\site-packages\torchvision\models\maxvit.py", line 3, in <module>
    from typing import Any, Callable, List, Optional, OrderedDict, Sequence, Tuple
ImportError: cannot import name 'OrderedDict' from 'typing' (C:\Users\sunny\.conda\envs\dreamtalk\lib\typing.py)

(dreamtalk) C:\Users\sunny\Documents\dreamtalk>

chaorenai, Jan 04 '24 06:01

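Hello. Your error is most likely caused by the Python version. Try a newer version such as 3.8 or 3.9 (according to https://stackoverflow.com/questions/75529492/importerror-cannot-import-name-ordereddict-from-typing, anything from 3.7.2 onward should work). The error comes from Python type hints: it has nothing to do with the CUDA version and depends only on the Python version.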

YifengMa9, Jan 04 '24 06:01
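A quick way to confirm the diagnosis above is to check the interpreter inside the activated conda environment. The following is only a minimal sketch: the helper name `supports_typing_ordereddict` is hypothetical and not part of dreamtalk, and the 3.7.2 threshold is the one cited in the reply above.

```python
import sys


def supports_typing_ordereddict(version_info=sys.version_info):
    """Return True if this Python version provides typing.OrderedDict.

    typing.OrderedDict was added in Python 3.7.2, so an environment
    created with python=3.7.0 does not have it.
    """
    return tuple(version_info[:3]) >= (3, 7, 2)


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    if supports_typing_ordereddict():
        # This is the import that failed in the traceback above.
        from typing import OrderedDict  # noqa: F401
        print("typing.OrderedDict is importable")
    else:
        print("typing.OrderedDict is missing; recreate the env with a newer Python")
```

Run it with the `dreamtalk` environment active. If it reports a version below 3.7.2, recreating the environment with a newer Python is the likely fix, independent of the CUDA version.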

After upgrading Python to 3.10.6, I hit the following error during inference:

(dreamtalk) C:\Users\sunny\Documents\dreamtalk>python inference_for_demo_video.py ^
More? --wav_path data/audio/acknowledgement_english.m4a ^
More? --style_clip_path data/style_clip/3DMM/M030_front_neutral_level1_001.mat ^
More? --pose_path data/pose/RichardShelby_front_neutral_level1_001.mat ^
More? --image_path data/src_img/uncropped/male_face.png ^
More? --cfg_scale 1.0 ^
More? --max_gen_len 30 ^
More? --output_name acknowledgement_english@M030_front_neutral_level1_001@male_face
ffmpeg version N-112686-g3f890fbfd9-20231104 Copyright (c) 2000-2023 the FFmpeg developers
  built with gcc 13.2.0 (crosstool-NG 1.25.0.232_c175b21)
  configuration: --prefix=/ffbuild/prefix --pkg-config-flags=--static --pkg-config=pkg-config --cross-prefix=x86_64-w64-mingw32- --arch=x86_64 --target-os=mingw32 --enable-gpl --enable-version3 --disable-debug --enable-shared --disable-static --disable-w32threads --enable-pthreads --enable-iconv --enable-libxml2 --enable-zlib --enable-libfreetype --enable-libfribidi --enable-gmp --enable-lzma --enable-fontconfig --enable-libharfbuzz --enable-libvorbis --enable-opencl --disable-libpulse --enable-libvmaf --disable-libxcb --disable-xlib --enable-amf --enable-libaom --enable-libaribb24 --enable-avisynth --enable-chromaprint --enable-libdav1d --enable-libdavs2 --disable-libfdk-aac --enable-ffnvcodec --enable-cuda-llvm --enable-frei0r --enable-libgme --enable-libkvazaar --enable-libass --enable-libbluray --enable-libjxl --enable-libmp3lame --enable-libopus --enable-librist --enable-libssh --enable-libtheora --enable-libvpx --enable-libwebp --enable-lv2 --enable-libvpl --enable-openal --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenh264 --enable-libopenjpeg --enable-libopenmpt --enable-librav1e --enable-librubberband --enable-schannel --enable-sdl2 --enable-libsoxr --enable-libsrt --enable-libsvtav1 --enable-libtwolame --enable-libuavs3d --disable-libdrm --enable-vaapi --enable-libvidstab --enable-vulkan --enable-libshaderc --enable-libplacebo --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxvid --enable-libzimg --enable-libzvbi --extra-cflags=-DLIBTWOLAME_STATIC --extra-cxxflags= --extra-ldflags=-pthread --extra-ldexeflags= --extra-libs=-lgomp --extra-version=20231104
  libavutil      58. 31.100 / 58. 31.100
  libavcodec     60. 32.102 / 60. 32.102
  libavformat    60. 17.100 / 60. 17.100
  libavdevice    60.  4.100 / 60.  4.100
  libavfilter     9. 13.100 /  9. 13.100
  libswscale      7.  6.100 /  7.  6.100
  libswresample   4. 13.100 /  4. 13.100
  libpostproc    57.  4.100 / 57.  4.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'data/audio/acknowledgement_english.m4a':
  Metadata:
    major_brand     : M4A
    minor_version   : 0
    compatible_brands: M4A isommp42
    creation_time   : 2023-12-20T14:25:20.000000Z
    iTunSMPB        : 00000000 00000840 00000000 00000000000C23C0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  Duration: 00:00:16.62, start: 0.044000, bitrate: 246 kb/s
  Stream #0:0[0x1](und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, mono, fltp, 244 kb/s (default)
    Metadata:
      creation_time   : 2023-12-20T14:25:20.000000Z
      handler_name    : Core Media Audio
      vendor_id       : [0][0][0][0]
Stream mapping:
  Stream #0:0 -> #0:0 (aac (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to 'tmp/acknowledgement_english@M030_front_neutral_level1_001@male_face\acknowledgement_english@M030_front_neutral_level1_001@male_face_16K.wav':
  Metadata:
    major_brand     : M4A
    minor_version   : 0
    compatible_brands: M4A isommp42
    iTunSMPB        : 00000000 00000840 00000000 00000000000C23C0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    ISFT            : Lavf60.17.100
  Stream #0:0(und): Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, mono, s16, 256 kb/s (default)
    Metadata:
      creation_time   : 2023-12-20T14:25:20.000000Z
      handler_name    : Core Media Audio
      vendor_id       : [0][0][0][0]
      encoder         : Lavc60.32.102 pcm_s16le
[out#0/wav @ 0000021b689badc0] video:0kB audio:518kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.014706%
size=     518kB time=00:00:16.57 bitrate= 256.1kbits/s speed= 685x
C:\Users\sunny\.conda\envs\dreamtalk\lib\site-packages\torch\nn\utils\weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
  warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
Some weights of the model checkpoint at jonatasgrosman/wav2vec2-large-xlsr-53-english were not used when initializing Wav2Vec2Model: ['lm_head.bias', 'lm_head.weight']
- This IS expected if you are initializing Wav2Vec2Model from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing Wav2Vec2Model from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Traceback (most recent call last):
  File "C:\Users\sunny\Documents\dreamtalk\inference_for_demo_video.py", line 178, in <module>
    speech_array, sampling_rate = torchaudio.load(wav_16k_path)
  File "C:\Users\sunny\.conda\envs\dreamtalk\lib\site-packages\torchaudio\_backend\utils.py", line 203, in load
    backend = dispatcher(uri, format, backend)
  File "C:\Users\sunny\.conda\envs\dreamtalk\lib\site-packages\torchaudio\_backend\utils.py", line 115, in dispatcher
    raise RuntimeError(f"Couldn't find appropriate backend to handle uri {uri} and format {format}.")
RuntimeError: Couldn't find appropriate backend to handle uri tmp/acknowledgement_english@M030_front_neutral_level1_001@male_face\acknowledgement_english@M030_front_neutral_level1_001@male_face_16K.wav and format None.

(dreamtalk) C:\Users\sunny\Documents\dreamtalk>

chaorenai, Jan 04 '24 08:01

You can try the following:

    Windows: pip install soundfile
    Linux: pip install sox

See #2

YifengMa9, Jan 04 '24 08:01
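Before rerunning inference, it may help to confirm that the package suggested above is actually visible to the environment's interpreter. This is a rough sketch using only the standard library; the helper name `find_modules` is hypothetical, and it only checks that a module can be located, not that torchaudio will accept it as a backend.

```python
import importlib.util


def find_modules(names):
    """Map each top-level module name to True if it can be located.

    Uses importlib.util.find_spec, so nothing is actually imported.
    """
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # "soundfile" is the package suggested for Windows in the reply above.
    for name, found in find_modules(["torchaudio", "soundfile"]).items():
        print(f"{name}: {'found' if found else 'NOT found'}")
```

If `soundfile` is reported as not found, installing it with `pip install soundfile` inside the activated environment is the step the reply recommends.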

Thanks, it runs now. It's not yet on the same level as SadTalker, though. Keep it up!

chaorenai, Jan 04 '24 09:01

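> Thanks, it runs now. It's not yet on the same level as SadTalker, though. Keep it up!

Hello, I'm a student and would like to learn about this model and how to set up its environment. Could you give me some guidance? Thank you.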

weiran0129, Jan 21 '24 17:01