Generated audio is gibberish
I pulled the latest code, downloaded the latest model from https://www.modelscope.cn/models/iic/CosyVoice2-0.5B/files, and ran the example synthesis code. The synthesized output is gibberish.
Downgrade transformers to the version the project requires.
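The comment above is right: the originally specified transformers version works somewhat better, but it does not eliminate the problem entirely. After repeated testing, I found that the result correlates strongly with the random seed.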
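To check whether a bad output really follows the seed, it helps to fix every random number generator before each synthesis call and compare runs. The following is a minimal sketch, not code from the CosyVoice repository: `seed_everything` and `sample_after_seed` are hypothetical helper names, and the NumPy and PyTorch calls are only made if those packages are installed.

```python
import random


def seed_everything(seed):
    """Seed Python's RNG, plus NumPy and PyTorch when they are installed."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass


def sample_after_seed(seed, n=3):
    """Seed everything, then draw n numbers to show the run is repeatable."""
    seed_everything(seed)
    return [random.random() for _ in range(n)]


if __name__ == "__main__":
    # Same seed, same draws; a different seed gives different draws.
    print(sample_after_seed(0) == sample_after_seed(0))
    print(sample_after_seed(0) == sample_after_seed(1))
```

In practice, the call to `seed_everything(seed)` would go immediately before the synthesis call, repeated over several seeds, so that a garbled result can be reproduced and reported together with its seed.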
Downgrading the version solved it for me.
If I use vllm, which version of transformers should I use?
Did you ever get this solved? Please let me know.
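The project has fairly strict transformers version requirements. A version that is too high or too low can produce nonsensical audio, or long stretches of silence. I recommend matching the official versions exactly.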
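I ran vllm_example.py and have no idea what it is doing. My package versions are listed below. In my experiments, right after the model is first loaded, whether the generated speech is normal depends on the random seed. Once a seed has been set, later generations are mostly normal, even when the seed is changed. However, the built-in speaker “中文女” (Chinese female) sounds female, and “中文男” (Chinese male) still sounds female...
` (cosyvoice) webui@1324eb3a3bc4:/workspace/CosyVoice$ pip list Package Version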
absl-py 2.3.1 aiofiles 23.2.1 aiohappyeyeballs 2.6.1 aiohttp 3.12.14 aiosignal 1.4.0 aliyun-python-sdk-core 2.16.0 aliyun-python-sdk-kms 2.16.5 annotated-types 0.7.0 antlr4-python3-runtime 4.9.3 anyio 4.9.0 archspec 0.2.5 astor 0.8.1 asttokens 3.0.0 astunparse 1.6.3 attrs 25.3.0 audioread 3.0.1 beautifulsoup4 4.13.4 blake3 1.0.5 blinker 1.9.0 boltons 24.0.0 Brotli 1.1.0 cachetools 6.1.0 cbor2 5.6.5 certifi 2025.4.26 cffi 1.17.1 chardet 5.2.0 charset-normalizer 3.4.2 click 8.2.1 cloudpickle 3.1.1 cmake 4.0.2 colorama 0.4.6 coloredlogs 15.0.1 compressed-tensors 0.10.2 conda 25.5.0 conda-build 25.5.0 conda_index 0.6.1 conda-libmamba-solver 25.3.0 conda-package-handling 2.4.0 conda_package_streaming 0.11.0 conformer 0.3.2 contourpy 1.3.3 crcmod 1.7 cryptography 45.0.5 cupy-cuda12x 13.5.1 cycler 0.12.1 Cython 3.1.2 decorator 5.2.1 deepspeed 0.15.1 depyf 0.19.0 diffusers 0.34.0 dill 0.4.0 diskcache 5.6.3 distro 1.9.0 dnspython 2.7.0 editdistance 0.8.1 einops 0.8.1 email_validator 2.2.0 evalidate 2.0.5 exceptiongroup 1.3.0 executing 2.2.0 expecttest 0.3.0 fastapi 0.115.6 fastapi-cli 0.0.8 fastapi-cloud-cli 0.1.5 fastrlock 0.8.3 ffmpy 0.6.1 filelock 3.18.0 Flask 3.1.1 flask-cors 6.0.1 flatbuffers 25.2.10 fonttools 4.59.0 frozendict 2.4.6 frozenlist 1.7.0 fsspec 2025.5.1 funasr 1.2.6 gdown 5.1.0 gguf 0.17.1 gradio 5.4.0 gradio_client 1.4.2 grpcio 1.57.0 grpcio-tools 1.57.0 h11 0.16.0 h2 4.2.0 hf-xet 1.1.5 hjson 3.1.0 hpack 4.1.0 httpcore 1.0.9 httptools 0.6.4 httpx 0.28.1 huggingface-hub 0.34.1 humanfriendly 10.0 hydra-core 1.3.2 hyperframe 6.1.0 HyperPyYAML 1.2.2 hypothesis 6.135.0 idna 3.10 importlib_metadata 8.7.0 importlib_resources 6.5.2 inflect 7.3.1 interegular 0.3.3 ipython 9.3.0 ipython_pygments_lexers 1.1.1 itsdangerous 2.2.0 jaconv 0.4.0 jamo 0.4.1 jedi 0.19.2 jieba 0.42.1 Jinja2 3.1.6 jiter 0.10.0 jmespath 0.10.0 joblib 1.5.1 jsonpatch 1.33 jsonpointer 3.0.0 jsonschema 4.24.0 jsonschema-specifications 2025.4.1 kaldifst 1.7.14 kaldiio 2.18.1 kiwisolver 1.4.8 lark 
1.2.2 lazy_loader 0.4 libarchive-c 5.3 libmambapy 2.1.1 librosa 0.10.2 lief 0.16.4 lightning 2.5.2 lightning-utilities 0.15.0 lintrunner 0.12.7 llguidance 0.7.30 llvmlite 0.44.0 lm-format-enforcer 0.10.11 Markdown 3.8.2 markdown-it-py 3.0.0 MarkupSafe 2.1.5 matplotlib 3.7.5 matplotlib-inline 0.1.7 mdurl 0.1.2 menuinst 2.2.0 mistral_common 1.8.3 modelscope 1.20.0 more-itertools 10.7.0 mpmath 1.3.0 msgpack 1.1.0 msgspec 0.19.0 multidict 6.6.3 networkx 3.5 ninja 1.11.1.4 numba 0.61.2 numpy 1.26.4 nvidia-cublas-cu12 12.8.3.14 nvidia-cuda-cupti-cu12 12.8.57 nvidia-cuda-nvrtc-cu12 12.8.61 nvidia-cuda-runtime-cu12 12.8.57 nvidia-cudnn-cu12 9.7.1.26 nvidia-cufft-cu12 11.3.3.41 nvidia-cufile-cu12 1.13.0.11 nvidia-curand-cu12 10.3.9.55 nvidia-cusolver-cu12 11.7.2.55 nvidia-cusparse-cu12 12.5.7.53 nvidia-cusparselt-cu12 0.6.3 nvidia-nccl-cu12 2.26.2 nvidia-nvjitlink-cu12 12.8.61 nvidia-nvtx-cu12 12.8.55 omegaconf 2.3.0 onnx 1.17.0 onnxruntime 1.22.1 onnxruntime-gpu 1.22.0 openai 1.90.0 openai-whisper 20250625 opencv-python-headless 4.11.0.86 optree 0.16.0 orjson 3.11.1 oss2 2.19.1 outlines_core 0.2.10 packaging 25.0 pandas 2.3.1 parso 0.8.4 partial-json-parser 0.2.1.1.post6 pexpect 4.9.0 pickleshare 0.7.5 pillow 11.0.0 pip 24.0 pkginfo 1.12.1.2 pkgutil_resolve_name 1.3.10 platformdirs 4.3.8 pluggy 1.5.0 pooch 1.8.2 prometheus_client 0.22.1 prometheus-fastapi-instrumentator 7.1.0 prompt_toolkit 3.0.51 propcache 0.3.2 protobuf 4.25.0 psutil 7.0.0 ptyprocess 0.7.0 pure_eval 0.2.3 py-cpuinfo 9.0.0 pyarrow 21.0.0 pybase64 1.4.2 pycosat 0.6.6 pycountry 24.6.1 pycparser 2.22 pycryptodome 3.23.0 pydantic 2.10.6 pydantic_core 2.27.2 pydantic-extra-types 2.10.5 pydub 0.25.1 Pygments 2.19.1 pynndescent 0.5.13 pyparsing 3.2.3 PySocks 1.7.1 python-dateutil 2.9.0.post0 python-dotenv 1.1.1 python-etcd 0.4.5 python-json-logger 3.3.0 python-multipart 0.0.12 pytorch-lightning 2.5.2 pytorch-wpe 0.0.1 pytz 2025.2 pyworld 0.3.4 PyYAML 6.0.2 pyzmq 27.0.0 ray 2.48.0 referencing 0.36.2 regex 
2024.11.6 requests 2.32.3 rich 13.7.1 rich-toolkit 0.14.9 rignore 0.6.4 rpds-py 0.25.1 ruamel.yaml 0.18.12 ruamel.yaml.clib 0.2.8 ruff 0.12.5 safehttpx 0.1.6 safetensors 0.5.3 scikit-learn 1.7.1 scipy 1.16.1 semantic-version 2.10.0 sentencepiece 0.2.0 sentry-sdk 2.34.1 setuptools 65.5.0 shellingham 1.5.4 six 1.17.0 sniffio 1.3.1 sortedcontainers 2.4.0 soundfile 0.12.1 soupsieve 2.7 soxr 0.5.0.post1 stack_data 0.6.3 starlette 0.41.3 sympy 1.14.0 tensorboard 2.20.0 tensorboard-data-server 0.7.2 tensorboardX 2.6.4 tensorrt-cu12 10.13.0.35 tensorrt_cu12_bindings 10.13.0.35 tensorrt_cu12_libs 10.13.0.35 threadpoolctl 3.6.0 tiktoken 0.9.0 tokenizers 0.21.4 tomlkit 0.12.0 torch 2.7.1+cu128 torch-complex 0.4.4 torchaudio 2.7.1+cu128 torchelastic 0.2.2 torchmetrics 1.8.0 torchvision 0.22.1+cu128 tqdm 4.67.1 traitlets 5.14.3 transformers 4.54.1 triton 3.3.1 truststore 0.10.1 typeguard 4.4.4 typer 0.16.0 types-dataclasses 0.6.6 typing_extensions 4.14.0 tzdata 2025.2 umap-learn 0.5.9.post2 urllib3 2.4.0 uvicorn 0.30.0 uvloop 0.21.0 vllm 0.10.0 watchfiles 1.1.0 wcwidth 0.2.13 websockets 12.0 Werkzeug 3.1.3 wetext 0.0.8 wget 3.2 wheel 0.45.1 xformers 0.0.31 xgrammar 0.1.21 yarl 1.20.1 zipp 3.22.0 zstandard 0.23.0
`
Please try transformers==4.40.1
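Before re-running synthesis, it can be worth confirming that the environment actually has the pinned version installed. Below is a minimal sketch using only the standard library. The pin `4.40.1` is the value suggested in this thread, not a verified project requirement, and `installed_version` and `find_mismatches` are hypothetical helper names.

```python
from importlib import metadata

# Assumption: this pin is the version suggested in this thread; check the
# repository's requirements.txt for the authoritative value.
EXPECTED = {"transformers": "4.40.1"}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Return {package: (installed, wanted)} for every pin not met exactly."""
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        if found != wanted:
            mismatches[package] = (found, wanted)
    return mismatches


if __name__ == "__main__":
    for package, (found, wanted) in find_mismatches(EXPECTED).items():
        print(f"{package}: installed {found}, expected {wanted}")
```

If the script prints a mismatch, `pip install transformers==4.40.1` inside the same environment should bring it in line. Note that other packages, such as vllm, may declare their own transformers constraints.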
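I am running the CosyVoice2 model CosyVoice2-0.5B. Startup and synthesis complete without errors, but the pronunciation is garbled. Versions: transformers 4.51.3, vllm 0.9.0. Even with the official versions, the audio is still garbled. The model is loaded with `CosyVoice2(args.model_dir, load_jit=True, load_trt=True, load_vllm=True, fp16=True)`.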
Are you seeing garbled speech on your side as well? My issue is described in this thread: #1601