vLLM deployment
Can anyone share their experience deploying CosyVoice2.0 with vLLM?
Did you figure out how to use vLLM?
Check the requirements.txt in the repo, then set up the environment below and it works; set the vllm flag in the model-inference function to True: anyio==4.9.0 asteroid-filterbanks==0.4.0 async-timeout==5.0.1 attrs==25.3.0 certifi==2025.7.14 cffi==1.17.1 clearvoice==0.1.1 cloudpickle==3.1.1 comm==0.2.2 compressed-tensors==0.9.4 conformer==0.3.2 cupy-cuda12x==13.5.1 Cython==3.1.2 deepspeed==0.15.1 diffusers==0.29.0 einops==0.8.1 exceptiongroup==1.3.0 fastrlock==0.8.3 filelock==3.18.0 flatbuffers==25.2.10 frozenlist==1.6.0 fsspec==2024.12.0 gguf==0.17.1 grpcio==1.63.2 grpcio-tools==1.63.2 h11==0.16.0 hf-xet==1.1.2 httpcore==1.0.9 httptools==0.6.4 httpx==0.28.1 huggingface-hub==0.32.3 hydra-core==1.3.2 HyperPyYAML==1.2.2 idna==3.10 importlib_metadata==8.7.0 interegular==0.3.3 jsonschema-specifications==2025.4.1 julius==0.2.7 kaldifst==1.7.14 lazy_loader==0.4 librosa==0.10.2.post1 lm-format-enforcer==0.10.11 MarkupSafe==2.1.5 mkl_fft==1.3.11 mkl_random==1.2.8 mkl-service==2.4.0 modelscope==1.20.0 multidict==6.4.4 ninja==1.11.1.4 numpy==1.26.4 nvidia-cublas-cu12==12.6.4.1 nvidia-cuda-cupti-cu12==12.6.80 nvidia-cuda-nvrtc-cu12==12.6.77 nvidia-cuda-runtime-cu12==12.6.77 nvidia-cudnn-cu12==9.5.1.17 nvidia-cufft-cu12==11.3.0.4 nvidia-cufile-cu12==1.11.1.6 nvidia-curand-cu12==10.3.7.77 nvidia-cusolver-cu12==11.7.1.2 nvidia-cusparse-cu12==12.5.4.2 nvidia-cusparselt-cu12==0.6.3 nvidia-nccl-cu12==2.26.2 nvidia-nvjitlink-cu12==12.6.85 nvidia-nvtx-cu12==12.6.77 onnx==1.16.0 onnxruntime-gpu==1.19.2 openai-whisper==20240930 opentelemetry-api==1.35.0 opentelemetry-exporter-otlp==1.35.0 opentelemetry-exporter-otlp-proto-common==1.35.0 opentelemetry-exporter-otlp-proto-grpc==1.35.0 opentelemetry-exporter-otlp-proto-http==1.35.0 opentelemetry-sdk==1.35.0 opentelemetry-semantic-conventions==0.56b0 opentelemetry-semantic-conventions-ai==0.4.11 packaging==24.2 pip==25.1 prometheus_client==0.22.1 prometheus-fastapi-instrumentator==7.1.0 protobuf==5.26.1 pyannote.audio==3.1.0 pyannote.core==5.0.0 pyannote.database==5.1.3 pyannote.metrics==3.2.1 pyannote.pipeline==3.0.1 
pyarrow==18.1.0 pyasn1==0.6.1 pyasn1_modules==0.4.2 pycparser==2.22 pydantic==2.11.7 pydantic_core==2.33.2 pydantic-extra-types==2.10.5 pytorch-lightning==2.5.1.post0 pytorch-metric-learning==2.8.1 PyYAML==6.0.2 ray==2.47.1 regex==2024.11.6 requests==2.32.4 safetensors==0.5.3 scikit-learn==1.6.1 scipy==1.12.0 sentencepiece==0.2.0 setuptools==78.1.1 soundfile==0.12.1 speechbrain==1.0.3 starlette==0.41.3 tensorboard==2.14.0 tensorboard-data-server==0.7.2 tensorrt-cu12==10.0.1 tensorrt-cu12-bindings==10.0.1 tensorrt-cu12-libs==10.0.1 tokenizers==0.21.2 torch==2.7.0 torch-audiomentations==0.12.0 torch_pitch_shift==1.2.5 torchaudio==2.7.0 torchinfo==1.8.0 torchmetrics==1.7.2 torchtext==0.18.0 torchvision==0.22.0 tqdm==4.67.1 transformers==4.52.4 typing_extensions==4.14.0 urllib3==2.5.0 uvicorn==0.30.0 uvloop==0.21.0 vllm==0.9.0 websockets==12.0 wheel==0.45.1 xformers==0.0.30 yarl==1.20.0
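A quick way to see whether an existing environment drifts from the key pins above is to compare installed versions against a small subset. This is an illustrative sketch (`check_pins` and the `KEY_PINS` selection are hypothetical helpers, not part of CosyVoice or vLLM); the version numbers come from the frozen list above.

```python
# Hypothetical helper: compare installed package versions against a few
# of the pins from the frozen environment listed above.
from importlib.metadata import version, PackageNotFoundError

KEY_PINS = {  # subset of the freeze output that most often drifts
    "torch": "2.7.0",
    "torchaudio": "2.7.0",
    "vllm": "0.9.0",
    "transformers": "4.52.4",
    "onnxruntime-gpu": "1.19.2",
}

def check_pins(pins):
    """Return {package: (expected, found)} for every mismatch or missing package."""
    mismatches = {}
    for pkg, expected in pins.items():
        try:
            found = version(pkg)
        except PackageNotFoundError:
            found = None  # not installed at all
        if found != expected:
            mismatches[pkg] = (expected, found)
    return mismatches
```

Running `check_pins(KEY_PINS)` on a freshly built environment should return an empty dict; anything it reports is a candidate cause of version conflicts.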
First run pip install -r requirements.txt, then pip install the pinned packages listed above.
Just follow the docs. One thing to watch out for: if a previous run failed after the vllm folder under the pretrained model directory (pretrained_models/CosyVoice2-0.5B) had already been created, the config-file generation step will be skipped. Delete that vllm folder and start over, and it works.
I'm running the CosyVoice2 model CosyVoice2-0.5B. Startup and synthesis complete without errors, but the synthesized speech is garbled. I initialize with CosyVoice2(args.model_dir, load_jit=True, load_trt=True, load_vllm=True, fp16=True). I've already installed the official package versions, and the synthesized audio is still garbled.
Are you seeing garbled audio on your side too? My problem is described in this thread: #1601
I just set up a fresh environment, including vLLM. OS: Ubuntu 20.04, GPU: RTX 3060, CUDA 12.6, python==3.10.
In the new environment I first installed torch with the following command (if you need vLLM, you should install the 2.7.0 version instead): pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu126
Then I ran pip install -r requirements.txt (comment out #openai-whisper==20231117 and install it separately; for some reason that package keeps pulling in one torch version after another, at least on my machine). For the other packages, follow the project's instructions.
I tested normal speech inference and it worked on the first try; next I installed vLLM.
When I ran pip install vllm, I noticed torch was downloaded and upgraded to 2.7.0. After installation, I followed vllm_example.py to import the relevant packages and set things up. It runs fine, but initialization takes a long time, at least 3 to 5 minutes.
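To put a number on that 3-5 minute start-up, a plain timer around the initialization call is enough. A minimal sketch; `timed_init` is a hypothetical helper, and the CosyVoice2 call in the usage note assumes the constructor shown earlier in this thread:

```python
# Illustrative helper: time any initialization function and return both
# its result and the elapsed wall-clock seconds.
import time

def timed_init(init_fn, *args, **kwargs):
    """Run init_fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = init_fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

Usage would look like `model, secs = timed_init(lambda: CosyVoice2(model_dir, load_vllm=True))`, which makes it easy to compare cold start-up across vLLM versions or flag combinations.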
Which vLLM version are you using, 0.9.0?
I'd like to ask whether CosyVoice supports vLLM deployment (serving); inference with vLLM works fine for me.
This issue is stale because it has been open for 30 days with no activity.