fish-speech
--compile crash
Self Checks
- [x] This template is only for bug reports. For questions, please visit Discussions.
- [x] I have thoroughly reviewed the project documentation (installation, training, inference) but couldn't find information to solve my problem.
- [x] I have searched for existing issues, including closed ones.
- [x] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
- [x] [FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
- [x] Please do not modify this template and fill in all required fields.
Cloud or Self Hosted
Self Hosted (Source)
Environment Details
Windows 11 - WSL (Ubuntu)
AMD 5090X + 128 GB RAM, RTX 3090 + 24 GB VRAM
Followed all the documentation steps:
- conda, Python 3.10
- pip install -e .
Steps to Reproduce
python -m tools.api_server \
--listen 0.0.0.0:8080 \
--llama-checkpoint-path "checkpoints/fish-speech-1.5" \
--decoder-checkpoint-path "checkpoints/fish-speech-1.5/firefly-gan-vq-fsq-8x1024-21hz-generator.pth" \
--decoder-config-name firefly_gan_vq \
--compile
✔️ Expected Behavior
The API server launches locally.
❌ Actual Behavior
(fish-speech) xxx@win11:/mnt/d/lab/fish-speech$ ./server.sh
INFO: Started server process [1354]
INFO: Waiting for application startup.
2025-01-26 22:27:15.968 | INFO | fish_speech.models.text2semantic.inference:load_model:681 - Restored model from checkpoint
2025-01-26 22:27:15.968 | INFO | fish_speech.models.text2semantic.inference:load_model:687 - Using DualARTransformer
2025-01-26 22:27:15.968 | INFO | fish_speech.models.text2semantic.inference:load_model:695 - Compiling function...
2025-01-26 22:27:16.587 | INFO | tools.server.model_manager:load_llama_model:99 - LLAMA model loaded.
/home/xxx/miniconda3/envs/fish-speech/lib/python3.10/site-packages/vector_quantize_pytorch/vector_quantize_pytorch.py:445: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
@autocast(enabled = False)
/home/xxx/miniconda3/envs/fish-speech/lib/python3.10/site-packages/vector_quantize_pytorch/vector_quantize_pytorch.py:630: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
@autocast(enabled = False)
/home/xxx/miniconda3/envs/fish-speech/lib/python3.10/site-packages/vector_quantize_pytorch/finite_scalar_quantization.py:147: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
@autocast(enabled = False)
/home/xxx/miniconda3/envs/fish-speech/lib/python3.10/site-packages/vector_quantize_pytorch/lookup_free_quantization.py:209: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
@autocast(enabled = False)
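Aside: the FutureWarnings above come from the vector_quantize_pytorch dependency, not fish-speech, and are harmless here. For reference, the migration the warning asks for looks roughly like this (a sketch of the PyTorch API change, not a patch to this repo):

```python
import torch

# Deprecated decorator form (what vector_quantize_pytorch still uses):
#   @torch.cuda.amp.autocast(enabled=False)
# Replacement suggested by the warning:
@torch.amp.autocast("cuda", enabled=False)
def quantize(x: torch.Tensor) -> torch.Tensor:
    # Runs in full precision even when called inside an autocast region.
    return x.float() * 2.0
```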
Uncaught exception in compile_worker subprocess
Traceback (most recent call last):
File "/home/xxx/miniconda3/envs/fish-speech/lib/python3.10/site-packages/torch/_inductor/compile_worker/__main__.py", line 38, in main
pre_fork_setup()
File "/home/xxx/miniconda3/envs/fish-speech/lib/python3.10/site-packages/torch/_inductor/async_compile.py", line 62, in pre_fork_setup
from triton.compiler.compiler import triton_key
File "/home/xxx/miniconda3/envs/fish-speech/lib/python3.10/site-packages/triton/__init__.py", line 8, in <module>
from .runtime import (
File "/home/xxx/miniconda3/envs/fish-speech/lib/python3.10/site-packages/triton/runtime/__init__.py", line 1, in <module>
from .autotuner import (Autotuner, Config, Heuristics, OutOfResources, autotune, heuristics)
File "/home/xxx/miniconda3/envs/fish-speech/lib/python3.10/site-packages/triton/runtime/autotuner.py", line 7, in <module>
from ..testing import do_bench
File "/home/xxx/miniconda3/envs/fish-speech/lib/python3.10/site-packages/triton/testing.py", line 7, in <module>
from . import language as tl
File "/home/xxx/miniconda3/envs/fish-speech/lib/python3.10/site-packages/triton/language/__init__.py", line 6, in <module>
from .standard import (
2025-01-26 22:27:22.006 | INFO | fish_speech.models.vqgan.inference:load_model:46 - Loaded model: <All keys matched successfully>
2025-01-26 22:27:22.006 | INFO | tools.server.model_manager:load_decoder_model:107 - Decoder model loaded.
2025-01-26 22:27:22.016 | INFO | fish_speech.models.text2semantic.inference:generate_long:788 - Encoded text: Hello world.
2025-01-26 22:27:22.016 | INFO | fish_speech.models.text2semantic.inference:generate_long:806 - Generating sentence 1/1 of sample 1/1
0%| | 0/1023 [00:00<?, ?it/s]/home/xxx/miniconda3/envs/fish-speech/lib/python3.10/contextlib.py:103: FutureWarning: `torch.backends.cuda.sdp_kernel()` is deprecated. In the future, this context manager will be removed. Please see `torch.nn.attention.sdpa_kernel()` for the new context manager, with updated signature.
self.gen = func(*args, **kwds)
0%| | 0/1023 [00:16<?, ?it/s]
ERROR: Traceback (most recent call last):
File "/home/xxx/miniconda3/envs/fish-speech/lib/python3.10/site-packages/kui/asgi/lifespan.py", line 36, in __call__
await result
File "/mnt/d/lab/fish-speech/tools/api_server.py", line 78, in initialize_app
app.state.model_manager = ModelManager(
File "/mnt/d/lab/fish-speech/tools/server/model_manager.py", line 65, in __init__
self.warm_up(self.tts_inference_engine)
File "/mnt/d/lab/fish-speech/tools/server/model_manager.py", line 121, in warm_up
list(inference(request, tts_inference_engine))
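The key line is "Uncaught exception in compile_worker subprocess": the Inductor compile worker dies while importing triton, which torch.compile relies on for GPU codegen. A quick way to surface the full underlying error (my own diagnostic sketch, not part of fish-speech) is to import triton directly in the same environment:

```python
# Diagnostic sketch (an assumption, not from the report): the worker crashes
# inside `import triton`, so importing it directly should print the complete
# error that the traceback above truncates.
import torch

print("torch:", torch.__version__)
print("cuda available:", torch.cuda.is_available())  # is the RTX 3090 visible from WSL?

try:
    import triton
    print("triton:", triton.__version__)
except Exception as exc:
    print("triton import failed:", repr(exc))
```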
If I remove the --compile flag, it works, but I'd like to keep --compile to improve tokens/s...
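To narrow it down, it may help to check whether torch.compile works at all in this WSL environment, independent of fish-speech. A minimal smoke test (illustrative shapes, nothing fish-speech specific):

```python
import torch

def f(x: torch.Tensor) -> torch.Tensor:
    return torch.sin(x) + torch.cos(x)

compiled = torch.compile(f)           # Inductor backend by default, the same path --compile takes
x = torch.randn(1024, device="cuda")
print(compiled(x).sum().item())       # the first call triggers the actual Triton compilation
```

If this fails with the same triton import error, the problem is in the PyTorch/Triton installation rather than in fish-speech.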
I've encountered exactly the same problem as you! I'm looking forward to having it solved. 😀
I have the same question. Does anyone know how to fix it?