PermissionError and Segmentation Fault with torch.compile (Inductor) during Model Warm-up when --compile is set
Self Checks
- [x] This template is only for bug reports. For questions, please visit Discussions.
- [x] I have thoroughly reviewed the project documentation (installation, training, inference) but couldn't find information to solve my problem.
- [x] I have searched for existing issues, including closed ones.
- [x] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
- [x] [FOR CHINESE USERS] Please submit issues in English, otherwise they will be closed. Thank you! :)
- [x] Please do not modify this template and fill in all required fields.
Cloud or Self Hosted
Self Hosted (Source)
Environment Details
- OS: WSL Ubuntu 22.04
- Python Version: 3.10.16
- PyTorch Version: 2.4.1+cu121
- CUDA Version: 11.5 (nvcc release 11.5, V11.5.119, Build cuda_11.5.r11.5/compiler.30672275_0)
- GPU: NVIDIA GeForce RTX 3060
- Fish-Speech Version: 1.5
- torch version: 2.4.1
- torchvision version: 0.19.1+cu121
- torchaudio version: 0.19.1+cu121
- torchtext version: 2.4.1+cu121
Steps to Reproduce
I have started the server with --compile successfully before and it worked fine, but I don't know why it doesn't work this time:
- Set up the fish-speech environment (including dependencies).
- Run the API server with the --compile flag:

python -m tools.api_server \
    --listen 0.0.0.0:7865 \
    --llama-checkpoint-path checkpoints/fish-speech-1.5 \
    --decoder-checkpoint-path checkpoints/fish-speech-1.5/firefly-gan-vq-fsq-8x1024-21hz-generator.pth \
    --decoder-config-name firefly_gan_vq \
    --compile
✔️ Expected Behavior
The application starts up successfully.
❌ Actual Behavior
I'm encountering a PermissionError followed by a segmentation fault when running the TTS model (fish-speech) with torch.compile using the Inductor backend. The error occurs during the model warm-up phase (model_manager.warm_up). The issue seems related to file access permissions in the temporary directory used by TorchInductor, and it persists even after attempts to adjust permissions and to use shutil.move as a workaround, as suggested in https://github.com/unslothai/unsloth/issues/1999.
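A related mitigation (a minimal sketch only, not verified to fix this WSL case) is to redirect both compile caches away from /tmp to a user-owned directory before torch is imported. TORCHINDUCTOR_CACHE_DIR and TRITON_CACHE_DIR are standard environment variables; the path below is just an example:

```python
# Hedged workaround sketch: point the Inductor and Triton caches at a
# user-owned directory. Set these before `import torch` to be safe.
import os
from pathlib import Path

cache_root = Path.home() / ".cache" / "torch_compile"
cache_root.mkdir(parents=True, exist_ok=True)
os.environ["TORCHINDUCTOR_CACHE_DIR"] = str(cache_root / "inductor")
os.environ["TRITON_CACHE_DIR"] = str(cache_root / "triton")
```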
Error Log:
Run as regular user:
(tts) koma@LAPTOP-UFED71OD:~/fish-speech$ python -m tools.api_server --listen 0.0.0.0:7865 --llama-checkpoint-path checkpoints/fish-speech-1.5 --decoder-checkpoint-path checkpoints/fish-speech-1.5/firefly-gan-vq-fsq-8x1024-21hz-generator.pth --decoder-config-name firefly_gan_vq --compile
INFO: Started server process [1330]
INFO: Waiting for application startup.
2025-03-23 11:28:20.578 | INFO | fish_speech.models.text2semantic.inference:load_model:681 - Restored model from checkpoint
2025-03-23 11:28:20.578 | INFO | fish_speech.models.text2semantic.inference:load_model:687 - Using DualARTransformer
2025-03-23 11:28:20.578 | INFO | fish_speech.models.text2semantic.inference:load_model:695 - Compiling function...
2025-03-23 11:28:22.062 | INFO | tools.server.model_manager:load_llama_model:99 - LLAMA model loaded.
/home/koma/.conda/envs/tts/lib/python3.10/site-packages/vector_quantize_pytorch/vector_quantize_pytorch.py:445: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
@autocast(enabled = False)
/home/koma/.conda/envs/tts/lib/python3.10/site-packages/vector_quantize_pytorch/vector_quantize_pytorch.py:630: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
@autocast(enabled = False)
/home/koma/.conda/envs/tts/lib/python3.10/site-packages/vector_quantize_pytorch/finite_scalar_quantization.py:147: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
@autocast(enabled = False)
/home/koma/.conda/envs/tts/lib/python3.10/site-packages/vector_quantize_pytorch/lookup_free_quantization.py:209: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
@autocast(enabled = False)
2025-03-23 11:28:27.765 | INFO | fish_speech.models.vqgan.inference:load_model:46 - Loaded model: <All keys matched successfully>
2025-03-23 11:28:27.766 | INFO | tools.server.model_manager:load_decoder_model:107 - Decoder model loaded.
2025-03-23 11:28:27.795 | INFO | fish_speech.models.text2semantic.inference:generate_long:788 - Encoded text: Hello world.
2025-03-23 11:28:27.797 | INFO | fish_speech.models.text2semantic.inference:generate_long:806 - Generating sentence 1/1 of sample 1/1
0%| | 0/1023 [00:00<?, ?it/s]/home/koma/.conda/envs/tts/lib/python3.10/contextlib.py:103: FutureWarning: `torch.backends.cuda.sdp_kernel()` is deprecated. In the future, this context manager will be removed. Please see `torch.nn.attention.sdpa_kernel()` for the new context manager, with updated signature.
self.gen = func(*args, **kwds)
W0323 11:30:52.749000 140116945266240 torch/fx/experimental/symbolic_shapes.py:4449] [0/0] xindex is not in var_ranges, defaulting to unknown range.
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.ttir.tmp.pid_1465_a4aea7d5-c578-456b-a4e9-f2c3298911a3 -> /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.ttir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.ttgir.tmp.pid_1465_15dc99d5-a380-438c-b1da-c05e1e98490f -> /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.ttgir
Segmentation fault (core dumped) /tmp/torchinductor_koma/lo/.1330.140116945266240.tmp -> /tmp/torchinductor_koma/lo/clogj3r7bsakyus6wz3yqefmjlhto65qemf2qfe4ns7mp52pxd6n.py
0%| | 0/1023 [02:41<?, ?it/s]
ERROR: Traceback (most recent call last):
File "/home/koma/.conda/envs/tts/lib/python3.10/site-packages/kui/asgi/lifespan.py", line 36, in __call__
await result
File "/home/koma/fish-speech/tools/api_server.py", line 82, in initialize_app
app.state.model_manager = ModelManager(
File "/home/koma/fish-speech/tools/server/model_manager.py", line 65, in __init__
self.warm_up(self.tts_inference_engine)
File "/home/koma/fish-speech/tools/server/model_manager.py", line 121, in warm_up
list(inference(request, tts_inference_engine))
File "/home/koma/fish-speech/tools/server/inference.py", line 25, in inference_wrapper
raise HTTPException(
baize.exceptions.HTTPException: (500, '\'backend=\\\'inductor\\\' raised:\\nPermissionError: [Errno 13] Permission denied: \\\'/tmp/torchinductor_koma/lo/.1330.140116945266240.tmp\\\' -> \\\'/tmp/torchinductor_koma/lo/clogj3r7bsakyus6wz3yqefmjlhto65qemf2qfe4ns7mp52pxd6n.py\\\'\\n\\nSet TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information\\n\\n\\nYou can suppress this exception and fall back to eager by setting:\\n import torch._dynamo\\n torch._dynamo.config.suppress_errors = True\\n\'')
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.llir.tmp.pid_1465_7cc0f06c-2609-4d20-a45e-ae625a161c39 -> /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.llir
ERROR: Application startup failed. Exiting.
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.ptx.tmp.pid_1465_aaab7612-5f9d-4669-bf5e-1f685a9eb20a -> /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.ptx
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.cubin.tmp.pid_1465_7e66aa04-7c5f-4624-a304-84f1c6297651 -> /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.cubin
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.json.tmp.pid_1465_c426a30f-3d7c-4382-ae50-3281c8bfc682 -> /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.json
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/__grp__triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.json.tmp.pid_1465_3c4ed8ff-5f40-46aa-bad2-06ea8a68ff97 -> /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/__grp__triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.json
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ttir.tmp.pid_1465_8b7a4a8b-f4fe-4a5d-b211-dcb7b802a116 -> /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ttir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ttgir.tmp.pid_1465_09c6789f-443a-4516-8364-6ed1dca9a77e -> /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ttgir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ttir.tmp.pid_1463_63dde67e-2ce1-4689-9dd6-201faa09fbbd -> /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ttir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ttgir.tmp.pid_1463_937aec16-747f-4c7d-ad14-03e631c2c8fb -> /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ttgir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.llir.tmp.pid_1465_c3a159f6-b382-4c1a-a151-ad6ec86f6930 -> /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.llir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ptx.tmp.pid_1465_b79ce8db-924e-43b3-8d93-efc4e1b15264 -> /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ptx
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.llir.tmp.pid_1463_ab90bde9-214a-4ff5-a32a-3d6a7ef50ced -> /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.llir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ptx.tmp.pid_1463_7a0e89a4-53ff-4fd9-835b-1a8e88d92ee6 -> /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ptx
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.cubin.tmp.pid_1463_2b7ccc2a-537c-4bc7-9c26-1428a6db4caa -> /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.cubin
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.json.tmp.pid_1463_f222d114-66c1-4eca-a03a-0bb33e9bd35d -> /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.json
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/__grp__triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.json.tmp.pid_1463_739c28d6-8df0-4417-b096-a23454ab5497 -> /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/__grp__triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.json
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.cubin.tmp.pid_1465_65ecf108-ed86-4aca-83ff-afb63f98f017 -> /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.cubin
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.json.tmp.pid_1465_87b5dcad-2b6f-4cf1-93ef-7e3748fde4cc -> /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.json
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/__grp__triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.json.tmp.pid_1465_82aa0069-a027-4475-be22-dcd0977f4a30 -> /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/__grp__triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.json
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ttir.tmp.pid_1463_7efde184-dd6f-445f-bbbe-be559b7db26f -> /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ttir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ttgir.tmp.pid_1463_e4bcbad9-34eb-4fe8-a5ef-990efcb22576 -> /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ttgir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ttir.tmp.pid_1465_0fac9c6a-a91a-4b3c-ba58-42e239e93e9f -> /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ttir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ttgir.tmp.pid_1465_bc0f6074-6542-4b19-a7e6-0d09e123720f -> /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ttgir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.llir.tmp.pid_1463_1c8ec382-65e8-4066-b82c-ccdfec37d317 -> /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.llir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ptx.tmp.pid_1463_a6b78858-ec93-4366-b195-adadf53f3575 -> /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ptx
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.llir.tmp.pid_1465_6ca938da-2740-4e83-9180-7f9e2199d222 -> /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.llir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ptx.tmp.pid_1465_abae5c09-5d18-4d8b-ad58-52b30fc1b2b9 -> /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ptx
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.cubin.tmp.pid_1463_058654c1-bc84-4853-84e0-46b42ac6931c -> /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.cubin
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.json.tmp.pid_1463_eeb3c09f-052e-422e-ad6e-981dc43232d4 -> /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.json
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/__grp__triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.json.tmp.pid_1463_10e1f3f6-e8af-4ce6-b46a-68061ee4355c -> /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/__grp__triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.json
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.cubin.tmp.pid_1465_9e493fbf-9a1f-4b56-b7b6-261a2b6bead1 -> /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.cubin
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.json.tmp.pid_1465_49237752-09ac-450d-b429-970e476f09af -> /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.json
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/__grp__triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.json.tmp.pid_1465_75cf45bf-0969-43d4-b9f1-c2107e319764 -> /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/__grp__triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.json
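The HTTPException above carries Inductor's own suggestion for a temporary fallback. A minimal sketch of that workaround (it disables the torch.compile speedup and only masks the underlying PermissionError rather than fixing the cache permissions):

```python
# Taken from the error message's own suggestion: suppress Dynamo/Inductor
# errors and fall back to eager execution.
import torch._dynamo

torch._dynamo.config.suppress_errors = True
```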
Run with sudo:
(tts) koma@DESKTOP-O5IAPM2:~/fish-speech$ sudo /home/koma/.conda/envs/tts/bin/python -m tools.api_server --listen 0.0.0.0:7865 --llama-checkpoint-path checkpoints/fish-speech-1.5 --decoder-checkpoint-path checkpoints/fish-speech-1.5/firefly-gan-vq-fsq-8x1024-21hz-generator.pth --decoder-config-name firefly_gan_vq --compile
INFO: Started server process [8528]
INFO: Waiting for application startup.
2025-03-23 10:45:56.124 | INFO | fish_speech.models.text2semantic.inference:load_model:681 - Restored model from checkpoint
2025-03-23 10:45:56.124 | INFO | fish_speech.models.text2semantic.inference:load_model:687 - Using DualARTransformer
2025-03-23 10:45:56.124 | INFO | fish_speech.models.text2semantic.inference:load_model:695 - Compiling function...
2025-03-23 10:45:56.842 | INFO | tools.server.model_manager:load_llama_model:99 - LLAMA model loaded.
Traceback (most recent call last):
File "/home/koma/.conda/envs/tts/lib/python3.10/site-packages/torch/_inductor/compile_worker/__main__.py", line 7, in <module>
from torch._inductor.async_compile import pre_fork_setup
File "/home/koma/.conda/envs/tts/lib/python3.10/site-packages/torch/__init__.py", line 2263, in <module>
_logging._init_logs()
File "/home/koma/.conda/envs/tts/lib/python3.10/site-packages/torch/_logging/_internal.py", line 884, in _init_logs
_update_log_state_from_env()
File "/home/koma/.conda/envs/tts/lib/python3.10/site-packages/torch/_logging/_internal.py", line 716, in _update_log_state_from_env
log_state = _parse_log_settings(log_setting)
File "/home/koma/.conda/envs/tts/lib/python3.10/site-packages/torch/_logging/_internal.py", line 660, in _parse_log_settings
raise ValueError(_invalid_settings_err_msg(settings))
ValueError:
Invalid log settings: torch._dynamo=DEBUG, must be a comma separated list of fully
qualified module names, registered log names or registered artifact names.
For more info on various settings, try TORCH_LOGS="help"
Valid settings:
all, dynamo, aot, autograd, inductor, dynamic, torch, distributed, c10d, ddp, pp, fsdp, onnx, export, aot_graphs, graph_sizes, bytecode, graph_code, not_implemented, custom_format_test_artifact, graph_breaks, cudagraphs, kernel_code, fusion, recompiles, output_code, onnx_diagnostics, recompiles_verbose, trace_bytecode, compiled_autograd, schedule, trace_source, overlap, perf_hints, trace_call, sym_node, ddp_graphs, verbose_guards, graph, compiled_autograd_verbose, guards, aot_joint_graph, post_grad_graphs
/home/koma/.conda/envs/tts/lib/python3.10/site-packages/vector_quantize_pytorch/vector_quantize_pytorch.py:445: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
@autocast(enabled = False)
/home/koma/.conda/envs/tts/lib/python3.10/site-packages/vector_quantize_pytorch/vector_quantize_pytorch.py:630: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
@autocast(enabled = False)
/home/koma/.conda/envs/tts/lib/python3.10/site-packages/vector_quantize_pytorch/finite_scalar_quantization.py:147: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
@autocast(enabled = False)
/home/koma/.conda/envs/tts/lib/python3.10/site-packages/vector_quantize_pytorch/lookup_free_quantization.py:209: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
@autocast(enabled = False)
2025-03-23 10:45:57.498 | INFO | fish_speech.models.vqgan.inference:load_model:46 - Loaded model: <All keys matched successfully>
2025-03-23 10:45:57.499 | INFO | tools.server.model_manager:load_decoder_model:107 - Decoder model loaded.
2025-03-23 10:45:57.511 | INFO | fish_speech.models.text2semantic.inference:generate_long:788 - Encoded text: Hello world.
2025-03-23 10:45:57.511 | INFO | fish_speech.models.text2semantic.inference:generate_long:806 - Generating sentence 1/1 of sample 1/1
0%| | 0/1023 [00:00<?, ?it/s]/home/koma/.conda/envs/tts/lib/python3.10/contextlib.py:103: FutureWarning: `torch.backends.cuda.sdp_kernel()` is deprecated. In the future, this context manager will be removed. Please see `torch.nn.attention.sdpa_kernel()` for the new context manager, with updated signature.
self.gen = func(*args, **kwds)
W0323 10:46:39.124000 139948787234368 torch/fx/experimental/symbolic_shapes.py:4449] [0/0] xindex is not in var_ranges, defaulting to unknown range.
Segmentation fault (core dumped) /tmp/torchinductor_root/q7/.8528.139948787234368.tmp -> /tmp/torchinductor_root/q7/cq7aqs2ot34rpqjm36euezlogdt6eptsfb2ihhipmgx4f3prrecf.py
0%| | 0/1023 [00:47<?, ?it/s]
ERROR: Traceback (most recent call last):
File "/home/koma/.conda/envs/tts/lib/python3.10/site-packages/kui/asgi/lifespan.py", line 36, in __call__
await result
File "/home/koma/fish-speech/tools/api_server.py", line 100, in initialize_app
app.state.model_manager = ModelManager(
File "/home/koma/fish-speech/tools/server/model_manager.py", line 65, in __init__
self.warm_up(self.tts_inference_engine)
File "/home/koma/fish-speech/tools/server/model_manager.py", line 121, in warm_up
list(inference(request, tts_inference_engine))
File "/home/koma/fish-speech/tools/server/inference.py", line 36, in inference_wrapper
raise HTTPException(
baize.exceptions.HTTPException: (500, '\'backend=\\\'inductor\\\' raised:\\nPermissionError: [Errno 13] Permission denied: \\\'/tmp/torchinductor_root/q7/.8528.139948787234368.tmp\\\' -> \\\'/tmp/torchinductor_root/q7/cq7aqs2ot34rpqjm36euezlogdt6eptsfb2ihhipmgx4f3prrecf.py\\\'\\n\\nSet TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information\\n\'')
ERROR: Application startup failed. Exiting.
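Note that the sudo run also triggers a ValueError because TORCH_LOGS was set to torch._dynamo=DEBUG, which is not a registered log name; the error text itself suggests TORCH_LOGS="+dynamo". A rough Python equivalent (a sketch assuming PyTorch 2.4's private torch._logging API, whose behavior may vary across versions):

```python
# Roughly equivalent to TORCH_LOGS="+dynamo" per the error message's hint.
# torch._logging is a private API, so treat this as a sketch.
import logging

import torch._logging

torch._logging.set_logs(dynamo=logging.DEBUG)
```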
This looks like a conflict between torch and Windows/WSL? I'll try the same versions to see whether it is a common problem. Could you tell me more about your CUDA setup, e.g., is your NVIDIA driver version 535 or 550?
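For reference, the torch-side build info can be printed with a few lines (a small sketch; the driver version itself, e.g. 535 vs 550, is reported by nvidia-smi):

```python
# Collect the torch/CUDA versions relevant to this report.
import torch

print(torch.__version__)    # e.g. 2.4.1+cu121
print(torch.version.cuda)   # CUDA runtime the wheel was built against
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```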
I have encountered a similar situation. How can I fix it?
python -m tools.api_server --listen 0.0.0.0:7865 --llama-checkpoint-path checkpoints/fish-speech-1.5 --decoder-checkpoint-path checkpoints/fish-speech-1.5/firefly-gan-vq-fsq-8x1024-21hz-generator.pth --decoder-config-name firefly_gan_vq --compile
INFO: Started server process [552993]
INFO: Waiting for application startup.
2025-04-08 11:54:02.594 | INFO | fish_speech.models.text2semantic.inference:load_model:678 - Restored model from checkpoint
2025-04-08 11:54:02.594 | INFO | fish_speech.models.text2semantic.inference:load_model:684 - Using DualARTransformer
2025-04-08 11:54:02.594 | INFO | fish_speech.models.text2semantic.inference:load_model:692 - Compiling function...
2025-04-08 11:54:02.613 | INFO | tools.server.model_manager:load_llama_model:99 - LLAMA model loaded.
/root/miniconda3/envs/fish-speechv2/lib/python3.10/site-packages/vector_quantize_pytorch/vector_quantize_pytorch.py:445: FutureWarning: torch.cuda.amp.autocast(args...) is deprecated. Please use torch.amp.autocast('cuda', args...) instead.
@autocast(enabled = False)
/root/miniconda3/envs/fish-speechv2/lib/python3.10/site-packages/vector_quantize_pytorch/vector_quantize_pytorch.py:630: FutureWarning: torch.cuda.amp.autocast(args...) is deprecated. Please use torch.amp.autocast('cuda', args...) instead.
@autocast(enabled = False)
/root/miniconda3/envs/fish-speechv2/lib/python3.10/site-packages/vector_quantize_pytorch/finite_scalar_quantization.py:147: FutureWarning: torch.cuda.amp.autocast(args...) is deprecated. Please use torch.amp.autocast('cuda', args...) instead.
@autocast(enabled = False)
/root/miniconda3/envs/fish-speechv2/lib/python3.10/site-packages/vector_quantize_pytorch/lookup_free_quantization.py:209: FutureWarning: torch.cuda.amp.autocast(args...) is deprecated. Please use torch.amp.autocast('cuda', args...) instead.
@autocast(enabled = False)
2025-04-08 11:54:03.587 | INFO | fish_speech.models.vqgan.inference:load_model:46 - Loaded model: <All keys matched successfully>
2025-04-08 11:54:03.588 | INFO | tools.server.model_manager:load_decoder_model:107 - Decoder model loaded.
2025-04-08 11:54:03.598 | INFO | fish_speech.models.text2semantic.inference:generate_long:785 - Encoded text: Hello world.
2025-04-08 11:54:03.598 | INFO | fish_speech.models.text2semantic.inference:generate_long:803 - Generating sentence 1/1 of sample 1/1
0%| | 0/1023 [00:00<?, ?it/s]/root/miniconda3/envs/fish-speechv2/lib/python3.10/contextlib.py:103: FutureWarning: torch.backends.cuda.sdp_kernel() is deprecated. In the future, this context manager will be removed. Please see torch.nn.attention.sdpa_kernel() for the new context manager, with updated signature.
self.gen = func(*args, **kwds)
V0408 11:54:05.397000 552993 site-packages/torch/_dynamo/convert_frame.py:1345] skipping: _is_skip_guard_eval_unsafe_stance (reason: in skipfiles, file: /root/miniconda3/envs/fish-speechv2/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py)
I0408 11:54:05.399000 552993 site-packages/torch/_dynamo/utils.py:1162] [0/0] ChromiumEventLogger initialized with id 08d0dda0-46f9-477c-8710-febad32f5c11
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] torchdynamo start compiling decode_one_token_ar /bigdata/xiaozhi/tts/fish-speechv2/fish_speech/models/text2semantic/inference.py:249, stack (elided 4 frames):
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] File "/root/miniconda3/envs/fish-speechv2/lib/python3.10/threading.py", line 973, in _bootstrap
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] self._bootstrap_inner()
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] File "/root/miniconda3/envs/fish-speechv2/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] self.run()
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] File "/root/miniconda3/envs/fish-speechv2/lib/python3.10/threading.py", line 953, in run
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] self._target(*self._args, **self._kwargs)
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] File "/bigdata/xiaozhi/tts/fish-speechv2/fish_speech/models/text2semantic/inference.py", line 928, in worker
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] for chunk in generate_long(
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] File "/bigdata/xiaozhi/tts/fish-speechv2/fish_speech/models/text2semantic/inference.py", line 837, in generate_long
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] y = generate(
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] File "/root/miniconda3/envs/fish-speechv2/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] return func(*args, **kwargs)
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] File "/root/miniconda3/envs/fish-speechv2/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] return func(*args, **kwargs)
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] File "/bigdata/xiaozhi/tts/fish-speechv2/fish_speech/models/text2semantic/inference.py", line 458, in generate
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] x = decode_n_tokens(
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] File "/bigdata/xiaozhi/tts/fish-speechv2/fish_speech/models/text2semantic/inference.py", line 378, in decode_n_tokens
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] next_token = decode_one_token(
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0]
I0408 11:54:05.403000 552993 site-packages/torch/_dynamo/symbolic_convert.py:2706] [0/0] Step 1: torchdynamo start tracing decode_one_token_ar /bigdata/xiaozhi/tts/fish-speechv2/fish_speech/models/text2semantic/inference.py:249
I0408 11:54:05.403000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:3192] [0/0] create_env
V0408 11:54:05.406000 552993 site-packages/torch/_dynamo/symbolic_convert.py:932] [0/0] [__trace_source] TRACE starts_line /bigdata/xiaozhi/tts/fish-speechv2/fish_speech/models/text2semantic/inference.py:258 in decode_one_token_ar (decode_one_token_ar)
V0408 11:54:05.406000 552993 site-packages/torch/_dynamo/symbolic_convert.py:932] [0/0] [__trace_source] torch.compiler.cudagraph_mark_step_begin()
V0408 11:54:05.425000 552993 site-packages/torch/_dynamo/symbolic_convert.py:955] [0/0] [__trace_bytecode] TRACE LOAD_GLOBAL torch []
V0408 11:54:05.426000 552993 site-packages/torch/_dynamo/symbolic_convert.py:955] [0/0] [__trace_bytecode] TRACE LOAD_ATTR compiler [PythonModuleVariable(<module 'torch' from '/root/miniconda3/envs/fish-speechv2/lib/python3.10/site-packages/torch/__init__.py'>)]
V0408 11:54:05.426000 552993 site-packages/torch/_dynamo/symbolic_convert.py:955] [0/0] [__trace_bytecode] TRACE LOAD_ATTR cudagraph_mark_step_begin [PythonModuleVariable(<module 'torch.compiler' from '/root/miniconda3/envs/fish-speechv2/lib/python3.10/site-packages/torch/compiler/__init__.py'>)]
V0408 11:54:05.428000 552993 site-packages/torch/_dynamo/symbolic_convert.py:955] [0/0] [__trace_bytecode] TRACE CALL_FUNCTION 0 [SkipFunctionVariable()]
V0408 11:54:05.428000 552993 site-packages/torch/_dynamo/symbolic_convert.py:973] [0/0] empty checkpoint
0%| | 0/1023 [00:00<?, ?it/s]
ERROR: Traceback (most recent call last):
File "/root/miniconda3/envs/fish-speechv2/lib/python3.10/site-packages/kui/asgi/lifespan.py", line 36, in call
await result
File "/bigdata/xiaozhi/tts/fish-speechv2/tools/api_server.py", line 83, in initialize_app
app.state.model_manager = ModelManager(
File "/bigdata/xiaozhi/tts/fish-speechv2/tools/server/model_manager.py", line 65, in init
self.warm_up(self.tts_inference_engine)
File "/bigdata/xiaozhi/tts/fish-speechv2/tools/server/model_manager.py", line 121, in warm_up
list(inference(request, tts_inference_engine))
File "/bigdata/xiaozhi/tts/fish-speechv2/tools/server/inference.py", line 25, in inference_wrapper
raise HTTPException(
baize.exceptions.HTTPException: (500, '\'skip function cudagraph_mark_step_begin in file /root/miniconda3/envs/fish-speechv2/lib/python3.10/site-packages/torch/compiler/__init__.py\'\n\nfrom user code:\n File "/bigdata/xiaozhi/tts/fish-speechv2/fish_speech/models/text2semantic/inference.py", line 258, in decode_one_token_ar\n torch.compiler.cudagraph_mark_step_begin()\n\n\nYou can suppress this exception and fall back to eager by setting:\n import torch._dynamo\n torch._dynamo.config.suppress_errors = True\n')
ERROR: Application startup failed. Exiting.
I0408 11:54:05.436000 552993 site-packages/torch/_dynamo/eval_frame.py:398] TorchDynamo attempted to trace the following frames: [
I0408 11:54:05.436000 552993 site-packages/torch/_dynamo/eval_frame.py:398] * decode_one_token_ar /bigdata/xiaozhi/tts/fish-speechv2/fish_speech/models/text2semantic/inference.py:249
I0408 11:54:05.436000 552993 site-packages/torch/_dynamo/eval_frame.py:398] ]
I0408 11:54:05.437000 552993 site-packages/torch/_dynamo/utils.py:446] TorchDynamo compilation metrics:
I0408 11:54:05.437000 552993 site-packages/torch/_dynamo/utils.py:446] Function, Runtimes (s)
I0408 11:54:05.437000 552993 site-packages/torch/_dynamo/utils.py:446] _compile.compile_inner, 0.0277
V0408 11:54:05.437000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats constrain_symbol_range: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:05.437000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats defer_runtime_assert: CacheInfo(hits=0, misses=0, maxsize=256, currsize=0)
V0408 11:54:05.437000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats evaluate_expr: CacheInfo(hits=0, misses=0, maxsize=256, currsize=0)
V0408 11:54:05.438000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats _simplify_floor_div: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:05.438000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats _maybe_guard_rel: CacheInfo(hits=0, misses=0, maxsize=256, currsize=0)
V0408 11:54:05.438000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats _find: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:05.438000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats has_hint: CacheInfo(hits=0, misses=0, maxsize=256, currsize=0)
V0408 11:54:05.439000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats size_hint: CacheInfo(hits=0, misses=0, maxsize=256, currsize=0)
V0408 11:54:05.439000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats simplify: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:05.439000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats _update_divisible: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:05.439000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats replace: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:05.439000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats _maybe_evaluate_static: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:05.440000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats get_implications: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:05.440000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats get_axioms: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:05.440000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats _maybe_evaluate_static_worker: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:05.440000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats safe_expand: CacheInfo(hits=0, misses=0, maxsize=256, currsize=0)
V0408 11:54:05.441000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats uninteresting_files: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
I0408 11:54:06.738000 553124 site-packages/torch/_dynamo/eval_frame.py:398] TorchDynamo attempted to trace the following frames: [
I0408 11:54:06.738000 553124 site-packages/torch/_dynamo/eval_frame.py:398]
I0408 11:54:06.738000 553124 site-packages/torch/_dynamo/eval_frame.py:398] ]
I0408 11:54:06.739000 553124 site-packages/torch/_dynamo/utils.py:446] TorchDynamo compilation metrics:
I0408 11:54:06.739000 553124 site-packages/torch/_dynamo/utils.py:446] Function, Runtimes (s)
V0408 11:54:06.739000 553124 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats constrain_symbol_range: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:06.739000 553124 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats defer_runtime_assert: CacheInfo(hits=0, misses=0, maxsize=256, currsize=0)
V0408 11:54:06.739000 553124 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats evaluate_expr: CacheInfo(hits=0, misses=0, maxsize=256, currsize=0)
V0408 11:54:06.739000 553124 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats _simplify_floor_div: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:06.739000 553124 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats _maybe_guard_rel: CacheInfo(hits=0, misses=0, maxsize=256, currsize=0)
V0408 11:54:06.740000 553124 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats _find: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:06.740000 553124 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats has_hint: CacheInfo(hits=0, misses=0, maxsize=256, currsize=0)
V0408 11:54:06.740000 553124 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats size_hint: CacheInfo(hits=0, misses=0, maxsize=256, currsize=0)
V0408 11:54:06.740000 553124 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats simplify: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:06.740000 553124 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats _update_divisible: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:06.740000 553124 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats replace: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:06.740000 553124 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats _maybe_evaluate_static: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:06.740000 553124 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats get_implications: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:06.741000 553124 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats get_axioms: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:06.741000 553124 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats _maybe_evaluate_static_worker: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:06.741000 553124 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats safe_expand: CacheInfo(hits=0, misses=0, maxsize=256, currsize=0)
V0408 11:54:06.741000 553124 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats uninteresting_files: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
+1, also getting this error. I will try to resolve it; it seems to have started after I installed Triton and ran with --compile:
INFO: Started server process [3280]
INFO: Waiting for application startup.
2025-04-18 23:04:36.004 | INFO | fish_speech.models.text2semantic.inference:load_model:681 - Restored model from checkpoint
2025-04-18 23:04:36.005 | INFO | fish_speech.models.text2semantic.inference:load_model:687 - Using DualARTransformer
2025-04-18 23:04:36.006 | INFO | fish_speech.models.text2semantic.inference:load_model:695 - Compiling function...
2025-04-18 23:04:36.207 | INFO | tools.server.model_manager:load_llama_model:99 - LLAMA model loaded.
D:\Python\Python311\Lib\site-packages\vector_quantize_pytorch\vector_quantize_pytorch.py:445: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
@autocast(enabled = False)
D:\Python\Python311\Lib\site-packages\vector_quantize_pytorch\vector_quantize_pytorch.py:630: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
@autocast(enabled = False)
D:\Python\Python311\Lib\site-packages\vector_quantize_pytorch\finite_scalar_quantization.py:147: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
@autocast(enabled = False)
D:\Python\Python311\Lib\site-packages\vector_quantize_pytorch\lookup_free_quantization.py:209: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
@autocast(enabled = False)
2025-04-18 23:04:40.237 | INFO | fish_speech.models.vqgan.inference:load_model:46 - Loaded model: <All keys matched successfully>
2025-04-18 23:04:40.239 | INFO | tools.server.model_manager:load_decoder_model:107 - Decoder model loaded.
2025-04-18 23:04:40.263 | INFO | fish_speech.models.text2semantic.inference:generate_long:788 - Encoded text: Hello world.
2025-04-18 23:04:40.265 | INFO | fish_speech.models.text2semantic.inference:generate_long:806 - Generating sentence 1/1 of sample 1/1
0%| | 0/1023 [00:00<?, ?it/s]D:\Python\Python311\Lib\contextlib.py:105: FutureWarning: `torch.backends.cuda.sdp_kernel()` is deprecated. In the future, this context manager will be removed. Please see `torch.nn.attention.sdpa_kernel()` for the new context manager, with updated signature.
self.gen = func(*args, **kwds)
0%| | 0/1023 [01:07<?, ?it/s]
ERROR: Traceback (most recent call last):
File "D:\Python\Python311\Lib\site-packages\kui\asgi\lifespan.py", line 36, in __call__
await result
File "D:\2025\Call Center Agent X\fish-speech\tools\api_server.py", line 83, in initialize_app
app.state.model_manager = ModelManager(
^^^^^^^^^^^^^
File "D:\2025\Call Center Agent X\fish-speech\tools\server\model_manager.py", line 65, in __init__
self.warm_up(self.tts_inference_engine)
File "D:\2025\Call Center Agent X\fish-speech\tools\server\model_manager.py", line 121, in warm_up
list(inference(request, tts_inference_engine))
File "D:\2025\Call Center Agent X\fish-speech\tools\server\inference.py", line 25, in inference_wrapper
raise HTTPException(
baize.exceptions.HTTPException: (500, '\'backend=\\\'inductor\\\' raised:\\nPermissionError: [WinError 5] Access is denied: \\\'D:\\\\\\\\AppData\\\\\\\\Local\\\\\\\\Temp\\\\\\\\torchinductor_BabaWawa\\\\\\\\triton\\\\\\\\0\\\\\\\\tmp.eb6dec9b-548d-40f1-8919-66d91d1cf9cc\\\' -> \\\'D:\\\\\\\\AppData\\\\\\\\Local\\\\\\\\Temp\\\\\\\\torchinductor_BabaWawa\\\\\\\\triton\\\\\\\\0\\\\\\\\-ku0Qh8fVikssLWtdtHdeJHDiOsVPM7lsTZaqvCif38\\\'\\n\\nSet TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information\\n\\n\\nYou can suppress this exception and fall back to eager by setting:\\n import torch._dynamo\\n torch._dynamo.config.suppress_errors = True\\n\'')
ERROR: Application startup failed. Exiting.
This issue is stale because it has been open for 30 days with no activity.
This issue was closed because it has been inactive for 14 days since being marked as stale.