PermissionError and Segmentation Fault with torch.compile (Inductor) during Model Warm-up when --compile is set
Self Checks
- [x] This template is only for bug reports. For questions, please visit Discussions.
- [x] I have thoroughly reviewed the project documentation (installation, training, inference) but couldn't find information to solve my problem.
- [x] I have searched for existing issues, including closed ones.
- [x] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
- [x] [FOR CHINESE USERS] Please submit issues in English, otherwise they will be closed. Thank you! :)
- [x] Please do not modify this template and fill in all required fields.
Cloud or Self Hosted
Self Hosted (Source)
Environment Details
- OS: WSL Ubuntu 22.04
- Python Version: 3.10.16
- PyTorch Version: 2.4.1+cu121
- CUDA Version: 11.5 (nvcc release 11.5, V11.5.119, Build cuda_11.5.r11.5/compiler.30672275_0)
- GPU: NVIDIA GeForce RTX 3060
- Fish-Speech Version: 1.5
- torch version: 2.4.1
- torchvision version: 0.19.1+cu121
- torchaudio version: 0.19.1+cu121
- torchtext version: 2.4.1+cu121
Steps to Reproduce
I have started the server with --compile successfully before and it worked fine, but I don't know why it doesn't work this time:
- Set up the fish-speech environment (including dependencies).
- Run the API server with the --compile flag:

python -m tools.api_server \
    --listen 0.0.0.0:7865 \
    --llama-checkpoint-path checkpoints/fish-speech-1.5 \
    --decoder-checkpoint-path checkpoints/fish-speech-1.5/firefly-gan-vq-fsq-8x1024-21hz-generator.pth \
    --decoder-config-name firefly_gan_vq \
    --compile
✔️ Expected Behavior
The application starts up successfully.
❌ Actual Behavior
I'm encountering a PermissionError followed by a segmentation fault when running the TTS model (fish-speech) with torch.compile using the Inductor backend. The error occurs during the model warm-up phase (model_manager.warm_up). The issue seems related to file access permissions in the temporary directory used by TorchInductor, and it persists even after attempts to adjust permissions and to use shutil.move as a workaround, as suggested in https://github.com/unslothai/unsloth/issues/1999.
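A related mitigation (a minimal sketch only, not verified to fix this WSL case) is to redirect both compile caches away from /tmp to a user-owned directory before torch is imported. TORCHINDUCTOR_CACHE_DIR and TRITON_CACHE_DIR are standard environment variables; the path below is just an example:

```python
# Hedged workaround sketch: point the Inductor and Triton caches at a
# user-owned directory. Set these before `import torch` to be safe.
import os
from pathlib import Path

cache_root = Path.home() / ".cache" / "torch_compile"
cache_root.mkdir(parents=True, exist_ok=True)
os.environ["TORCHINDUCTOR_CACHE_DIR"] = str(cache_root / "inductor")
os.environ["TRITON_CACHE_DIR"] = str(cache_root / "triton")
```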
Error Log:
Run as regular user:
(tts) koma@LAPTOP-UFED71OD:~/fish-speech$ python -m tools.api_server --listen 0.0.0.0:7865 --llama-checkpoint-path checkpoints/fish-speech-1.5 --decoder-checkpoint-path checkpoints/fish-speech-1.5/firefly-gan-vq-fsq-8x1024-21hz-generator.pth --decoder-config-name firefly_gan_vq --compile
INFO: Started server process [1330]
INFO: Waiting for application startup.
2025-03-23 11:28:20.578 | INFO | fish_speech.models.text2semantic.inference:load_model:681 - Restored model from checkpoint
2025-03-23 11:28:20.578 | INFO | fish_speech.models.text2semantic.inference:load_model:687 - Using DualARTransformer
2025-03-23 11:28:20.578 | INFO | fish_speech.models.text2semantic.inference:load_model:695 - Compiling function...
2025-03-23 11:28:22.062 | INFO | tools.server.model_manager:load_llama_model:99 - LLAMA model loaded.
/home/koma/.conda/envs/tts/lib/python3.10/site-packages/vector_quantize_pytorch/vector_quantize_pytorch.py:445: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
@autocast(enabled = False)
/home/koma/.conda/envs/tts/lib/python3.10/site-packages/vector_quantize_pytorch/vector_quantize_pytorch.py:630: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
@autocast(enabled = False)
/home/koma/.conda/envs/tts/lib/python3.10/site-packages/vector_quantize_pytorch/finite_scalar_quantization.py:147: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
@autocast(enabled = False)
/home/koma/.conda/envs/tts/lib/python3.10/site-packages/vector_quantize_pytorch/lookup_free_quantization.py:209: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
@autocast(enabled = False)
2025-03-23 11:28:27.765 | INFO | fish_speech.models.vqgan.inference:load_model:46 - Loaded model: <All keys matched successfully>
2025-03-23 11:28:27.766 | INFO | tools.server.model_manager:load_decoder_model:107 - Decoder model loaded.
2025-03-23 11:28:27.795 | INFO | fish_speech.models.text2semantic.inference:generate_long:788 - Encoded text: Hello world.
2025-03-23 11:28:27.797 | INFO | fish_speech.models.text2semantic.inference:generate_long:806 - Generating sentence 1/1 of sample 1/1
0%| | 0/1023 [00:00<?, ?it/s]/home/koma/.conda/envs/tts/lib/python3.10/contextlib.py:103: FutureWarning: `torch.backends.cuda.sdp_kernel()` is deprecated. In the future, this context manager will be removed. Please see `torch.nn.attention.sdpa_kernel()` for the new context manager, with updated signature.
self.gen = func(*args, **kwds)
W0323 11:30:52.749000 140116945266240 torch/fx/experimental/symbolic_shapes.py:4449] [0/0] xindex is not in var_ranges, defaulting to unknown range.
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.ttir.tmp.pid_1465_a4aea7d5-c578-456b-a4e9-f2c3298911a3 -> /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.ttir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.ttgir.tmp.pid_1465_15dc99d5-a380-438c-b1da-c05e1e98490f -> /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.ttgir
Segmentation fault (core dumped) /tmp/torchinductor_koma/lo/.1330.140116945266240.tmp -> /tmp/torchinductor_koma/lo/clogj3r7bsakyus6wz3yqefmjlhto65qemf2qfe4ns7mp52pxd6n.py
0%| | 0/1023 [02:41<?, ?it/s]
ERROR: Traceback (most recent call last):
File "/home/koma/.conda/envs/tts/lib/python3.10/site-packages/kui/asgi/lifespan.py", line 36, in __call__
await result
File "/home/koma/fish-speech/tools/api_server.py", line 82, in initialize_app
app.state.model_manager = ModelManager(
File "/home/koma/fish-speech/tools/server/model_manager.py", line 65, in __init__
self.warm_up(self.tts_inference_engine)
File "/home/koma/fish-speech/tools/server/model_manager.py", line 121, in warm_up
list(inference(request, tts_inference_engine))
File "/home/koma/fish-speech/tools/server/inference.py", line 25, in inference_wrapper
raise HTTPException(
baize.exceptions.HTTPException: (500, '\'backend=\\\'inductor\\\' raised:\\nPermissionError: [Errno 13] Permission denied: \\\'/tmp/torchinductor_koma/lo/.1330.140116945266240.tmp\\\' -> \\\'/tmp/torchinductor_koma/lo/clogj3r7bsakyus6wz3yqefmjlhto65qemf2qfe4ns7mp52pxd6n.py\\\'\\n\\nSet TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information\\n\\n\\nYou can suppress this exception and fall back to eager by setting:\\n import torch._dynamo\\n torch._dynamo.config.suppress_errors = True\\n\'')
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.llir.tmp.pid_1465_7cc0f06c-2609-4d20-a45e-ae625a161c39 -> /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.llir
ERROR: Application startup failed. Exiting.
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.ptx.tmp.pid_1465_aaab7612-5f9d-4669-bf5e-1f685a9eb20a -> /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.ptx
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.cubin.tmp.pid_1465_7e66aa04-7c5f-4624-a304-84f1c6297651 -> /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.cubin
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.json.tmp.pid_1465_c426a30f-3d7c-4382-ae50-3281c8bfc682 -> /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.json
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/__grp__triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.json.tmp.pid_1465_3c4ed8ff-5f40-46aa-bad2-06ea8a68ff97 -> /tmp/torchinductor_koma/triton/0/e6989009f1faa05ac5f7f5257fe2b85e8fead7015504203974aa7e1e4d122d7c/__grp__triton_red_fused__softmax_add_bmm_index_logical_not_masked_fill_zeros_like_8.json
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ttir.tmp.pid_1465_8b7a4a8b-f4fe-4a5d-b211-dcb7b802a116 -> /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ttir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ttgir.tmp.pid_1465_09c6789f-443a-4516-8364-6ed1dca9a77e -> /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ttgir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ttir.tmp.pid_1463_63dde67e-2ce1-4689-9dd6-201faa09fbbd -> /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ttir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ttgir.tmp.pid_1463_937aec16-747f-4c7d-ad14-03e631c2c8fb -> /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ttgir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.llir.tmp.pid_1465_c3a159f6-b382-4c1a-a151-ad6ec86f6930 -> /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.llir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ptx.tmp.pid_1465_b79ce8db-924e-43b3-8d93-efc4e1b15264 -> /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ptx
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.llir.tmp.pid_1463_ab90bde9-214a-4ff5-a32a-3d6a7ef50ced -> /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.llir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ptx.tmp.pid_1463_7a0e89a4-53ff-4fd9-835b-1a8e88d92ee6 -> /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ptx
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.cubin.tmp.pid_1463_2b7ccc2a-537c-4bc7-9c26-1428a6db4caa -> /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.cubin
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.json.tmp.pid_1463_f222d114-66c1-4eca-a03a-0bb33e9bd35d -> /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.json
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/__grp__triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.json.tmp.pid_1463_739c28d6-8df0-4417-b096-a23454ab5497 -> /tmp/torchinductor_koma/triton/0/3874684ef2364a59aab56e08e3ed32022564fa8382c2943dc87961046f181f38/__grp__triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.json
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.cubin.tmp.pid_1465_65ecf108-ed86-4aca-83ff-afb63f98f017 -> /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.cubin
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.json.tmp.pid_1465_87b5dcad-2b6f-4cf1-93ef-7e3748fde4cc -> /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.json
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/__grp__triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.json.tmp.pid_1465_82aa0069-a027-4475-be22-dcd0977f4a30 -> /tmp/torchinductor_koma/triton/0/bb6ccef264892b035b94d267793ba6f1e10d66cfd971c9abaca345ffab4a88d0/__grp__triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.json
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ttir.tmp.pid_1463_7efde184-dd6f-445f-bbbe-be559b7db26f -> /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ttir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ttgir.tmp.pid_1463_e4bcbad9-34eb-4fe8-a5ef-990efcb22576 -> /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ttgir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ttir.tmp.pid_1465_0fac9c6a-a91a-4b3c-ba58-42e239e93e9f -> /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ttir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ttgir.tmp.pid_1465_bc0f6074-6542-4b19-a7e6-0d09e123720f -> /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ttgir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.llir.tmp.pid_1463_1c8ec382-65e8-4066-b82c-ccdfec37d317 -> /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.llir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ptx.tmp.pid_1463_a6b78858-ec93-4366-b195-adadf53f3575 -> /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.ptx
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.llir.tmp.pid_1465_6ca938da-2740-4e83-9180-7f9e2199d222 -> /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.llir
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ptx.tmp.pid_1465_abae5c09-5d18-4d8b-ad58-52b30fc1b2b9 -> /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.ptx
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.cubin.tmp.pid_1463_058654c1-bc84-4853-84e0-46b42ac6931c -> /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.cubin
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.json.tmp.pid_1463_eeb3c09f-052e-422e-ad6e-981dc43232d4 -> /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.json
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/__grp__triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.json.tmp.pid_1463_10e1f3f6-e8af-4ce6-b46a-68061ee4355c -> /tmp/torchinductor_koma/triton/0/1cc3eba260b69750ae0548a7dc165b3318610eb430423fac402263c1ac4d2109/__grp__triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_46.json
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.cubin.tmp.pid_1465_9e493fbf-9a1f-4b56-b7b6-261a2b6bead1 -> /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.cubin
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.json.tmp.pid_1465_49237752-09ac-450d-b429-970e476f09af -> /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.json
Segmentation fault (core dumped) /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/__grp__triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.json.tmp.pid_1465_75cf45bf-0969-43d4-b9f1-c2107e319764 -> /tmp/torchinductor_koma/triton/0/edb0032f3eae2bf52320059fa36d9885afbf10dfab09fe554572050365c84131/__grp__triton_per_fused__softmax_add_index_logical_not_masked_fill_mul_zeros_41.json
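The HTTPException above carries Inductor's own suggestion for a temporary fallback. A minimal sketch of that workaround (it disables the torch.compile speedup and only masks the underlying PermissionError rather than fixing the cache permissions):

```python
# Taken from the error message's own suggestion: suppress Dynamo/Inductor
# errors and fall back to eager execution.
import torch._dynamo

torch._dynamo.config.suppress_errors = True
```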
Run with sudo:
(tts) koma@DESKTOP-O5IAPM2:~/fish-speech$ sudo /home/koma/.conda/envs/tts/bin/python -m tools.api_server --listen 0.0.0.0:7865 --llama-checkpoint-path checkpoints/fish-speech-1.5 --decoder-checkpoint-path checkpoints/fish-speech-1.5/firefly-gan-vq-fsq-8x1024-21hz-generator.pth --decoder-config-name firefly_gan_vq --compile
INFO: Started server process [8528]
INFO: Waiting for application startup.
2025-03-23 10:45:56.124 | INFO | fish_speech.models.text2semantic.inference:load_model:681 - Restored model from checkpoint
2025-03-23 10:45:56.124 | INFO | fish_speech.models.text2semantic.inference:load_model:687 - Using DualARTransformer
2025-03-23 10:45:56.124 | INFO | fish_speech.models.text2semantic.inference:load_model:695 - Compiling function...
2025-03-23 10:45:56.842 | INFO | tools.server.model_manager:load_llama_model:99 - LLAMA model loaded.
Traceback (most recent call last):
File "/home/koma/.conda/envs/tts/lib/python3.10/site-packages/torch/_inductor/compile_worker/__main__.py", line 7, in <module>
from torch._inductor.async_compile import pre_fork_setup
File "/home/koma/.conda/envs/tts/lib/python3.10/site-packages/torch/__init__.py", line 2263, in <module>
_logging._init_logs()
File "/home/koma/.conda/envs/tts/lib/python3.10/site-packages/torch/_logging/_internal.py", line 884, in _init_logs
_update_log_state_from_env()
File "/home/koma/.conda/envs/tts/lib/python3.10/site-packages/torch/_logging/_internal.py", line 716, in _update_log_state_from_env
log_state = _parse_log_settings(log_setting)
File "/home/koma/.conda/envs/tts/lib/python3.10/site-packages/torch/_logging/_internal.py", line 660, in _parse_log_settings
raise ValueError(_invalid_settings_err_msg(settings))
ValueError:
Invalid log settings: torch._dynamo=DEBUG, must be a comma separated list of fully
qualified module names, registered log names or registered artifact names.
For more info on various settings, try TORCH_LOGS="help"
Valid settings:
all, dynamo, aot, autograd, inductor, dynamic, torch, distributed, c10d, ddp, pp, fsdp, onnx, export, aot_graphs, graph_sizes, bytecode, graph_code, not_implemented, custom_format_test_artifact, graph_breaks, cudagraphs, kernel_code, fusion, recompiles, output_code, onnx_diagnostics, recompiles_verbose, trace_bytecode, compiled_autograd, schedule, trace_source, overlap, perf_hints, trace_call, sym_node, ddp_graphs, verbose_guards, graph, compiled_autograd_verbose, guards, aot_joint_graph, post_grad_graphs
/home/koma/.conda/envs/tts/lib/python3.10/site-packages/vector_quantize_pytorch/vector_quantize_pytorch.py:445: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
@autocast(enabled = False)
/home/koma/.conda/envs/tts/lib/python3.10/site-packages/vector_quantize_pytorch/vector_quantize_pytorch.py:630: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
@autocast(enabled = False)
/home/koma/.conda/envs/tts/lib/python3.10/site-packages/vector_quantize_pytorch/finite_scalar_quantization.py:147: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
@autocast(enabled = False)
/home/koma/.conda/envs/tts/lib/python3.10/site-packages/vector_quantize_pytorch/lookup_free_quantization.py:209: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
@autocast(enabled = False)
2025-03-23 10:45:57.498 | INFO | fish_speech.models.vqgan.inference:load_model:46 - Loaded model: <All keys matched successfully>
2025-03-23 10:45:57.499 | INFO | tools.server.model_manager:load_decoder_model:107 - Decoder model loaded.
2025-03-23 10:45:57.511 | INFO | fish_speech.models.text2semantic.inference:generate_long:788 - Encoded text: Hello world.
2025-03-23 10:45:57.511 | INFO | fish_speech.models.text2semantic.inference:generate_long:806 - Generating sentence 1/1 of sample 1/1
0%| | 0/1023 [00:00<?, ?it/s]/home/koma/.conda/envs/tts/lib/python3.10/contextlib.py:103: FutureWarning: `torch.backends.cuda.sdp_kernel()` is deprecated. In the future, this context manager will be removed. Please see `torch.nn.attention.sdpa_kernel()` for the new context manager, with updated signature.
self.gen = func(*args, **kwds)
W0323 10:46:39.124000 139948787234368 torch/fx/experimental/symbolic_shapes.py:4449] [0/0] xindex is not in var_ranges, defaulting to unknown range.
Segmentation fault (core dumped) /tmp/torchinductor_root/q7/.8528.139948787234368.tmp -> /tmp/torchinductor_root/q7/cq7aqs2ot34rpqjm36euezlogdt6eptsfb2ihhipmgx4f3prrecf.py
0%| | 0/1023 [00:47<?, ?it/s]
ERROR: Traceback (most recent call last):
File "/home/koma/.conda/envs/tts/lib/python3.10/site-packages/kui/asgi/lifespan.py", line 36, in __call__
await result
File "/home/koma/fish-speech/tools/api_server.py", line 100, in initialize_app
app.state.model_manager = ModelManager(
File "/home/koma/fish-speech/tools/server/model_manager.py", line 65, in __init__
self.warm_up(self.tts_inference_engine)
File "/home/koma/fish-speech/tools/server/model_manager.py", line 121, in warm_up
list(inference(request, tts_inference_engine))
File "/home/koma/fish-speech/tools/server/inference.py", line 36, in inference_wrapper
raise HTTPException(
baize.exceptions.HTTPException: (500, '\'backend=\\\'inductor\\\' raised:\\nPermissionError: [Errno 13] Permission denied: \\\'/tmp/torchinductor_root/q7/.8528.139948787234368.tmp\\\' -> \\\'/tmp/torchinductor_root/q7/cq7aqs2ot34rpqjm36euezlogdt6eptsfb2ihhipmgx4f3prrecf.py\\\'\\n\\nSet TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information\\n\'')
ERROR: Application startup failed. Exiting.
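Note that the sudo run also triggers a ValueError because TORCH_LOGS was set to torch._dynamo=DEBUG, which is not a registered log name; the error text itself suggests TORCH_LOGS="+dynamo". A rough Python equivalent (a sketch assuming PyTorch 2.4's private torch._logging API, whose behavior may vary across versions):

```python
# Roughly equivalent to TORCH_LOGS="+dynamo" per the error message's hint.
# torch._logging is a private API, so treat this as a sketch.
import logging

import torch._logging

torch._logging.set_logs(dynamo=logging.DEBUG)
```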
This looks like a conflict between torch and Windows/WSL? I'll try the same versions to see whether it is a common problem. Could you tell me more about your CUDA setup, e.g., is your NVIDIA driver version 535 or 550?
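For reference, the torch-side build info can be printed with a few lines (a small sketch; the driver version itself, e.g. 535 vs 550, is reported by nvidia-smi):

```python
# Collect the torch/CUDA versions relevant to this report.
import torch

print(torch.__version__)    # e.g. 2.4.1+cu121
print(torch.version.cuda)   # CUDA runtime the wheel was built against
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```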
I have encountered a similar situation. How can I fix it?
python -m tools.api_server --listen 0.0.0.0:7865 --llama-checkpoint-path checkpoints/fish-speech-1.5 --decoder-checkpoint-path checkpoints/fish-speech-1.5/firefly-gan-vq-fsq-8x1024-21hz-generator.pth --decoder-config-name firefly_gan_vq --compile
INFO: Started server process [552993]
INFO: Waiting for application startup.
2025-04-08 11:54:02.594 | INFO | fish_speech.models.text2semantic.inference:load_model:678 - Restored model from checkpoint
2025-04-08 11:54:02.594 | INFO | fish_speech.models.text2semantic.inference:load_model:684 - Using DualARTransformer
2025-04-08 11:54:02.594 | INFO | fish_speech.models.text2semantic.inference:load_model:692 - Compiling function...
2025-04-08 11:54:02.613 | INFO | tools.server.model_manager:load_llama_model:99 - LLAMA model loaded.
/root/miniconda3/envs/fish-speechv2/lib/python3.10/site-packages/vector_quantize_pytorch/vector_quantize_pytorch.py:445: FutureWarning: torch.cuda.amp.autocast(args...) is deprecated. Please use torch.amp.autocast('cuda', args...) instead.
@autocast(enabled = False)
/root/miniconda3/envs/fish-speechv2/lib/python3.10/site-packages/vector_quantize_pytorch/vector_quantize_pytorch.py:630: FutureWarning: torch.cuda.amp.autocast(args...) is deprecated. Please use torch.amp.autocast('cuda', args...) instead.
@autocast(enabled = False)
/root/miniconda3/envs/fish-speechv2/lib/python3.10/site-packages/vector_quantize_pytorch/finite_scalar_quantization.py:147: FutureWarning: torch.cuda.amp.autocast(args...) is deprecated. Please use torch.amp.autocast('cuda', args...) instead.
@autocast(enabled = False)
/root/miniconda3/envs/fish-speechv2/lib/python3.10/site-packages/vector_quantize_pytorch/lookup_free_quantization.py:209: FutureWarning: torch.cuda.amp.autocast(args...) is deprecated. Please use torch.amp.autocast('cuda', args...) instead.
@autocast(enabled = False)
2025-04-08 11:54:03.587 | INFO | fish_speech.models.vqgan.inference:load_model:46 - Loaded model: <All keys matched successfully>
2025-04-08 11:54:03.588 | INFO | tools.server.model_manager:load_decoder_model:107 - Decoder model loaded.
2025-04-08 11:54:03.598 | INFO | fish_speech.models.text2semantic.inference:generate_long:785 - Encoded text: Hello world.
2025-04-08 11:54:03.598 | INFO | fish_speech.models.text2semantic.inference:generate_long:803 - Generating sentence 1/1 of sample 1/1
0%| | 0/1023 [00:00<?, ?it/s]/root/miniconda3/envs/fish-speechv2/lib/python3.10/contextlib.py:103: FutureWarning: torch.backends.cuda.sdp_kernel() is deprecated. In the future, this context manager will be removed. Please see torch.nn.attention.sdpa_kernel() for the new context manager, with updated signature.
self.gen = func(*args, **kwds)
V0408 11:54:05.397000 552993 site-packages/torch/_dynamo/convert_frame.py:1345] skipping: _is_skip_guard_eval_unsafe_stance (reason: in skipfiles, file: /root/miniconda3/envs/fish-speechv2/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py)
I0408 11:54:05.399000 552993 site-packages/torch/_dynamo/utils.py:1162] [0/0] ChromiumEventLogger initialized with id 08d0dda0-46f9-477c-8710-febad32f5c11
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] torchdynamo start compiling decode_one_token_ar /bigdata/xiaozhi/tts/fish-speechv2/fish_speech/models/text2semantic/inference.py:249, stack (elided 4 frames):
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] File "/root/miniconda3/envs/fish-speechv2/lib/python3.10/threading.py", line 973, in _bootstrap
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] self._bootstrap_inner()
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] File "/root/miniconda3/envs/fish-speechv2/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] self.run()
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] File "/root/miniconda3/envs/fish-speechv2/lib/python3.10/threading.py", line 953, in run
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] self._target(*self._args, **self._kwargs)
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] File "/bigdata/xiaozhi/tts/fish-speechv2/fish_speech/models/text2semantic/inference.py", line 928, in worker
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] for chunk in generate_long(
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] File "/bigdata/xiaozhi/tts/fish-speechv2/fish_speech/models/text2semantic/inference.py", line 837, in generate_long
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] y = generate(
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] File "/root/miniconda3/envs/fish-speechv2/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] return func(*args, **kwargs)
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] File "/root/miniconda3/envs/fish-speechv2/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] return func(*args, **kwargs)
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] File "/bigdata/xiaozhi/tts/fish-speechv2/fish_speech/models/text2semantic/inference.py", line 458, in generate
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] x = decode_n_tokens(
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] File "/bigdata/xiaozhi/tts/fish-speechv2/fish_speech/models/text2semantic/inference.py", line 378, in decode_n_tokens
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0] next_token = decode_one_token(
V0408 11:54:05.400000 552993 site-packages/torch/_dynamo/convert_frame.py:930] [0/0]
I0408 11:54:05.403000 552993 site-packages/torch/_dynamo/symbolic_convert.py:2706] [0/0] Step 1: torchdynamo start tracing decode_one_token_ar /bigdata/xiaozhi/tts/fish-speechv2/fish_speech/models/text2semantic/inference.py:249
I0408 11:54:05.403000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:3192] [0/0] create_env
V0408 11:54:05.406000 552993 site-packages/torch/_dynamo/symbolic_convert.py:932] [0/0] [__trace_source] TRACE starts_line /bigdata/xiaozhi/tts/fish-speechv2/fish_speech/models/text2semantic/inference.py:258 in decode_one_token_ar (decode_one_token_ar)
V0408 11:54:05.406000 552993 site-packages/torch/_dynamo/symbolic_convert.py:932] [0/0] [__trace_source] torch.compiler.cudagraph_mark_step_begin()
V0408 11:54:05.425000 552993 site-packages/torch/_dynamo/symbolic_convert.py:955] [0/0] [__trace_bytecode] TRACE LOAD_GLOBAL torch []
V0408 11:54:05.426000 552993 site-packages/torch/_dynamo/symbolic_convert.py:955] [0/0] [__trace_bytecode] TRACE LOAD_ATTR compiler [PythonModuleVariable(<module 'torch' from '/root/miniconda3/envs/fish-speechv2/lib/python3.10/site-packages/torch/__init__.py'>)]
V0408 11:54:05.426000 552993 site-packages/torch/_dynamo/symbolic_convert.py:955] [0/0] [__trace_bytecode] TRACE LOAD_ATTR cudagraph_mark_step_begin [PythonModuleVariable(<module 'torch.compiler' from '/root/miniconda3/envs/fish-speechv2/lib/python3.10/site-packages/torch/compiler/__init__.py'>)]
V0408 11:54:05.428000 552993 site-packages/torch/_dynamo/symbolic_convert.py:955] [0/0] [__trace_bytecode] TRACE CALL_FUNCTION 0 [SkipFunctionVariable()]
V0408 11:54:05.428000 552993 site-packages/torch/_dynamo/symbolic_convert.py:973] [0/0] empty checkpoint
0%| | 0/1023 [00:00<?, ?it/s]
ERROR: Traceback (most recent call last):
File "/root/miniconda3/envs/fish-speechv2/lib/python3.10/site-packages/kui/asgi/lifespan.py", line 36, in call
await result
File "/bigdata/xiaozhi/tts/fish-speechv2/tools/api_server.py", line 83, in initialize_app
app.state.model_manager = ModelManager(
File "/bigdata/xiaozhi/tts/fish-speechv2/tools/server/model_manager.py", line 65, in init
self.warm_up(self.tts_inference_engine)
File "/bigdata/xiaozhi/tts/fish-speechv2/tools/server/model_manager.py", line 121, in warm_up
list(inference(request, tts_inference_engine))
File "/bigdata/xiaozhi/tts/fish-speechv2/tools/server/inference.py", line 25, in inference_wrapper
raise HTTPException(
baize.exceptions.HTTPException: (500, '\'skip function cudagraph_mark_step_begin in file /root/miniconda3/envs/fish-speechv2/lib/python3.10/site-packages/torch/compiler/__init__.py\'\n\nfrom user code:\n File "/bigdata/xiaozhi/tts/fish-speechv2/fish_speech/models/text2semantic/inference.py", line 258, in decode_one_token_ar\n torch.compiler.cudagraph_mark_step_begin()\n\n\nYou can suppress this exception and fall back to eager by setting:\n import torch._dynamo\n torch._dynamo.config.suppress_errors = True\n')
ERROR: Application startup failed. Exiting.
I0408 11:54:05.436000 552993 site-packages/torch/_dynamo/eval_frame.py:398] TorchDynamo attempted to trace the following frames: [
I0408 11:54:05.436000 552993 site-packages/torch/_dynamo/eval_frame.py:398] * decode_one_token_ar /bigdata/xiaozhi/tts/fish-speechv2/fish_speech/models/text2semantic/inference.py:249
I0408 11:54:05.436000 552993 site-packages/torch/_dynamo/eval_frame.py:398] ]
I0408 11:54:05.437000 552993 site-packages/torch/_dynamo/utils.py:446] TorchDynamo compilation metrics:
I0408 11:54:05.437000 552993 site-packages/torch/_dynamo/utils.py:446] Function, Runtimes (s)
I0408 11:54:05.437000 552993 site-packages/torch/_dynamo/utils.py:446] _compile.compile_inner, 0.0277
V0408 11:54:05.437000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats constrain_symbol_range: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:05.437000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats defer_runtime_assert: CacheInfo(hits=0, misses=0, maxsize=256, currsize=0)
V0408 11:54:05.437000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats evaluate_expr: CacheInfo(hits=0, misses=0, maxsize=256, currsize=0)
V0408 11:54:05.438000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats _simplify_floor_div: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:05.438000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats _maybe_guard_rel: CacheInfo(hits=0, misses=0, maxsize=256, currsize=0)
V0408 11:54:05.438000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats _find: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:05.438000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats has_hint: CacheInfo(hits=0, misses=0, maxsize=256, currsize=0)
V0408 11:54:05.439000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats size_hint: CacheInfo(hits=0, misses=0, maxsize=256, currsize=0)
V0408 11:54:05.439000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats simplify: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:05.439000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats _update_divisible: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:05.439000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats replace: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:05.439000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats _maybe_evaluate_static: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:05.440000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats get_implications: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:05.440000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats get_axioms: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:05.440000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats _maybe_evaluate_static_worker: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:05.440000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats safe_expand: CacheInfo(hits=0, misses=0, maxsize=256, currsize=0)
V0408 11:54:05.441000 552993 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats uninteresting_files: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
I0408 11:54:06.738000 553124 site-packages/torch/_dynamo/eval_frame.py:398] TorchDynamo attempted to trace the following frames: [
I0408 11:54:06.738000 553124 site-packages/torch/_dynamo/eval_frame.py:398]
I0408 11:54:06.738000 553124 site-packages/torch/_dynamo/eval_frame.py:398] ]
I0408 11:54:06.739000 553124 site-packages/torch/_dynamo/utils.py:446] TorchDynamo compilation metrics:
I0408 11:54:06.739000 553124 site-packages/torch/_dynamo/utils.py:446] Function, Runtimes (s)
V0408 11:54:06.739000 553124 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats constrain_symbol_range: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:06.739000 553124 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats defer_runtime_assert: CacheInfo(hits=0, misses=0, maxsize=256, currsize=0)
V0408 11:54:06.739000 553124 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats evaluate_expr: CacheInfo(hits=0, misses=0, maxsize=256, currsize=0)
V0408 11:54:06.739000 553124 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats _simplify_floor_div: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:06.739000 553124 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats _maybe_guard_rel: CacheInfo(hits=0, misses=0, maxsize=256, currsize=0)
V0408 11:54:06.740000 553124 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats _find: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:06.740000 553124 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats has_hint: CacheInfo(hits=0, misses=0, maxsize=256, currsize=0)
V0408 11:54:06.740000 553124 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats size_hint: CacheInfo(hits=0, misses=0, maxsize=256, currsize=0)
V0408 11:54:06.740000 553124 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats simplify: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:06.740000 553124 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats _update_divisible: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:06.740000 553124 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats replace: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:06.740000 553124 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats _maybe_evaluate_static: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:06.740000 553124 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats get_implications: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:06.741000 553124 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats get_axioms: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:06.741000 553124 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats _maybe_evaluate_static_worker: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
V0408 11:54:06.741000 553124 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats safe_expand: CacheInfo(hits=0, misses=0, maxsize=256, currsize=0)
V0408 11:54:06.741000 553124 site-packages/torch/fx/experimental/symbolic_shapes.py:172] lru_cache_stats uninteresting_files: CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
+1, also getting this error. I will try to resolve it; it seems to have started after I installed Triton and ran with --compile:
INFO: Started server process [3280]
INFO: Waiting for application startup.
2025-04-18 23:04:36.004 | INFO | fish_speech.models.text2semantic.inference:load_model:681 - Restored model from checkpoint
2025-04-18 23:04:36.005 | INFO | fish_speech.models.text2semantic.inference:load_model:687 - Using DualARTransformer
2025-04-18 23:04:36.006 | INFO | fish_speech.models.text2semantic.inference:load_model:695 - Compiling function...
2025-04-18 23:04:36.207 | INFO | tools.server.model_manager:load_llama_model:99 - LLAMA model loaded.
D:\Python\Python311\Lib\site-packages\vector_quantize_pytorch\vector_quantize_pytorch.py:445: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
@autocast(enabled = False)
D:\Python\Python311\Lib\site-packages\vector_quantize_pytorch\vector_quantize_pytorch.py:630: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
@autocast(enabled = False)
D:\Python\Python311\Lib\site-packages\vector_quantize_pytorch\finite_scalar_quantization.py:147: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
@autocast(enabled = False)
D:\Python\Python311\Lib\site-packages\vector_quantize_pytorch\lookup_free_quantization.py:209: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
@autocast(enabled = False)
2025-04-18 23:04:40.237 | INFO | fish_speech.models.vqgan.inference:load_model:46 - Loaded model: <All keys matched successfully>
2025-04-18 23:04:40.239 | INFO | tools.server.model_manager:load_decoder_model:107 - Decoder model loaded.
2025-04-18 23:04:40.263 | INFO | fish_speech.models.text2semantic.inference:generate_long:788 - Encoded text: Hello world.
2025-04-18 23:04:40.265 | INFO | fish_speech.models.text2semantic.inference:generate_long:806 - Generating sentence 1/1 of sample 1/1
0%| | 0/1023 [00:00<?, ?it/s]D:\Python\Python311\Lib\contextlib.py:105: FutureWarning: `torch.backends.cuda.sdp_kernel()` is deprecated. In the future, this context manager will be removed. Please see `torch.nn.attention.sdpa_kernel()` for the new context manager, with updated signature.
self.gen = func(*args, **kwds)
0%| | 0/1023 [01:07<?, ?it/s]
ERROR: Traceback (most recent call last):
File "D:\Python\Python311\Lib\site-packages\kui\asgi\lifespan.py", line 36, in __call__
await result
File "D:\2025\Call Center Agent X\fish-speech\tools\api_server.py", line 83, in initialize_app
app.state.model_manager = ModelManager(
^^^^^^^^^^^^^
File "D:\2025\Call Center Agent X\fish-speech\tools\server\model_manager.py", line 65, in __init__
self.warm_up(self.tts_inference_engine)
File "D:\2025\Call Center Agent X\fish-speech\tools\server\model_manager.py", line 121, in warm_up
list(inference(request, tts_inference_engine))
File "D:\2025\Call Center Agent X\fish-speech\tools\server\inference.py", line 25, in inference_wrapper
raise HTTPException(
baize.exceptions.HTTPException: (500, '\'backend=\\\'inductor\\\' raised:\\nPermissionError: [WinError 5] Access is denied: \\\'D:\\\\\\\\AppData\\\\\\\\Local\\\\\\\\Temp\\\\\\\\torchinductor_BabaWawa\\\\\\\\triton\\\\\\\\0\\\\\\\\tmp.eb6dec9b-548d-40f1-8919-66d91d1cf9cc\\\' -> \\\'D:\\\\\\\\AppData\\\\\\\\Local\\\\\\\\Temp\\\\\\\\torchinductor_BabaWawa\\\\\\\\triton\\\\\\\\0\\\\\\\\-ku0Qh8fVikssLWtdtHdeJHDiOsVPM7lsTZaqvCif38\\\'\\n\\nSet TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information\\n\\n\\nYou can suppress this exception and fall back to eager by setting:\\n import torch._dynamo\\n torch._dynamo.config.suppress_errors = True\\n\'')
ERROR: Application startup failed. Exiting.
This issue is stale because it has been open for 30 days with no activity.
This issue was closed because it has been inactive for 14 days since being marked as stale.