"Negative code found" error for short input texts.
Self Checks
- [X] This template is only for bug reports. For questions, please visit Discussions.
- [X] I have thoroughly reviewed the project documentation (installation, training, inference) but couldn't find information to solve my problem.
- [X] I have searched for existing issues, including closed ones.
- [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
- [X] [FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
- [X] Please do not modify this template and fill in all required fields.
Cloud or Self Hosted
Self Hosted (Docker)
Environment Details
Tesla T4
Steps to Reproduce
The API server is launched inside the container with:

```sh
python -m tools.api \
  --listen 0.0.0.0:8080 \
  --llama-checkpoint-path checkpoints/fish-speech-1.4 \
  --decoder-checkpoint-path checkpoints/fish-speech-1.5/firefly-gan-vq-fsq-8x1024-21hz-generator.pth \
  --decoder-config-name firefly_gan_vq \
  --compile \
  --half
```
✔️ Expected Behavior
When the input is short, like "hi" or "he", correct audio should be generated reliably.
❌ Actual Behavior
It randomly succeeds or fails. On failure, the request returns HTTP 500 with `AssertionError: Negative code found`. A minimal client sketch for sending these requests is below, followed by the server logs.
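For reference, requests were sent with a client along these lines. This is only a sketch: the JSON payload shape and host/port are assumptions on my part; some versions expect a msgpack-encoded request instead (see `tools/post_api.py`), so adjust to your deployment.

```python
# Minimal reproduction client (sketch, not the project's official client).
# Assumptions: server reachable at localhost:8080; /v1/tts accepts a JSON
# body with a "text" field (some versions expect msgpack instead).
import requests

URL = "http://localhost:8080/v1/tts"

# Short inputs taken from the logs below; failing requests return HTTP 500.
for text in ["hi.", "he.", "Hee", "what", "Oh .", "Hi."]:
    resp = requests.post(URL, json={"text": text})
    print(f"{text!r} -> {resp.status_code}")
```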
Server log for the failing requests:

```
2025-01-04 15:59:09.779 | INFO | tools.llama.generate:generate_long:759 - Encoded text: hi.
2025-01-04 15:59:09.779 | INFO | tools.llama.generate:generate_long:777 - Generating sentence 1/1 of sample 1/1
1%| | 11/1023 [00:00<00:11, 87.47it/s]
2025-01-04 15:59:09.989 | INFO | tools.llama.generate:generate_long:823 - Compilation time: 0.21 seconds
2025-01-04 15:59:09.989 | INFO | tools.llama.generate:generate_long:832 - Generated 13 tokens in 0.21 seconds, 62.07 tokens/sec
2025-01-04 15:59:09.989 | INFO | tools.llama.generate:generate_long:835 - Bandwidth achieved: 30.69 GB/s
2025-01-04 15:59:09.990 | INFO | tools.llama.generate:generate_long:840 - GPU Memory used: 2.43 GB
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/kui/asgi/exceptions.py", line 27, in wrapper
    return await endpoint()
           ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kui/asgi/views.py", line 29, in wrapper
    return await function()
           ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kui/asgi/parameters.py", line 119, in callback_with_auto_bound_params
    result = await result
             ^^^^^^^^^^^^
  File "/opt/fish-speech/tools/api.py", line 756, in api_invoke_model
    fake_audios = next(inference(req))
                  ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 36, in generator_context
    response = gen.send(None)
               ^^^^^^^^^^^^^^
  File "/opt/fish-speech/tools/api.py", line 683, in inference
    raise result.response
  File "/opt/fish-speech/tools/llama/generate.py", line 904, in worker
    for chunk in generate_long(
                 ^^^^^^^^^^^^^^
  File "/opt/fish-speech/tools/llama/generate.py", line 848, in generate_long
    assert (codes >= 0).all(), f"Negative code found"
           ^^^^^^^^^^^^^^^^^^
AssertionError: Negative code found
INFO: 10.0.3.136:35684 - "POST /v1/tts HTTP/1.1" 500 Internal Server Error
2025-01-04 15:59:16.891 | INFO | tools.api:inference:623 - Use same references
2025-01-04 15:59:16.894 | INFO | tools.llama.generate:generate_long:759 - Encoded text: he.
2025-01-04 15:59:16.894 | INFO | tools.llama.generate:generate_long:777 - Generating sentence 1/1 of sample 1/1
0%| | 2/1023 [00:00<00:15, 64.00it/s]
2025-01-04 15:59:17.010 | INFO | tools.llama.generate:generate_long:823 - Compilation time: 0.12 seconds
2025-01-04 15:59:17.010 | INFO | tools.llama.generate:generate_long:832 - Generated 4 tokens in 0.12 seconds, 34.65 tokens/sec
2025-01-04 15:59:17.010 | INFO | tools.llama.generate:generate_long:835 - Bandwidth achieved: 17.13 GB/s
2025-01-04 15:59:17.011 | INFO | tools.llama.generate:generate_long:840 - GPU Memory used: 2.43 GB
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/kui/asgi/exceptions.py", line 27, in wrapper
    return await endpoint()
           ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kui/asgi/views.py", line 29, in wrapper
    return await function()
           ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kui/asgi/parameters.py", line 119, in callback_with_auto_bound_params
    result = await result
             ^^^^^^^^^^^^
  File "/opt/fish-speech/tools/api.py", line 756, in api_invoke_model
    fake_audios = next(inference(req))
                  ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 36, in generator_context
    response = gen.send(None)
               ^^^^^^^^^^^^^^
  File "/opt/fish-speech/tools/api.py", line 683, in inference
    raise result.response
  File "/opt/fish-speech/tools/llama/generate.py", line 904, in worker
    for chunk in generate_long(
                 ^^^^^^^^^^^^^^
  File "/opt/fish-speech/tools/llama/generate.py", line 848, in generate_long
    assert (codes >= 0).all(), f"Negative code found"
           ^^^^^^^^^^^^^^^^^^
AssertionError: Negative code found
INFO: 10.0.3.136:45164 - "POST /v1/tts HTTP/1.1" 500 Internal Server Error
2025-01-04 15:59:21.064 | INFO | tools.api:inference:623 - Use same references
2025-01-04 15:59:21.066 | INFO | tools.llama.generate:generate_long:759 - Encoded text: Hee
2025-01-04 15:59:21.067 | INFO | tools.llama.generate:generate_long:777 - Generating sentence 1/1 of sample 1/1
1%| | 12/1023 [00:00<00:11, 87.44it/s]
2025-01-04 15:59:21.289 | INFO | tools.llama.generate:generate_long:823 - Compilation time: 0.22 seconds
2025-01-04 15:59:21.289 | INFO | tools.llama.generate:generate_long:832 - Generated 14 tokens in 0.22 seconds, 63.06 tokens/sec
2025-01-04 15:59:21.289 | INFO | tools.llama.generate:generate_long:835 - Bandwidth achieved: 31.18 GB/s
2025-01-04 15:59:21.290 | INFO | tools.llama.generate:generate_long:840 - GPU Memory used: 2.43 GB
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/kui/asgi/exceptions.py", line 27, in wrapper
    return await endpoint()
           ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kui/asgi/views.py", line 29, in wrapper
    return await function()
           ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kui/asgi/parameters.py", line 119, in callback_with_auto_bound_params
    result = await result
             ^^^^^^^^^^^^
  File "/opt/fish-speech/tools/api.py", line 756, in api_invoke_model
    fake_audios = next(inference(req))
                  ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 36, in generator_context
    response = gen.send(None)
               ^^^^^^^^^^^^^^
  File "/opt/fish-speech/tools/api.py", line 683, in inference
    raise result.response
  File "/opt/fish-speech/tools/llama/generate.py", line 904, in worker
    for chunk in generate_long(
                 ^^^^^^^^^^^^^^
  File "/opt/fish-speech/tools/llama/generate.py", line 848, in generate_long
    assert (codes >= 0).all(), f"Negative code found"
           ^^^^^^^^^^^^^^^^^^
AssertionError: Negative code found
INFO: 10.0.3.136:45178 - "POST /v1/tts HTTP/1.1" 500 Internal Server Error
```
For comparison, the same kind of short inputs sometimes succeed:

```
2025-01-04 16:01:01.233 | INFO | tools.api:inference:623 - Use same references
2025-01-04 16:01:01.236 | INFO | tools.llama.generate:generate_long:759 - Encoded text: what
2025-01-04 16:01:01.236 | INFO | tools.llama.generate:generate_long:777 - Generating sentence 1/1 of sample 1/1
1%|▏ | 15/1023 [00:00<00:11, 89.25it/s]
2025-01-04 16:01:01.488 | INFO | tools.llama.generate:generate_long:823 - Compilation time: 0.25 seconds
2025-01-04 16:01:01.489 | INFO | tools.llama.generate:generate_long:832 - Generated 17 tokens in 0.25 seconds, 67.47 tokens/sec
2025-01-04 16:01:01.489 | INFO | tools.llama.generate:generate_long:835 - Bandwidth achieved: 33.36 GB/s
2025-01-04 16:01:01.489 | INFO | tools.llama.generate:generate_long:840 - GPU Memory used: 2.43 GB
2025-01-04 16:01:01.490 | INFO | tools.api:decode_vq_tokens:189 - VQ features: torch.Size([8, 16])
INFO: 10.0.3.136:52978 - "POST /v1/tts HTTP/1.1" 200 OK
2025-01-04 16:01:10.114 | INFO | tools.api:inference:623 - Use same references
2025-01-04 16:01:10.116 | INFO | tools.llama.generate:generate_long:759 - Encoded text: Oh .
2025-01-04 16:01:10.117 | INFO | tools.llama.generate:generate_long:777 - Generating sentence 1/1 of sample 1/1
2%|▏ | 18/1023 [00:00<00:11, 89.46it/s]
2025-01-04 16:01:10.403 | INFO | tools.llama.generate:generate_long:823 - Compilation time: 0.29 seconds
2025-01-04 16:01:10.403 | INFO | tools.llama.generate:generate_long:832 - Generated 20 tokens in 0.29 seconds, 69.87 tokens/sec
2025-01-04 16:01:10.404 | INFO | tools.llama.generate:generate_long:835 - Bandwidth achieved: 34.55 GB/s
2025-01-04 16:01:10.404 | INFO | tools.llama.generate:generate_long:840 - GPU Memory used: 2.43 GB
2025-01-04 16:01:10.405 | INFO | tools.api:decode_vq_tokens:189 - VQ features: torch.Size([8, 19])
INFO: 10.0.3.136:52980 - "POST /v1/tts HTTP/1.1" 200 OK
2025-01-04 16:01:22.751 | INFO | tools.api:inference:623 - Use same references
2025-01-04 16:01:22.754 | INFO | tools.llama.generate:generate_long:759 - Encoded text: Hi.
2025-01-04 16:01:22.755 | INFO | tools.llama.generate:generate_long:777 - Generating sentence 1/1 of sample 1/1
2%|▏ | 24/1023 [00:00<00:11, 90.80it/s]
2025-01-04 16:01:23.104 | INFO | tools.llama.generate:generate_long:823 - Compilation time: 0.35 seconds
2025-01-04 16:01:23.104 | INFO | tools.llama.generate:generate_long:832 - Generated 26 tokens in 0.35 seconds, 74.49 tokens/sec
2025-01-04 16:01:23.104 | INFO | tools.llama.generate:generate_long:835 - Bandwidth achieved: 36.83 GB/s
2025-01-04 16:01:23.105 | INFO | tools.llama.generate:generate_long:840 - GPU Memory used: 2.43 GB
2025-01-04 16:01:23.106 | INFO | tools.api:decode_vq_tokens:189 - VQ features: torch.Size([8, 25])
INFO: 10.0.3.136:46966 - "POST /v1/tts HTTP/1.1" 200 OK
```
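As a temporary client-side stopgap (not a fix for the underlying assertion in `tools/llama/generate.py`), failed requests can simply be retried. A minimal sketch, under the same endpoint and payload assumptions as above:

```python
# Stopgap only: retry on HTTP 500 because the failure is intermittent.
# This does not fix the underlying "Negative code found" assertion;
# the endpoint and payload schema are assumptions, as noted above.
import time
import requests

def tts_with_retry(text, url="http://localhost:8080/v1/tts", attempts=3):
    last = None
    for i in range(attempts):
        last = requests.post(url, json={"text": text})
        if last.status_code == 200:
            return last.content  # audio bytes on success
        time.sleep(0.5 * (i + 1))  # brief backoff before retrying
    last.raise_for_status()  # all attempts failed: surface the last error

audio = tts_with_retry("hi.")
```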
The same issue exists in my case. Can anyone guide me on how to fix it?
This issue is stale because it has been open for 30 days with no activity.
Fixed.
Hi @PoTaTo-Mika, thanks for the follow-up. Could you let me know how it was fixed and which release carries the fix? Thanks.