
"Negative code found" error for short input texts.

twocode opened this issue 11 months ago

Self Checks

  • [X] This template is only for bug reports. For questions, please visit Discussions.
  • [X] I have thoroughly reviewed the project documentation (installation, training, inference) but couldn't find information to solve my problem.
  • [X] I have searched for existing issues, including closed ones.
  • [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [X] [FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
  • [X] Please do not modify this template and fill in all required fields.

Cloud or Self Hosted

Self Hosted (Docker)

Environment Details

Tesla T4

Steps to Reproduce

"python", "-m", "tools.api", \
"--listen", "0.0.0.0:8080", \
"--llama-checkpoint-path", "checkpoints/fish-speech-1.4", \
"--decoder-checkpoint-path", "checkpoints/fish-speech-1.5/firefly-gan-vq-fsq-8x1024-21hz-generator.pth", \
"--decoder-config-name", "firefly_gan_vq", \
"--compile", \
"--half" \

✔️ Expected Behavior

When the input is short, e.g. "hi." or "he.", the correct audio should be generated reliably.

❌ Actual Behavior

Generation randomly succeeds or fails for the same input. On failure, the error is AssertionError: Negative code found. Logs from several attempts follow: the first three short inputs ("hi.", "he.", "Hee") fail, while later ones ("what", "Oh .", "Hi.") succeed.

2025-01-04 15:59:09.779 | INFO     | tools.llama.generate:generate_long:759 - Encoded text: hi.
2025-01-04 15:59:09.779 | INFO     | tools.llama.generate:generate_long:777 - Generating sentence 1/1 of sample 1/1
  1%|          | 11/1023 [00:00<00:11, 87.47it/s]
2025-01-04 15:59:09.989 | INFO     | tools.llama.generate:generate_long:823 - Compilation time: 0.21 seconds
2025-01-04 15:59:09.989 | INFO     | tools.llama.generate:generate_long:832 - Generated 13 tokens in 0.21 seconds, 62.07 tokens/sec
2025-01-04 15:59:09.989 | INFO     | tools.llama.generate:generate_long:835 - Bandwidth achieved: 30.69 GB/s
2025-01-04 15:59:09.990 | INFO     | tools.llama.generate:generate_long:840 - GPU Memory used: 2.43 GB
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/kui/asgi/exceptions.py", line 27, in wrapper
    return await endpoint()
           ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kui/asgi/views.py", line 29, in wrapper
    return await function()
           ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kui/asgi/parameters.py", line 119, in callback_with_auto_bound_params
    result = await result
             ^^^^^^^^^^^^
  File "/opt/fish-speech/tools/api.py", line 756, in api_invoke_model
    fake_audios = next(inference(req))
                  ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 36, in generator_context
    response = gen.send(None)
               ^^^^^^^^^^^^^^
  File "/opt/fish-speech/tools/api.py", line 683, in inference
    raise result.response
  File "/opt/fish-speech/tools/llama/generate.py", line 904, in worker
    for chunk in generate_long(
                 ^^^^^^^^^^^^^^
  File "/opt/fish-speech/tools/llama/generate.py", line 848, in generate_long
    assert (codes >= 0).all(), f"Negative code found"
           ^^^^^^^^^^^^^^^^^^
AssertionError: Negative code found
INFO:     10.0.3.136:35684 - "POST /v1/tts HTTP/1.1" 500 Internal Server Error
2025-01-04 15:59:16.891 | INFO     | tools.api:inference:623 - Use same references
2025-01-04 15:59:16.894 | INFO     | tools.llama.generate:generate_long:759 - Encoded text: he.
2025-01-04 15:59:16.894 | INFO     | tools.llama.generate:generate_long:777 - Generating sentence 1/1 of sample 1/1
  0%|          | 2/1023 [00:00<00:15, 64.00it/s]
2025-01-04 15:59:17.010 | INFO     | tools.llama.generate:generate_long:823 - Compilation time: 0.12 seconds
2025-01-04 15:59:17.010 | INFO     | tools.llama.generate:generate_long:832 - Generated 4 tokens in 0.12 seconds, 34.65 tokens/sec
2025-01-04 15:59:17.010 | INFO     | tools.llama.generate:generate_long:835 - Bandwidth achieved: 17.13 GB/s
2025-01-04 15:59:17.011 | INFO     | tools.llama.generate:generate_long:840 - GPU Memory used: 2.43 GB
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/kui/asgi/exceptions.py", line 27, in wrapper
    return await endpoint()
           ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kui/asgi/views.py", line 29, in wrapper
    return await function()
           ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kui/asgi/parameters.py", line 119, in callback_with_auto_bound_params
    result = await result
             ^^^^^^^^^^^^
  File "/opt/fish-speech/tools/api.py", line 756, in api_invoke_model
    fake_audios = next(inference(req))
                  ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 36, in generator_context
    response = gen.send(None)
               ^^^^^^^^^^^^^^
  File "/opt/fish-speech/tools/api.py", line 683, in inference
    raise result.response
  File "/opt/fish-speech/tools/llama/generate.py", line 904, in worker
    for chunk in generate_long(
                 ^^^^^^^^^^^^^^
  File "/opt/fish-speech/tools/llama/generate.py", line 848, in generate_long
    assert (codes >= 0).all(), f"Negative code found"
           ^^^^^^^^^^^^^^^^^^
AssertionError: Negative code found
INFO:     10.0.3.136:45164 - "POST /v1/tts HTTP/1.1" 500 Internal Server Error
2025-01-04 15:59:21.064 | INFO     | tools.api:inference:623 - Use same references
2025-01-04 15:59:21.066 | INFO     | tools.llama.generate:generate_long:759 - Encoded text: Hee
2025-01-04 15:59:21.067 | INFO     | tools.llama.generate:generate_long:777 - Generating sentence 1/1 of sample 1/1
  1%|          | 12/1023 [00:00<00:11, 87.44it/s]
2025-01-04 15:59:21.289 | INFO     | tools.llama.generate:generate_long:823 - Compilation time: 0.22 seconds
2025-01-04 15:59:21.289 | INFO     | tools.llama.generate:generate_long:832 - Generated 14 tokens in 0.22 seconds, 63.06 tokens/sec
2025-01-04 15:59:21.289 | INFO     | tools.llama.generate:generate_long:835 - Bandwidth achieved: 31.18 GB/s
2025-01-04 15:59:21.290 | INFO     | tools.llama.generate:generate_long:840 - GPU Memory used: 2.43 GB
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/kui/asgi/exceptions.py", line 27, in wrapper
    return await endpoint()
           ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kui/asgi/views.py", line 29, in wrapper
    return await function()
           ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/kui/asgi/parameters.py", line 119, in callback_with_auto_bound_params
    result = await result
             ^^^^^^^^^^^^
  File "/opt/fish-speech/tools/api.py", line 756, in api_invoke_model
    fake_audios = next(inference(req))
                  ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 36, in generator_context
    response = gen.send(None)
               ^^^^^^^^^^^^^^
  File "/opt/fish-speech/tools/api.py", line 683, in inference
    raise result.response
  File "/opt/fish-speech/tools/llama/generate.py", line 904, in worker
    for chunk in generate_long(
                 ^^^^^^^^^^^^^^
  File "/opt/fish-speech/tools/llama/generate.py", line 848, in generate_long
    assert (codes >= 0).all(), f"Negative code found"
           ^^^^^^^^^^^^^^^^^^
AssertionError: Negative code found
INFO:     10.0.3.136:45178 - "POST /v1/tts HTTP/1.1" 500 Internal Server Error
2025-01-04 16:01:01.233 | INFO     | tools.api:inference:623 - Use same references
2025-01-04 16:01:01.236 | INFO     | tools.llama.generate:generate_long:759 - Encoded text: what
2025-01-04 16:01:01.236 | INFO     | tools.llama.generate:generate_long:777 - Generating sentence 1/1 of sample 1/1
  1%|▏         | 15/1023 [00:00<00:11, 89.25it/s]
2025-01-04 16:01:01.488 | INFO     | tools.llama.generate:generate_long:823 - Compilation time: 0.25 seconds
2025-01-04 16:01:01.489 | INFO     | tools.llama.generate:generate_long:832 - Generated 17 tokens in 0.25 seconds, 67.47 tokens/sec
2025-01-04 16:01:01.489 | INFO     | tools.llama.generate:generate_long:835 - Bandwidth achieved: 33.36 GB/s
2025-01-04 16:01:01.489 | INFO     | tools.llama.generate:generate_long:840 - GPU Memory used: 2.43 GB
2025-01-04 16:01:01.490 | INFO     | tools.api:decode_vq_tokens:189 - VQ features: torch.Size([8, 16])
INFO:     10.0.3.136:52978 - "POST /v1/tts HTTP/1.1" 200 OK
2025-01-04 16:01:10.114 | INFO     | tools.api:inference:623 - Use same references
2025-01-04 16:01:10.116 | INFO     | tools.llama.generate:generate_long:759 - Encoded text: Oh .
2025-01-04 16:01:10.117 | INFO     | tools.llama.generate:generate_long:777 - Generating sentence 1/1 of sample 1/1
  2%|▏         | 18/1023 [00:00<00:11, 89.46it/s]
2025-01-04 16:01:10.403 | INFO     | tools.llama.generate:generate_long:823 - Compilation time: 0.29 seconds
2025-01-04 16:01:10.403 | INFO     | tools.llama.generate:generate_long:832 - Generated 20 tokens in 0.29 seconds, 69.87 tokens/sec
2025-01-04 16:01:10.404 | INFO     | tools.llama.generate:generate_long:835 - Bandwidth achieved: 34.55 GB/s
2025-01-04 16:01:10.404 | INFO     | tools.llama.generate:generate_long:840 - GPU Memory used: 2.43 GB
2025-01-04 16:01:10.405 | INFO     | tools.api:decode_vq_tokens:189 - VQ features: torch.Size([8, 19])
INFO:     10.0.3.136:52980 - "POST /v1/tts HTTP/1.1" 200 OK
2025-01-04 16:01:22.751 | INFO     | tools.api:inference:623 - Use same references
2025-01-04 16:01:22.754 | INFO     | tools.llama.generate:generate_long:759 - Encoded text: Hi.
2025-01-04 16:01:22.755 | INFO     | tools.llama.generate:generate_long:777 - Generating sentence 1/1 of sample 1/1
  2%|▏         | 24/1023 [00:00<00:11, 90.80it/s]
2025-01-04 16:01:23.104 | INFO     | tools.llama.generate:generate_long:823 - Compilation time: 0.35 seconds
2025-01-04 16:01:23.104 | INFO     | tools.llama.generate:generate_long:832 - Generated 26 tokens in 0.35 seconds, 74.49 tokens/sec
2025-01-04 16:01:23.104 | INFO     | tools.llama.generate:generate_long:835 - Bandwidth achieved: 36.83 GB/s
2025-01-04 16:01:23.105 | INFO     | tools.llama.generate:generate_long:840 - GPU Memory used: 2.43 GB
2025-01-04 16:01:23.106 | INFO     | tools.api:decode_vq_tokens:189 - VQ features: torch.Size([8, 25])
INFO:     10.0.3.136:46966 - "POST /v1/tts HTTP/1.1" 200 OK
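
As a temporary client-side workaround, retrying the request usually succeeds within a few attempts, since the failure is non-deterministic for the same input. A sketch under the same payload assumptions as the reproduction above:

import requests

def tts_with_retry(text, url="http://localhost:8080/v1/tts", attempts=3):
    # Retry because the failure is intermittent: the same short input can
    # return 500 ("Negative code found") on one call and 200 on the next.
    resp = None
    for _ in range(attempts):
        resp = requests.post(url, json={"text": text}, timeout=60)
        if resp.ok:
            return resp.content  # audio bytes
    resp.raise_for_status()  # surface the last 500 if every attempt failed

audio = tts_with_retry("hi.")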

twocode avatar Jan 04 '25 16:01 twocode

I am seeing the same issue. Can anyone provide guidance on how to fix it?

David-19940718 avatar Jan 08 '25 06:01 David-19940718

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] avatar Feb 08 '25 00:02 github-actions[bot]

Fixed.

PoTaTo-Mika avatar Sep 22 '25 03:09 PoTaTo-Mika

Hi @PoTaTo-Mika, thanks for the follow-up. Could you let me know how it was fixed and which release carries the fix? Thanks.

twocode avatar Sep 22 '25 04:09 twocode