Newest Update seems to have broken webui/manage.py

Open gjnave opened this issue 11 months ago • 7 comments

Self Checks

  • [X] This template is only for bug reports. For questions, please visit Discussions.
  • [X] I have thoroughly reviewed the project documentation (installation, training, inference) but couldn't find information to solve my problem.
  • [X] I have searched for existing issues, including closed ones.
  • [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [X] [FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
  • [X] Please do not modify this template and fill in all required fields.

Cloud or Self Hosted

Self Hosted (Source)

Environment Details

Win 11

Steps to Reproduce

Everything worked fine up to the December 7 update. Now start.bat gives this error when running --infer: fish-speech\fish_speech\webui\manage.py': [Errno 2] No such file or directory

This is the last update that worked: https://github.com/fishaudio/fish-speech/tree/b951de3b724a0763a5f4f7fcbfda9849f4199e19/fish_speech

✔️ Expected Behavior

Should run the webui.

❌ Actual Behavior

Errors out with the missing webui/manage.py error.

gjnave avatar Jan 14 '25 20:01 gjnave

A recent PR caused the bug. You can use the released 1.5 version in the meantime. We will fix it in the next update.

Whale-Dolphin avatar Jan 16 '25 02:01 Whale-Dolphin

The problem still exists.

justlovemaki avatar Jan 29 '25 14:01 justlovemaki

Start WebUI Inference...
Debug: flags = --llama-checkpoint-path "checkpoints/fish-speech-1.5" --decoder-checkpoint-path "checkpoints/fish-speech-1.5/firefly-gan-vq-fsq-8x1024-21hz-generator.pth" --decoder-config-name firefly_gan_vq
C:\Local_AI\fish-speech\fishenv\env\python.exe: No module named tools.webui.__main__; 'tools.webui' is a package and cannot be directly executed

Next launch the page...
C:\Local_AI\fish-speech\fishenv\env\python: can't open file 'C:\\Local_AI\\fish-speech\\fish_speech\\webui\\manage.py': [Errno 2] No such file or directory
Press any key to continue . . .

sl33pyC01E avatar Feb 03 '25 05:02 sl33pyC01E

Please use release v1.5 for the Windows webui. We haven't fixed it yet because of the holiday and some other business, qwq.

Whale-Dolphin avatar Feb 03 '25 05:02 Whale-Dolphin

@Whale-Dolphin I appreciate the fast response

I did that and got it working, with a peculiar glitch

the output always leads with 'speakyip' regardless of what the input text is...

is this an easy fix due to a known reason or...

sl33pyC01E avatar Feb 03 '25 06:02 sl33pyC01E

Thank you for your feedback. This issue did exist in early 1.5 versions, but it should have been fixed. However, because we did not test on Windows, I'm not sure whether the fix works there. We'll track it.

Whale-Dolphin avatar Feb 03 '25 06:02 Whale-Dolphin

> Thank you for your feedback. This issue did exist in early 1.5 versions, but it should have been fixed. However, because we did not test on Windows, I'm not sure whether the fix works there. We'll track it.

I just installed 1.4 on Ubuntu and got the same glitch occurring in the webui. It always precedes outputs with speakip.

Edit: looking closely at this section of the encode_tokens function in tools.llama.generate:

    string = clean_text(string)
    string = f"<|im_start|>user\nSpeak: {string}<|im_end|><|im_start|>assistant\n"
Explanation:

string = clean_text(string): This line cleans the input string using the clean_text function (likely removing special characters or normalizing the text; we'd need to see the clean_text function definition to know exactly what it does).

string = f"<|im_start|>user\nSpeak: {string}<|im_end|><|im_start|>assistant\n": This line constructs the prompt that is fed to the model. It uses an f-string to format the input string into a specific structure:

<|im_start|>user\n: ChatML-style start-of-message token, followed by the "user" role and a newline.

Speak: {string}: Here it is! The hardcoded "Speak: " prefix is added directly in front of your input text.

<|im_end|>: End-of-message token. In this string it is followed directly by the next <|im_start|>, with no newline in between.

<|im_start|>assistant\n: Start-of-message token, followed by the "assistant" role and a newline.
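
To make the breakdown above concrete, here is a minimal standalone sketch of the same prompt construction. The clean_text below is a hypothetical stand-in (the real one in fish-speech may do more), and build_prompt with its prefix parameter is a name made up for illustration; it is not part of the project. It only shows where the "Speak: " text enters the prompt. Whether dropping the prefix actually removes the spoken artifact is untested, since the model may expect that prefix.

```python
def clean_text(text: str) -> str:
    # Hypothetical stand-in: only collapses whitespace.
    # The real clean_text in fish-speech may behave differently.
    return " ".join(text.split())


def build_prompt(text: str, prefix: str = "Speak: ") -> str:
    # Mirrors the f-string quoted from encode_tokens, with the
    # hardcoded prefix pulled out into a parameter for comparison.
    text = clean_text(text)
    return f"<|im_start|>user\n{prefix}{text}<|im_end|><|im_start|>assistant\n"


with_prefix = build_prompt("Hello world")
without_prefix = build_prompt("Hello world", prefix="")

# repr() makes the embedded newlines visible.
print(repr(with_prefix))
print(repr(without_prefix))
```

Comparing the two printed strings shows that the only difference is the literal "Speak: " in front of the user text, which matches the observation that the prefix does not depend on the input.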

sl33pyC01E avatar Feb 04 '25 02:02 sl33pyC01E

We decided to drop Windows development for a while because it is really hard to build. We recommend using Docker or WSL instead. We welcome any Windows package or further development, and we'll try to fix this once we have enough time. Sorry for the inconvenience.

Whale-Dolphin avatar Sep 21 '25 05:09 Whale-Dolphin