Newest Update seems to have broken webui/manage.py
Self Checks
- [X] This template is only for bug reports. For questions, please visit Discussions.
- [X] I have thoroughly reviewed the project documentation (installation, training, inference) but couldn't find information to solve my problem.
- [X] I have searched for existing issues, including closed ones.
- [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
- [X] [FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
- [X] Please do not modify this template and fill in all required fields.
Cloud or Self Hosted
Self Hosted (Source)
Environment Details
Win 11
Steps to Reproduce
Everything worked well up to the December 7 update. Now start.bat gives this error when running --infer fish-speech\fish_speech\webui\manage.py': [Errno 2] No such file or directory
This is the last update that worked: https://github.com/fishaudio/fish-speech/tree/b951de3b724a0763a5f4f7fcbfda9849f4199e19/fish_speech
✔️ Expected Behavior
Should run the WebUI.
❌ Actual Behavior
Errors out with a "no webui/manage.py" error.
A recent PR caused the bug; you can use the released .5 version in the meantime. We will fix it in the next update.
The problem still exists.
Start WebUI Inference...
Debug: flags = --llama-checkpoint-path "checkpoints/fish-speech-1.5" --decoder-checkpoint-path "checkpoints/fish-speech-1.5/firefly-gan-vq-fsq-8x1024-21hz-generator.pth" --decoder-config-name firefly_gan_vq
C:\Local_AI\fish-speech\fishenv\env\python.exe: No module named tools.webui.__main__; 'tools.webui' is a package and cannot be directly executed
Next launch the page...
C:\Local_AI\fish-speech\fishenv\env\python: can't open file 'C:\\Local_AI\\fish-speech\\fish_speech\\webui\\manage.py': [Errno 2] No such file or directory
Press any key to continue . . .
Please use release v1.5 for the Windows WebUI. We haven't fixed it yet because of the holidays and some other business qwq.
@Whale-Dolphin I appreciate the fast response
I did that and got it working, with a peculiar glitch
the output always leads with 'speakyip' regardless of what the input text is...
is this an easy fix due to a known reason or...
Thank you for your feedback. This issue did exist in the early 1.5 versions, but it should have been fixed. However, we did not test on Windows, so I'm not sure whether the fix works there. We'll track it.
I just installed 1.4 on Ubuntu and got the same glitch occurring in the webui. It always precedes outputs with speakip.
Edit: looking closely at this section of the encode_tokens function (in tools.llama.generate):
string = clean_text(string)
string = f"<|im_start|>user\nSpeak: {string}<|im_end|><|im_start|>assistant\n"
Explanation:
string = clean_text(string): This line cleans the input string using the clean_text function (likely removing special characters or normalizing the text - we'd need to see the clean_text function definition to know exactly what it does).
string = f"<|im_start|>user\nSpeak: {string}<|im_end|><|im_start|>assistant\n": This line is constructing the prompt that is fed to the model. It's using an f-string to format the input string into a specific structure:
<|im_start|>user\n: Token that opens a chat turn, here for the "user" role, followed by a newline.
Speak: {string}: Here it is! The hardcoded "Speak: " prefix is being added directly to your input text.
<|im_end|>: Token that closes the user turn. In this f-string it is not followed by a newline.
<|im_start|>assistant\n: Token that opens the "assistant" turn, followed by a newline.
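The prompt construction described above can be reproduced in isolation. This is a minimal sketch based only on the two lines quoted from encode_tokens: clean_text here is a hypothetical stand-in (the real function is not shown in this thread), and build_prompt with its prefix parameter is an illustrative name, not part of fish-speech.

```python
def clean_text(text: str) -> str:
    # Hypothetical stand-in: the real clean_text is not shown in this thread.
    # Here it only trims surrounding whitespace.
    return text.strip()


def build_prompt(text: str, prefix: str = "Speak: ") -> str:
    # Mirrors the quoted f-string; `prefix` is made a parameter (an assumption
    # for illustration) so the effect of the hardcoded "Speak: " is visible.
    text = clean_text(text)
    return f"<|im_start|>user\n{prefix}{text}<|im_end|><|im_start|>assistant\n"


if __name__ == "__main__":
    # With the default prefix, "Speak: " sits directly in front of the input text.
    print(repr(build_prompt("Hello there")))
    # With an empty prefix, the user turn contains only the input text.
    print(repr(build_prompt("Hello there", prefix="")))
```

Whether dropping the prefix actually removes the spoken "speak" artifact is untested here. If the model was trained with this prefix in the prompt, changing it may affect output quality.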
We decided to drop Windows development for a while because it is really hard to build; we recommend using Docker or WSL instead. We welcome any Windows packaging or further development! We'll try to fix it once we have enough time. Sorry for the inconvenience.