Rotem Dan
This is a common problem with Whisper: when it encounters silence or non-speech segments, it may hallucinate and start to repeat a token pattern, like: ``` Thanks for watching! Thanks...
Tokens are already decoded and displayed live during Whisper decoding, at least on the CLI. Getting Whisper to recognize in real-time (or at least near real-time) is possible. However: *...
Beam search would enable the decoder to consider multiple recognitions simultaneously. It is currently not a high priority, for several reasons: * The goal of the Whisper implementation is a good...
Add support for sentence templates, which split the output files according to sentence boundaries. For example: `echogarden speak-file text.txt /parts/[sentence].wav`.
Allow the user to pass a path to a custom VITS model that is not available through the package manager. Please let me know if you need this feature and I'll prioritize it!
Many additional issues, enhancements and ideas are listed on the [**task list document**](https://github.com/echogarden-project/echogarden/blob/main/docs/Tasklist.md). A large number of them were added long before the project was posted to GitHub (late...
Input validation error: `inputs` tokens + `max_new_tokens` must be
## Bug description When the mouse is located at the right edge of the viewport, the scroll bar is not responsive to clicks (pressing the left mouse button would...
### Version v22.9.0, v23.0.0 ### Platform ```text Windows 11 x64 Microsoft Windows NT 10.0.22631.0 x64 ``` ### Subsystem Buffer ### What steps will reproduce the bug? ```js const largeBuffer =...
### Self Checks - [X] This template is only for bug reports. For questions, please visit [Discussions](https://github.com/fishaudio/fish-speech/discussions). - [X] I have thoroughly reviewed the project documentation (installation, training, inference) but...