text-generation-webui
Support for real-time TTS!
Hello,
I've set up TGUI with the alltalk_tts extension locally and modified the setup to pass LLM replies to the extension as they are being generated (stream mode), enabling real-time (aka "incremental") TTS.
A PR for the extension side is in the backlog too; streaming TTS is working as expected locally. This PR covers the parts of TGUI I needed to adjust/extend to allow this to work smoothly.
Mainly, 2 changes were needed:
- Add an `output_modifier_stream` handler for extensions (currently works only in chat mode) as the enabler for streaming the LLM text to extensions
- Do the chat HTML updates structurally and "incrementally" ("diff" mode): only update what's needed using JS. This was needed because `audio` elements in the chat HTML were previously continuously re-rendered, which made audio streaming impossible
(the rest are miscellaneous changes: adding a llama3 instruction template and a commented-out line to allow remote debugging of TGUI)
Let me know what you think and btw, nice project! Thanks!
Checklist:
- [x] I have read the Contributing guidelines.
Please support instruct mode
@oobabooga please consider this 🙏
Hey @czuzu, would you consider making this for SillyTavern? Given that you listed only 2 things for it, I thought I'd just ask, if it's no trouble.
I gotta check the PR list more often, this is something I've needed for a while. Thank God textgen is open source and I can implement these changes on my own rig. Ty❤️❤️