xtts-webui
Enhancement Suggestions
First, thank you for always working on this project, and sorry for bothering you again.
I wanted to suggest new enhancements.
The first one is for Whisper translation: could it support alignment? That is, automatically syncing the newly generated translated audio to the matching speech segments of the original, so that it can be used as an auto-dubbing tool?
Secondly, "Add the ability to customize speakers when batch processing" is already on the to-do list. Would it also be possible to support simple inline commands in the default input text window (not the batch process)? For example, speaker or advanced-setting prompts placed before each line:
{Adam, temp:0.75} How are you?
{Daniel, temp:0.5} Fine.
This would be a kind of live batch process that doesn't require creating separate text files, and it would be a wonderful QoL upgrade. Yes, we can already do this by manually splitting every paragraph into its own text file, but adding {speaker} before the relevant parts would be much easier.
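To make the proposed syntax concrete, here is a minimal Python sketch of how such tagged lines could be parsed. It is only an illustration of the idea, not xtts-webui code: the `Segment` class, the `parse_line` and `parse_script` helpers, and the rule that a bare word is the speaker while `key:value` pairs are numeric settings are all assumptions.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Assumed syntax: "{Speaker, key:value, ...} text to speak"
TAG_RE = re.compile(r"^\s*\{([^{}]*)\}\s*(.*)$")


@dataclass
class Segment:
    speaker: Optional[str]
    settings: Dict[str, float] = field(default_factory=dict)
    text: str = ""


def parse_line(line: str, default_speaker: Optional[str] = None) -> Segment:
    """Parse one input line; untagged lines fall back to the default speaker."""
    match = TAG_RE.match(line)
    if not match:
        return Segment(default_speaker, {}, line.strip())

    header, text = match.groups()
    speaker = default_speaker
    settings: Dict[str, float] = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        if ":" in part:
            # "temp:0.75" -> {"temp": 0.75}
            key, value = part.split(":", 1)
            settings[key.strip()] = float(value)
        else:
            # A bare word is treated as the speaker name.
            speaker = part
    return Segment(speaker, settings, text.strip())


def parse_script(script: str, default_speaker: Optional[str] = None) -> List[Segment]:
    """Parse every non-empty line of the input text window."""
    return [parse_line(line, default_speaker)
            for line in script.splitlines() if line.strip()]


if __name__ == "__main__":
    for seg in parse_script("{Adam, temp:0.75} How are you?\n{Daniel, temp:0.5} Fine."):
        print(seg)
```

Each resulting segment could then be passed to the synthesis step with its own speaker and settings, which is what would replace the separate text files.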
It would also be great to have these:
- Ability to add silences with prompts like {0.5s}
- Ability to split the output with prompts in the input text window, like {split} or similar
- A postprocess audio edit page to merge batch parts, with settings like silence generation
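As a rough sketch of the silence and split prompts, the following Python snippet breaks input text into separate outputs, each holding text chunks and silence durations. The `tokenize` function, the tuple layout, and the exact directive spelling (`{0.5s}`, `{split}`) are assumptions for illustration only, not existing xtts-webui behavior.

```python
import re
from typing import List, Tuple, Union

# Assumed directives: "{split}" starts a new output file,
# "{<seconds>s}" (e.g. "{0.5s}") inserts that much silence.
DIRECTIVE_RE = re.compile(r"\{\s*(split|\d+(?:\.\d+)?s)\s*\}")

Item = Tuple[str, Union[str, float]]


def tokenize(text: str) -> List[List[Item]]:
    """Return one list of items per output; items are ("text", str) or ("silence", seconds)."""
    outputs: List[List[Item]] = [[]]
    pos = 0
    for match in DIRECTIVE_RE.finditer(text):
        # Text between the previous directive and this one.
        chunk = text[pos:match.start()].strip()
        if chunk:
            outputs[-1].append(("text", chunk))

        directive = match.group(1)
        if directive == "split":
            outputs.append([])  # following items go to a new output
        else:
            outputs[-1].append(("silence", float(directive[:-1])))
        pos = match.end()

    tail = text[pos:].strip()
    if tail:
        outputs[-1].append(("text", tail))
    # Drop empty outputs, e.g. from a trailing "{split}".
    return [items for items in outputs if items]


if __name__ == "__main__":
    print(tokenize("Hello.{0.5s}How are you?{split}Fine."))
```

A synthesis loop could then generate audio for each text item, insert the requested number of seconds of silence, and write one file per output.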
Thank you so much for your great work!
Hi, yes I'm already working on the first point.
The second point is also interesting and something I already had in mind. At least adding different speakers is quite possible. For pauses, I have some work in progress that still needs to be tested.
If there is progress on any of these items, I will let you know :)