xtts-webui

Enhancement Suggestions

Open GalenMarek14 opened this issue 1 year ago • 1 comment

First, thank you for always working on this project, and sorry for bothering you again.

I wanted to suggest new enhancements.

The first one is for Whisper translation: could it also handle alignment? That is, automatically sync the newly generated translated audio with the matching part of the original speech, so that this could be used as an auto-dubbing tool?
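To make the alignment idea concrete, here is a rough sketch under stated assumptions. Whisper's transcription result gives each segment a start and end time, so each translated clip could be fitted into the slot of the segment it replaces. The function `fit_to_segment` and the `max_speedup` limit are hypothetical names for illustration, not anything that exists in xtts-webui.

```python
def fit_to_segment(seg_start, seg_end, clip_duration, max_speedup=1.5):
    """Decide how to fit one translated clip into its original time slot.

    seg_start / seg_end: segment times in seconds (e.g. from Whisper segments).
    clip_duration: length in seconds of the synthesized translated clip.
    Returns (tempo_factor, trailing_silence_seconds).
    """
    slot = seg_end - seg_start
    if slot <= 0 or clip_duration <= 0:
        raise ValueError("segment and clip must have positive duration")

    if clip_duration <= slot:
        # Clip is short enough: play at normal speed, pad the rest with silence.
        return 1.0, slot - clip_duration

    # Clip is too long: speed it up, but cap the factor (assumed limit)
    # so the speech does not become unintelligible.
    tempo = min(clip_duration / slot, max_speedup)
    return tempo, 0.0
```

If the cap is hit, the clip still overruns its slot, so a real tool would need a policy for that case, such as borrowing time from the following gap.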

Secondly, "Add the ability to customize speakers when batch processing" is already on the to-do list. Would adding simple command prompts inside the default input text window (not batch process) be possible? Like giving speaker or advanced setting prompts before lines:

{Adam, temp:0.75} How are you?
{Daniel, temp:0.5} Fine.

That would be a kind of live batch process without creating separate text files, and a wonderful QoL upgrade. Yes, we can already do it by manually splitting every paragraph into separate text files, but it would be much easier to just add {speaker} before the relevant parts.
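A minimal sketch of how such tags might be parsed, just to illustrate the idea. Everything here is an assumption: `parse_tagged_lines`, the `Line` record, and the rule that a tag stays in effect until the next tag are hypothetical, and setting values are assumed to be numeric.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

# Matches a leading tag such as "{Adam, temp:0.75}" at the start of a line.
TAG_RE = re.compile(r"^\s*\{([^{}]*)\}\s*(.*)$")


@dataclass
class Line:
    speaker: Optional[str]
    settings: dict = field(default_factory=dict)
    text: str = ""


def parse_tagged_lines(script, default_speaker=None):
    """Split input text into lines, each with a speaker and settings.

    Assumed rule: a tag applies to its own line and to following
    untagged lines; a new tag resets the settings.
    """
    result = []
    speaker = default_speaker
    settings = {}
    for raw in script.splitlines():
        if not raw.strip():
            continue  # skip blank lines
        text = raw.strip()
        match = TAG_RE.match(raw)
        if match:
            text = match.group(2).strip()
            settings = {}
            for part in match.group(1).split(","):
                part = part.strip()
                if not part:
                    continue
                if ":" in part:
                    # "key:value" entry, e.g. "temp:0.75" (numeric values assumed)
                    key, value = part.split(":", 1)
                    settings[key.strip()] = float(value)
                else:
                    # A bare word is taken as the speaker name.
                    speaker = part
        result.append(Line(speaker, dict(settings), text))
    return result
```

With the two example lines above, this would yield one entry for Adam with `{"temp": 0.75}` and one for Daniel with `{"temp": 0.5}`, each of which could then be passed to the synthesis step.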

It would also be great to have these:
- Ability to add silences with prompts like {0.5s}
- Ability to split the output with prompts in the input text window, like {split}
- A postprocess audio edit page to merge batch parts, with settings like silence generation
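As a rough illustration of the silence and split prompts, the sketch below turns input text into a plan of "speak" and "silence" steps, grouped per output file. The function `plan_outputs` and the step format are hypothetical, and the exact tag syntax is only assumed from the examples above.

```python
import re

# Either a pause like "{0.5s}" (seconds captured in group 1) or a literal "{split}".
CONTROL_RE = re.compile(r"\{(\d+(?:\.\d+)?)s\}|\{split\}")


def plan_outputs(text):
    """Return a list of outputs; each output is a list of steps.

    A step is ("speak", str) for text to synthesize or
    ("silence", float) for a pause in seconds.
    """
    outputs = [[]]
    pos = 0
    for match in CONTROL_RE.finditer(text):
        chunk = text[pos:match.start()].strip()
        if chunk:
            outputs[-1].append(("speak", chunk))
        if match.group(1) is not None:
            # Pause tag: insert silence into the current output.
            outputs[-1].append(("silence", float(match.group(1))))
        else:
            # {split}: start a new output file.
            outputs.append([])
        pos = match.end()
    tail = text[pos:].strip()
    if tail:
        outputs[-1].append(("speak", tail))
    # Drop empty outputs caused by leading, trailing, or repeated {split} tags.
    return [steps for steps in outputs if steps]
```

For example, `plan_outputs("Hello.{0.5s}How are you?{split}Fine.")` gives two outputs: the first speaks "Hello.", pauses 0.5 seconds, then speaks "How are you?"; the second speaks "Fine.".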

Thank you so much for your great work!

GalenMarek14, Feb 07 '24 02:02

Hi, yes I'm already working on the first point.

The second point is also interesting and something I already had in mind. Adding different speakers, at least, is quite feasible. For pauses, I have some work in progress that still needs to be tested.

If there is progress on any of these items, I will let you know :)

daswer123, Feb 07 '24 04:02