UltraSinger
Ghosting in text
It is not complete, but it is much better now.
If no text comes through, it has a funny quirk:
2420
: 2652 13 6 I
: 2668 19 24 don't
2687
: 2753 14 5 know
2767
: 2923 4 11 what
: 2928 2 12 to
: 2932 3 11 do
2935
Originally posted by @McMuffin88 in https://github.com/rakuri255/UltraSinger/issues/19#issuecomment-1567526717
I think this is about missing words from the speech recognition or pitch detection?
If so, I suggest using configurable fallback values, e.g. an underscore for the text, a pitch of 0, and a note length of 1.
Whatever fallback values are used, the output should be a valid note line of the form:
: startbeat length pitch word
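A minimal sketch of that fallback idea, assuming the four-field note layout seen in the sample output above (start beat, length, pitch, word). The names `NoteFallbacks` and `format_note` are hypothetical and not part of UltraSinger; the defaults are the ones suggested in this comment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NoteFallbacks:
    """Hypothetical configurable fallbacks (defaults as suggested above)."""
    word: str = "_"
    pitch: int = 0
    length: int = 1


def format_note(
    start_beat: int,
    length: Optional[int],
    pitch: Optional[int],
    word: Optional[str],
    fallbacks: NoteFallbacks = NoteFallbacks(),
) -> str:
    """Return one note line, substituting fallbacks for missing values."""
    # A missing or non-positive length would not be a usable note.
    if length is None or length < 1:
        length = fallbacks.length
    # Pitch detection may return nothing for a segment.
    if pitch is None:
        pitch = fallbacks.pitch
    # Speech recognition may return nothing or only whitespace.
    if word is None or not word.strip():
        word = fallbacks.word
    return f": {start_beat} {length} {pitch} {word}"


# A complete note passes through unchanged; an empty one is still valid.
print(format_note(2652, 13, 6, "I"))
print(format_note(2687, None, None, None))
```

With this approach every detected segment yields a syntactically valid line, so a missing word or pitch shows up as a visible placeholder in the output rather than as a stray number.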
@achimmihca yes that would be a good idea.
I see 2 things here:
- When the audio has some noise, Whisper hallucinates and adds random words.
- Whisper sometimes adds YouTube subtitles. It is only noticeable in places where there is additional information in the subtitle and no voice.
> Whisper sometimes adds YouTube subtitles
That is unexpected, but it may actually produce better results than speech recognition alone.