Ghosting in text

Open rakuri255 opened this issue 1 year ago • 3 comments

Not complete, but it is really better now.

If no text comes, it has a funny feature inside:

2420
: 2652 13 6 I
: 2668 19 24 don't
2687
: 2753 14 5 know
2767
: 2923 4 11 what
: 2928 2 12 to
: 2932 3 11 do
2935

Originally posted by @McMuffin88 in https://github.com/rakuri255/UltraSinger/issues/19#issuecomment-1567526717

rakuri255 May 29 '23 21:05

I think this is about missing words from the speech recognition or pitch detection?

If yes, then I suggest using configurable fallback values, e.g. an underscore for the text, a pitch of 0, and a note length of 1.

Whatever fallback values are used, the output should be a valid note line of the form : startbeat length pitch word
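A minimal sketch of that idea, assuming the note layout : startbeat length pitch word seen in the sample lines quoted at the top of this issue. The Note class, the format_note helper, and the constant names are hypothetical and not taken from the UltraSinger code base; the fallback values are the ones suggested above.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical, configurable fallback values (the ones suggested in this comment).
FALLBACK_WORD = "_"
FALLBACK_PITCH = 0
FALLBACK_LENGTH = 1


@dataclass
class Note:
    """One detected note; any field except the start beat may be missing."""
    start_beat: int
    length: Optional[int] = None
    pitch: Optional[int] = None
    word: Optional[str] = None


def format_note(note: Note) -> str:
    """Build a ': startbeat length pitch word' line, filling gaps with fallbacks."""
    # A missing or non-positive length falls back to the minimum length.
    length = note.length if note.length and note.length > 0 else FALLBACK_LENGTH
    # A pitch of 0 is valid, so only a missing pitch triggers the fallback.
    pitch = note.pitch if note.pitch is not None else FALLBACK_PITCH
    # An empty or whitespace-only word is replaced by the placeholder.
    word = note.word if note.word and note.word.strip() else FALLBACK_WORD
    return f": {note.start_beat} {length} {pitch} {word}"


print(format_note(Note(2652, 13, 6, "I")))  # complete note, printed unchanged
print(format_note(Note(2420)))              # only the start beat is known
```

With these assumptions, a line that previously came out as a bare 2420 would instead be written as : 2420 1 0 _, which keeps the file parseable. Whether such placeholder notes should be written at all or simply dropped is a separate design decision.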

achimmihca Jun 19 '23 07:06

@achimmihca yes, that would be a good idea.

I see 2 things here:

  1. When the audio has some noise, Whisper hallucinates and adds random words.
  2. Whisper sometimes adds YouTube subtitles. This is only noticeable in places where the subtitle contains additional information and there is no voice.

rakuri255 Jun 19 '23 07:06

Whisper sometimes adds YouTube subtitles

That is unexpected, but it may actually produce better results than speech recognition alone.

achimmihca Jun 19 '23 08:06