aeneas

[Feature Request] Influence on accuracy

Open ErfolgreichCharismatisch opened this issue 3 years ago • 4 comments

I used the web app for aligning.

I found that 54% of the phrases in my test set were misaligned.

By misaligned I mean:

  1. Cut too early from the end
  2. Cut too late from the previous fragment
  3. Shifted altogether

I used voice recognition to check whether the aligned parts matched (aeneas -> cue list -> cut audio -> have voice recognition detect speech -> compare to the entries in the cue list with a similarity algorithm).
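The last step of that check (comparing the recognized text to the cue list entries) can be sketched roughly as follows. This is only an illustration, not the script actually used: `similarity`, `flag_misaligned`, and the `0.8` threshold are hypothetical, and `difflib.SequenceMatcher` from the Python standard library stands in for whatever similarity algorithm was really applied.

```python
from difflib import SequenceMatcher


def similarity(expected: str, recognized: str) -> float:
    """Return a 0..1 similarity ratio after lowercasing and collapsing whitespace."""
    def norm(s: str) -> str:
        return " ".join(s.lower().split())
    return SequenceMatcher(None, norm(expected), norm(recognized)).ratio()


def flag_misaligned(cues, transcripts, threshold=0.8):
    """Return indices of fragments whose transcript scores below the threshold.

    `threshold` is an assumed cut-off; it would need tuning per data set.
    """
    return [
        i
        for i, (cue, heard) in enumerate(zip(cues, transcripts))
        if similarity(cue, heard) < threshold
    ]


# Hypothetical data: cue list text vs. what speech recognition heard in each cut.
cues = ["the quick brown fox", "jumps over the lazy dog"]
transcripts = ["the quick brown fox", "over the lazy"]
print(flag_misaligned(cues, transcripts))
```

The per-fragment ratio could also serve as a crude confidence value for deciding which fragments to review or discard.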

Having more than half of the phrases misaligned is discouraging.

So far I have not seen many options to influence recognition. The CLI parameters seem rather cosmetic in their effect, apart from language selection and input text type.

I would like to improve alignment and to have a confidence indicator, so that I can quickly review or discard results.

The pipeline already contains everything required for a confidence parameter. Other parameters for deeper control would also be important.

What I miss most is a threshold parameter in decibels to distinguish pauses from audio; this would eliminate premature cuts for good.
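To make the request concrete, here is a minimal sketch of what such a decibel threshold could look like. It is not aeneas code and does not use any aeneas API; `frame_dbfs`, `find_pauses`, and all default values (`-40` dBFS, 20 ms frames, 200 ms minimum pause) are assumptions chosen for illustration. Samples are assumed to be floats in the range -1 to 1.

```python
import math


def frame_dbfs(samples):
    """RMS level of one frame of float samples in [-1, 1], in dBFS."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms) if rms > 0 else float("-inf")


def find_pauses(samples, sample_rate, threshold_db=-40.0,
                frame_ms=20, min_pause_ms=200):
    """Return (start_s, end_s) spans whose frame level stays below threshold_db."""
    frame_len = max(1, sample_rate * frame_ms // 1000)
    min_frames = max(1, min_pause_ms // frame_ms)
    n_frames = len(samples) // frame_len
    pauses, run_start = [], None
    # Iterate one step past the last frame so a trailing quiet run is closed.
    for i in range(n_frames + 1):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        quiet = i < n_frames and frame_dbfs(frame) < threshold_db
        if quiet and run_start is None:
            run_start = i
        elif not quiet and run_start is not None:
            if i - run_start >= min_frames:
                pauses.append((run_start * frame_len / sample_rate,
                               i * frame_len / sample_rate))
            run_start = None
    return pauses


# Synthetic signal at 1 kHz: 0.5 s loud, 0.3 s silent, 0.2 s loud.
signal = [0.5, -0.5] * 250 + [0.0] * 300 + [0.5, -0.5] * 100
print(find_pauses(signal, sample_rate=1000))
```

With a threshold like this, a fragment boundary would only be allowed to fall inside a detected pause, which is the behaviour I am asking for.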

Did you find any good solution to the misalignments?

versae, Feb 16 '23 16:02

Yes, I did: I abandoned aeneas. Not what you wanted to hear, but that's it.

I see. Are you using any other solution that provides satisfactory results?

versae, Feb 16 '23 22:02

What other library do you use now?

ErrorBot1122, Feb 24 '23 14:02