aeneas [Feature Request] Influence on accuracy

Open ErfolgreichCharismatisch opened this issue 3 years ago • 4 comments

I used the web app for aligning.

I found that 54 % of phrases in my test set were misaligned.

By misaligned I mean one of the following:

- Cut too early at the end
- Cut too late from the previous fragment
- Shifted altogether

I used speech recognition to check whether the aligned parts matched: aeneas -> cue list -> cut audio -> run speech recognition on each clip -> compare the result to the cue list entries with a similarity algorithm.
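To make the comparison step concrete, here is a minimal sketch of how such a check could look. It assumes the recognized text for each clip is already available as a string. The helper names `normalize`, `similarity` and `flag_misaligned` and the 0.8 threshold are hypothetical and not part of aeneas; the similarity measure is Python's standard `difflib.SequenceMatcher`, which may differ from the algorithm used in my test.

```python
from difflib import SequenceMatcher


def normalize(text):
    # Lowercase, replace punctuation with spaces, collapse repeated whitespace.
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return " ".join(cleaned.split())


def similarity(expected, recognized):
    # Ratio in [0.0, 1.0]; 1.0 means the normalized strings are identical.
    return SequenceMatcher(None, normalize(expected), normalize(recognized)).ratio()


def flag_misaligned(pairs, threshold=0.8):
    # pairs: list of (cue_text, recognized_text) tuples, one per fragment.
    # Returns the indices of fragments whose similarity is below the threshold.
    # The default of 0.8 is an arbitrary, hypothetical cut-off.
    return [
        i for i, (expected, recognized) in enumerate(pairs)
        if similarity(expected, recognized) < threshold
    ]
```

The share of flagged fragments gives a misalignment rate for the test set, and the per-fragment ratio could double as a rough confidence value for review.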

Having more than half of the phrases misaligned is discouraging.

So far I have not found many options to influence recognition. Apart from language selection and input text type, the CLI parameters seem to have only a cosmetic effect.

I would like to improve alignment and to get a confidence value for each fragment, so that results can be quickly reviewed or discarded.

The pipeline already contains everything required for a confidence parameter. Other parameters for finer control are also important.

What I miss most is a threshold parameter in decibels to distinguish pauses from audio. This would eliminate premature cuts for good.
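As an illustration of what I mean by a decibel threshold, here is a rough sketch of energy-based pause detection. It is not how aeneas works internally, only an assumption of how such a parameter could behave. The names `frame_dbfs`, `find_pauses`, `threshold_db`, `frame_ms` and `min_pause_ms` are hypothetical, and the samples are assumed to be mono floats in the range -1.0 to 1.0.

```python
import math


def frame_dbfs(samples):
    # RMS level of one frame in dBFS (0 dBFS = full scale); -inf for digital silence.
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(rms) if rms > 0 else float("-inf")


def find_pauses(samples, sample_rate, threshold_db=-40.0, frame_ms=20, min_pause_ms=200):
    # Returns (start_sec, end_sec) spans whose level stays below threshold_db
    # for at least min_pause_ms. All default values are illustrative only.
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    min_frames = max(1, math.ceil(min_pause_ms / frame_ms))
    n_frames = math.ceil(len(samples) / frame_len)
    pauses = []
    run_start = None
    # One extra iteration (i == n_frames) closes a quiet run that reaches the end.
    for i in range(n_frames + 1):
        quiet = i < n_frames and (
            frame_dbfs(samples[i * frame_len:(i + 1) * frame_len]) < threshold_db
        )
        if quiet and run_start is None:
            run_start = i
        elif not quiet and run_start is not None:
            if i - run_start >= min_frames:
                end_sample = min(i * frame_len, len(samples))
                pauses.append((run_start * frame_len / sample_rate, end_sample / sample_rate))
            run_start = None
    return pauses
```

With such a threshold exposed, a fragment boundary could be restricted to fall inside a detected pause, which is what would prevent cuts in the middle of speech.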