aeneas
[Feature Request] Influence on accuracy
I used the web app for aligning.
I found that 54 % of phrases in my test set were misaligned.
By misaligned I mean:
- Cut too early from the end
- Cut too late from the previous fragment
- Shifted altogether
I used voice recognition to check whether the aligned parts matched (aeneas -> cue list -> cut audio -> have voice recognition detect speech -> compare the result to the entries in the cue list with a similarity algorithm).
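The last step of that check, comparing recognized text against the cue list, can be sketched roughly as follows. This is a minimal illustration using Python's standard `difflib`; the function names, the normalization, and the 0.8 cutoff are assumptions made for the example, not the exact setup behind the 54 % figure.

```python
import re
from difflib import SequenceMatcher


def normalize(text):
    # Lowercase, drop punctuation, collapse whitespace so that
    # formatting differences do not count as mismatches.
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())


def similarity(expected, recognized):
    # Ratio in [0, 1]; 1.0 means the normalized strings are identical.
    return SequenceMatcher(None, normalize(expected), normalize(recognized)).ratio()


def flag_misaligned(cues, recognized, cutoff=0.8):
    # cues: expected text per fragment (from the cue list)
    # recognized: speech recognition output per cut audio fragment
    # cutoff: hypothetical threshold, tune it on your own data
    flagged = []
    for index, (expected, heard) in enumerate(zip(cues, recognized)):
        score = similarity(expected, heard)
        if score < cutoff:
            flagged.append((index, round(score, 2)))
    return flagged
```

The per-fragment score doubles as a crude confidence value: fragments below the cutoff are the ones to review or discard.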
Now, more than half misaligned is discouraging.
So far I have not seen many options to influence recognition; the CLI parameters seem rather cosmetic in their effect, apart from language selection and input text type.
I would like to improve alignment and to get a confidence value for each fragment, so that I can quickly review or discard it.
The pipeline already contains everything required for a confidence parameter. Other parameters for deeper control are important as well.
What I am sorely missing is a threshold parameter in decibels to separate pauses from audio. This would eliminate premature cuts for good.
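To illustrate what such a decibel threshold would do, here is a minimal sketch of energy-based pause detection. It is not an aeneas option or part of its API; the function names, the frame length, the -40 dBFS default, and the `min_frames` parameter are all assumptions made for the example. Samples are assumed to be floats in the range -1.0 to 1.0.

```python
import math


def frame_dbfs(samples, frame_len):
    # RMS level of each full frame, in dB relative to full scale (1.0).
    levels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(s * s for s in frame) / frame_len)
        levels.append(20 * math.log10(rms) if rms > 0 else float("-inf"))
    return levels


def find_pauses(samples, frame_len, threshold_db=-40.0, min_frames=2):
    # Return (start_sample, end_sample) spans in which at least
    # min_frames consecutive frames stay below threshold_db.
    levels = frame_dbfs(samples, frame_len)
    pauses = []
    run_start = None
    for i, level in enumerate(levels):
        if level < threshold_db:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_frames:
                pauses.append((run_start * frame_len, i * frame_len))
            run_start = None
    # Close a pause that runs until the end of the audio.
    if run_start is not None and len(levels) - run_start >= min_frames:
        pauses.append((run_start * frame_len, len(levels) * frame_len))
    return pauses
```

A fragment boundary that falls outside every detected pause could then be moved to the nearest pause, which is the behaviour that would prevent cuts in the middle of speech.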
Did you find any good solution to the misalignments?
Yes I did. I abandoned aeneas. Not what you wanted to hear, but that's it.
I see. Are you using any other solution that provides satisfactory results?
What other library do you use now?