g2p-seq2seq
Long words - right bucket size/parameters?
Hi folks,
I trained a model for German and now I'm struggling with the predicted output for longer words (e.g. 39 letters ≙ 34 phones, yeah German...). For those words the last phones are repeated over and over again in the prediction.
So for training I set max_length=50. The results got better, but there are still some phone repetitions.
How do the other two bucket parameters influence the predicted transcriptions?
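For context, here is roughly how I understand (encoder, decoder) buckets to work in a TF-style seq2seq setup; the bucket sizes and the `assign_bucket` helper below are just my own illustration, not the actual g2p-seq2seq code:

```python
# Illustrative sketch of seq2seq bucketing (not the library's real code).
# Each bucket is (max_graphemes, max_phonemes); a training pair goes into the
# smallest bucket that fits it and is padded up to that size.
BUCKETS = [(10, 12), (20, 22), (50, 52)]  # last bucket must cover the longest words

def assign_bucket(graphemes, phonemes, buckets=BUCKETS):
    """Return the index of the smallest bucket that fits the pair, or None."""
    for i, (src_max, tgt_max) in enumerate(buckets):
        if len(graphemes) < src_max and len(phonemes) < tgt_max:
            return i
    return None  # longer than the largest bucket: dropped or cut off in training

word = list("rindfleischetikettierungsueberwachung")  # a long German compound, 37 graphemes
phones = ["R"] * 34                                    # placeholder: 34 phones
print(assign_bucket(word, phones))                     # -> 2 (only the last bucket fits)
```

If the largest bucket is smaller than the longest word/pronunciation pair, those pairs never get trained (or get truncated), which is when I see the decoder start looping on the last phones.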
Thanks a lot!
You'd better try a modern transformer architecture instead of this seq2seq model.
Alright, which ones would you suggest?
Maybe https://github.com/hajix/G2P
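For a rough idea of what a transformer-based G2P model looks like in plain PyTorch (the `TransformerG2P` class, vocabulary sizes, and hyperparameters below are made up for illustration and are not taken from that repo):

```python
import torch
import torch.nn as nn

class TransformerG2P(nn.Module):
    """Minimal grapheme-to-phoneme transformer sketch (positional encodings omitted)."""
    def __init__(self, n_graphemes=60, n_phonemes=60, d_model=256):
        super().__init__()
        self.src_emb = nn.Embedding(n_graphemes, d_model)
        self.tgt_emb = nn.Embedding(n_phonemes, d_model)
        self.transformer = nn.Transformer(d_model=d_model, nhead=4,
                                          num_encoder_layers=3,
                                          num_decoder_layers=3,
                                          batch_first=True)
        self.out = nn.Linear(d_model, n_phonemes)

    def forward(self, src_ids, tgt_ids):
        # Causal mask so the decoder cannot peek at future phonemes.
        tgt_mask = self.transformer.generate_square_subsequent_mask(tgt_ids.size(1))
        h = self.transformer(self.src_emb(src_ids), self.tgt_emb(tgt_ids),
                             tgt_mask=tgt_mask)
        return self.out(h)

# A long German word is just a longer sequence here; no buckets are needed.
model = TransformerG2P()
src = torch.randint(0, 60, (1, 39))   # 39 grapheme ids
tgt = torch.randint(0, 60, (1, 34))   # 34 phoneme ids (teacher-forcing input)
print(model(src, tgt).shape)          # -> torch.Size([1, 34, 60])
```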