vall-e: Hello. A question about training. Is forced alignment of phonemes to audio necessary before audio encoding?

Open constan1 opened this issue 1 year ago • 0 comments

Or does the LM handle alignment during the self-attention process? I read in the VALL-E paper that they use forced-alignment tools, but I don't see anything like that in the code.

Jun 22 '23 18:06 constan1
