DeepLearningExamples

FastPitch - attention not conditioned on speaker embedding

Open • Alexey322 opened this issue 1 year ago • 0 comments

Hello. Why doesn't the attention use speaker embeddings when finding the alignment between text and mel spectrograms? The alignment can vary greatly between speakers with different speaking styles.

Alexey322 • May 09 '23 18:05