xliu99
SoundStorm is a single model that models each codebook hierarchically. It is not two models, where the first models only the first codebook and the second models the...
The g2p vocab.json file has 363 entries, but the phone_emb table in the T2S model has 1024 embeddings. Why does the table have so many more entries than 363?...
Hi, thank you so much for the great open-source work. I found that in the second-stage S2A model, classifier-free guidance is applied only on the acoustic...
Thanks for open-sourcing this and providing the training log. How is the loss scale computed?
Hi, thank you so much for such amazing work. In the MNIST example, the integration interval is from 0 to 1. I understand that the initial condition is h(0),...