Shreeram Chandra
As an ablation study, I am currently training S1 from scratch as described in the paper. The paper states that the authors use a decoder-only architecture and a 12-layer...
Can you please elaborate on the role of speaker embeddings in the hidden unit tokenizer and what effect they have?
I am currently training the hidden unit tokenizer to predict speech units from text token ids. Although the accuracy of the model continuously increases, I am unable to judge whether...
The link to the training data file seems to be broken: https://drive.google.com/file/d/1rxlikMglL2kEsF4NfqekZRoA02klY7CE/view?usp=sharing
Thank you for putting up this code. I am interested in the txt2vec model (that you said works well in the other issue). Is the training stable? How long does...