VoiceCraft
about silence tokens during inference
i see that the default values for silence_tokens during inference are [1388,1898,131]. my questions:
- why is there more than one silence token?
- how do silence_tokens differ from the <SIL> phoneme in vocab.txt?
- how can i find the silence tokens when training on my own dataset?
@thivux Have you figured it out? At the very least, how can one find the silence tokens?