UniAudio
What is the difference between AudioTokenizer and EncodecTokenizer?
I found two tokenizer models for audio: AudioTokenizer and EncodecTokenizer. In egs, the tts, vc, and se recipes all use the tokenizer "audio". I guess these models are all based on SoundStream. What is the difference between them?