josephwong14wkh
Hi, I have encountered this error as well. Have you managed to fix it?
I have tested the speed of language detection. It seems that faster-whisper is slower than the original OpenAI Whisper at language ID. Here is my setup: **model size:** medium **compute type:**...
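For reference, a minimal sketch of the kind of timing comparison I mean (the audio clip and the `float16` compute type here are placeholders, not the exact values from my run):

```python
import time

import whisper                      # openai-whisper
from faster_whisper import WhisperModel

AUDIO_PATH = "sample.wav"           # placeholder audio clip

# --- faster-whisper ------------------------------------------------------
fw_model = WhisperModel("medium", compute_type="float16")  # compute type is a placeholder
t0 = time.perf_counter()
_, info = fw_model.transcribe(AUDIO_PATH)   # language ID runs eagerly; segments decode lazily
print(f"faster-whisper: {info.language} ({info.language_probability:.2f}) "
      f"in {time.perf_counter() - t0:.2f}s")

# --- original OpenAI Whisper ---------------------------------------------
ow_model = whisper.load_model("medium")
audio = whisper.pad_or_trim(whisper.load_audio(AUDIO_PATH))
mel = whisper.log_mel_spectrogram(audio).to(ow_model.device)
t0 = time.perf_counter()
_, probs = ow_model.detect_language(mel)
lang = max(probs, key=probs.get)
print(f"openai-whisper: {lang} ({probs[lang]:.2f}) in {time.perf_counter() - t0:.2f}s")
```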
Yes, we need to train the content tokenizer ourselves in order to use audio as input to the AR model.
@RMSnow Thank you for your detailed explanation. Also, may I know why you set `model.coco.codebook_dim = 8` in `contentstyle_fvq16384_12.5hz.json`? It seems quite small. As far as I know, it is the dimension...
Thanks for the recommendations. I'll definitely check them out to dive deeper! I have another question about the training loss. I am training the tokenizer from scratch with my own...
Got it! Thank you very much!
In my case, just setting "**emilia**" in "**dataset**" to 0 is enough, with "**use_emilia_dataset**" set to true. When "emilia" in "dataset" is set to 0, it will only load...
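For reference, a rough sketch of that change, assuming the exp config keeps per-dataset weights under `dataset` and a top-level `use_emilia_dataset` flag (the path and exact layout are illustrative and may differ in your config):

```python
import json

CONFIG_PATH = "exp_config.json"       # placeholder path to the experiment config

with open(CONFIG_PATH) as f:
    cfg = json.load(f)

cfg["dataset"]["emilia"] = 0          # weight 0: emilia samples are not drawn
cfg["use_emilia_dataset"] = True      # but the emilia loading pipeline stays enabled

with open(CONFIG_PATH, "w") as f:
    json.dump(cfg, f, indent=2)
```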
For my case, I did some more preprocessing steps on the data in the Emilia dataset. So I got the numpy array from Hugging Face, ran the preprocessing steps, and saved the...
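Roughly, the flow looked like the sketch below; the dataset repo name, the column names, the `preprocess` function, and the output layout are illustrative placeholders rather than my exact code:

```python
import os

import numpy as np
from datasets import load_dataset

OUT_DIR = "emilia_processed"          # placeholder output directory
os.makedirs(OUT_DIR, exist_ok=True)

# Repo name assumed; the actual schema of the Hugging Face release may differ.
ds = load_dataset("amphion/Emilia-Dataset", split="train", streaming=True)

def preprocess(wav: np.ndarray, sr: int) -> np.ndarray:
    # Placeholder for the extra preprocessing steps (e.g. resampling / trimming).
    return wav

for i, item in enumerate(ds):
    wav = item["audio"]["array"]          # column names assumed (standard Audio feature)
    sr = item["audio"]["sampling_rate"]
    np.save(os.path.join(OUT_DIR, f"{i:08d}.npy"), preprocess(wav, sr))
```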