josephwong14wkh

Results 8 comments of josephwong14wkh

Hi, I have also encountered this error. Have you fixed it?

I have tested the speed of language detection. It seems that faster-whisper is slower than original OpenAI whisper on language ID. Here is my setting: **model size:** medium **compute type:**...

Yes, we need to train the content tokenizer ourselves in order to use audio as input to the AR model.

@RMSnow Thank you for your detailed explanation. Also, may I know why you set `model.coco.codebook_dim = 8` in `contentstyle_fvq16384_12.5hz.json`? That seems very small. As far as I know, it is the dimension...

Thanks for the recommendations. I'll definitely check them out to dive deeper! I have another question about the training loss. I am training the tokenizer from scratch with my own...

In my case, just setting "**emilia**" in "**dataset**" to 0 is enough. "**use_emilia_dataset**" is set to true. When "emilia" in "dataset" is set to 0, it will only load...

In my case, I ran some additional preprocessing steps on the data in the Emilia dataset. So I got the numpy array from Hugging Face, ran the preprocessing steps, and saved the...