VoxCPM

Request: Addition of Hindi language

Open aghammadan opened this issue 5 months ago • 3 comments

I want to be able to clone an English voice and generate Hindi audio.

PLS DEVS

aghammadan avatar Sep 27 '25 15:09 aghammadan

in progress :D

a710128 avatar Sep 28 '25 02:09 a710128

Any updates here?

tensorjackal avatar Dec 30 '25 09:12 tensorjackal

Q1 2026, if everything goes as planned.

a710128 avatar Jan 13 '26 06:01 a710128

@a710128 If I want to do LoRA fine-tuning for Hindi, would you recommend transliterating the text, or just normalizing it and moving forward with dataset preprocessing? Also, would you recommend full fine-tuning? I read that @Ayin1412 was able to do a simple LoRA fine-tuning for Japanese. Any help would be greatly appreciated.

darknight054 avatar Jan 19 '26 08:01 darknight054

Hello, here are some suggestions :)

  1. With limited data: Start with LoRA fine-tuning and consider transliterating the text to better leverage the existing vocabulary, since VoxCPM was not pretrained on Hindi characters.
  2. With sufficient data (e.g., hundreds of hours): You may directly use Hindi script and perform full fine-tuning, or at least unfreeze the text embedding layer if using LoRA.
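
As an illustration of the transliteration option in point 1, here is a minimal sketch that maps Devanagari text to Latin characters before training. It is not part of VoxCPM: the `transliterate` function and its mapping tables are hypothetical, the tables cover only a handful of letters, and the romanization scheme is an arbitrary choice made for this example. A real pipeline would need the full character set and would probably use an established transliteration library.

```python
# Hypothetical, deliberately incomplete Devanagari -> Latin mapping.
# The romanization scheme here is an assumption made for illustration only.
VIRAMA = "\u094d"  # suppresses the inherent "a" of the preceding consonant

CONSONANTS = {
    "क": "k", "ख": "kh", "ग": "g", "त": "t", "द": "d", "न": "n", "प": "p",
    "म": "m", "य": "y", "र": "r", "ल": "l", "व": "v", "स": "s", "ह": "h",
}
INDEPENDENT_VOWELS = {
    "अ": "a", "आ": "aa", "इ": "i", "ई": "ii", "उ": "u", "ऊ": "uu",
    "ए": "e", "ओ": "o",
}
VOWEL_SIGNS = {  # dependent vowel signs (matras) that follow a consonant
    "ा": "aa", "ि": "i", "ी": "ii", "ु": "u", "ू": "uu", "े": "e", "ो": "o",
}


def transliterate(text: str) -> str:
    """Romanize the subset of Devanagari covered by the tables above."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            nxt = text[i + 1] if i + 1 < len(text) else ""
            if nxt == VIRAMA:
                i += 2  # consonant cluster: no vowel is emitted
                continue
            if nxt in VOWEL_SIGNS:
                out.append(VOWEL_SIGNS[nxt])
                i += 2  # the vowel sign replaces the inherent "a"
                continue
            out.append("a")  # inherent vowel
        elif ch in INDEPENDENT_VOWELS:
            out.append(INDEPENDENT_VOWELS[ch])
        else:
            out.append(ch)  # unmapped characters pass through unchanged
        i += 1
    return "".join(out)


print(transliterate("नमस्ते"))
```

Note that this sketch always writes the inherent vowel, so it does not model the schwa deletion of spoken Hindi (for example, it writes a trailing "a" at the end of "राम"). Whether that matters for the fine-tuned model is something to check by listening.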

Labmem-Zhouyx avatar Jan 20 '26 07:01 Labmem-Zhouyx

@Labmem-Zhouyx Thanks a lot. Will try and let you know. I have a good amount of data, but using all of it would be expensive without first trying the transliterated-text pipeline. Will report back.

darknight054 avatar Jan 20 '26 07:01 darknight054

@Labmem-Zhouyx I tried with roughly 32 hours of Hindi speech data on a single 12 GB GPU, unfroze the text embedding layer, and got pretty promising results. After just 4000 steps it speaks really nicely. Only some really tough letters, or letters that are underrepresented in the training data, are still hard for it to pronounce, but it's pretty impressive. Thanks a lot for the guidance.
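
One way to find the underrepresented letters mentioned above is to count how often each character appears in the training transcripts. The sketch below is an assumption about how such a check could look, not something taken from the VoxCPM tooling: `rare_characters` and the `min_count` threshold are hypothetical names and values chosen for this example.

```python
import unicodedata
from collections import Counter


def rare_characters(transcripts, min_count=50):
    """Return (character, count) pairs seen fewer than min_count times.

    `transcripts` is any iterable of transcript strings. The default
    threshold is arbitrary and should be tuned to the dataset size.
    """
    counts = Counter()
    for line in transcripts:
        # Normalize so equivalent Unicode sequences are counted together.
        for ch in unicodedata.normalize("NFC", line):
            if not ch.isspace():
                counts[ch] += 1
    rare = [(ch, n) for ch, n in counts.items() if n < min_count]
    # Rarest first; ties are ordered by character for stable output.
    return sorted(rare, key=lambda item: (item[1], item[0]))


# Tiny usage example with a low threshold.
print(rare_characters(["aaa b", "a c c"], min_count=2))
```

Characters that show up in this list are candidates for adding more recordings or for closer listening tests after fine-tuning.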

darknight054 avatar Jan 20 '26 19:01 darknight054