shekharmeena2896

Results 4 comments of shekharmeena2896

I can configure OpenAI or ElevenLabs TTS to give me the audio in WAV or PCM format. It's real-time, and I have the choice of streaming the audio...
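For anyone who gets raw PCM back and needs WAV instead, here is a minimal sketch using only the Python standard library. The 24 kHz, mono, 16-bit defaults are assumptions for illustration, not something stated by either provider in this thread, so check what your TTS actually returns. `pcm_to_wav` and `iter_chunks` are hypothetical helper names.

```python
import io
import wave


def pcm_to_wav(pcm: bytes, sample_rate: int = 24000,
               channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap raw PCM samples in a WAV container.

    The defaults (24 kHz, mono, 16-bit) are assumed values; pass the
    real parameters of your audio stream.
    """
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(sample_width)   # bytes per sample
        wf.setframerate(sample_rate)
        wf.writeframes(pcm)             # header sizes are patched on close
    return buf.getvalue()


def iter_chunks(pcm: bytes, chunk_bytes: int = 4096):
    """Yield fixed-size slices of a PCM buffer, as a stand-in for streaming."""
    for start in range(0, len(pcm), chunk_bytes):
        yield pcm[start:start + chunk_bytes]


# Half a second of 16-bit mono silence at the assumed 24 kHz rate.
silence = b"\x00\x00" * 12000
wav_bytes = pcm_to_wav(silence)
```

Raw PCM has no header, so the receiver has to know the sample rate and width in advance. WAV carries that information itself, which is usually why it is easier to save or hand to other tools.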

ading images... 100%|██████████| 1/1 [00:00
Invalid number of channels in input image:
> 'VScn::contains(scn)'
> where
> 'scn' is 1
100%|██████████| 229/229 [00:31

I see many TTS systems today, like Orpheus, Dia TTS, and Sesame AI TTS, and maybe ElevenLabs. They all have a language model within the architecture that helps drive...

I would recommend not using VITS because it's a fairly old architecture. You should try Orpheus TTS: use the base Hindi model and fine-tune it on Punjabi data.