shekharmeena2896 comments

Results: 4 comments of shekharmeena2896

How can I make it work for incoming stream of audio

I can configure the OpenAI or ElevenLabs TTS to give me the audio in WAV or PCM format. It's real-time, and I have the choice of streaming the audio...

does not work on new video

ading images... 100%|██████████| 1/1 [00:00
Invalid number of channels in input image:
> 'VScn::contains(scn)'
> where
> 'scn' is 1
100%|██████████| 229/229 [00:31

Can I replace the espeak-ng phonemizer with neural g2p

I see many TTS systems today, like Orpheus, Dia TTS, Sesame AI TTS, and maybe ElevenLabs. They all have a language model in the middle of the architecture, which helps drive...

VITS fails to synthesize intelligible audio from Punjabi dataset using CISAMPA phoneme input

I would recommend not using VITS because it's a fairly old architecture. You should try Orpheus TTS: use the base Hindi model and fine-tune it on Punjabi data.