Mokshith Voodarla

Results: 14 comments by Mokshith Voodarla

This is during inference, not training. It's likely a system package issue; I'll keep investigating.

I have this question as well. Looking at the inference code, it is unclear how I could drop-in replace running Grad-TTS with my own source WAV file. Any tips would...

Yes, I would like the ability to submit a source WAV file, with or without a text transcription of it, and be able to replace a word or...

Yes! We've found a couple more improvements that we're still in the process of making, but after we do that and clean up the code, we will.

This was an awesome thread to read through. Any chance you could share some of this code as a PR or a fork @vishalbhavani?

We ended up making the model run 40% faster (when you consider all the pre-processing) and wrote about it [here](https://www.sievedata.com/blog/musetalk-real-time-high-quality-lip-sync-latent-space-inpainting). Would love to support cached pre-processing as well if that's...

Hey folks! Thanks for the notes here. We're still doing more active work around this model that we're [turning into a high quality pipeline](https://www.sievedata.com/functions/sieve/lipsync). More specifically, we're doing things like...

Join our Discord! Happy to share more active updates there. https://discord.com/invite/Pnh97rvRtD

We now support the new whisper-large-v3-turbo on Sieve!
Use it via `sieve/speech_transcriber`: https://www.sievedata.com/functions/sieve/speech_transcriber
Use `sieve/whisper` directly: https://www.sievedata.com/functions/sieve/whisper
Just set `speed_boost` to True. The API guide is under the "Usage Guide" tab.