Puyuan Peng comments

Results: 97 comments of Puyuan Peng

Added gradio app

> BTW, I had already used the new model this morning. Output is fairly similar. I have not tried with less batches. Any ideas on why sometimes the output is...

> Have make a Colab version of @zuev-stepan 's VoiceCraft fork. I think it should be as well part of the merge?
>
> https://github.com/Sewlell/VoiceCraft-gradio-colab

Thanks! I have tested @zuev-stepan...

AttributeError: module 'torch' has no attribute 'compiler' and other various issue

Check out the Docker quickstart, which should work on Windows: https://github.com/jasonppy/VoiceCraft?tab=readme-ov-file#quickstart
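For anyone debugging this outside Docker, a quick check like the one below can show whether the installed package exposes the attribute at all. This is only a minimal sketch: `module_has_attr` is a hypothetical helper, not part of VoiceCraft or PyTorch, and the assumption that a missing `torch.compiler` points to an older PyTorch install is a guess, not something confirmed in this thread.

```python
import importlib
import importlib.util


def module_has_attr(module_name: str, attr: str) -> bool:
    """Return True if a top-level module is importable and exposes `attr`.

    Hypothetical diagnostic helper; intended for top-level module names
    such as "torch", not dotted paths.
    """
    # find_spec returns None when the top-level module is not installed.
    if importlib.util.find_spec(module_name) is None:
        return False
    module = importlib.import_module(module_name)
    return hasattr(module, attr)
```

Calling `module_has_attr("torch", "compiler")` in the failing environment returns `False` either when torch is not installed or when the installed build lacks the attribute; printing `torch.__version__` afterwards helps tell the two cases apart.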

AttributeError: module 'torch' has no attribute 'compiler' and other various issue

Thanks for your efforts. I'm unable to test Windows-specific issues, but the Docker solution seems to work for some people. Thanks for the feedback on the audiocraft installation; I have...

Supported emotions

The model decides the emotion of its generation based on the emotion of the prompt and the content it will generate. It currently doesn't support hard-coded emotion tags.

RealEdit Dataset Release

I'll upload the dataset soon. If you want it earlier than that, send me an email.

RealEdit Dataset Release

The metadata, including the text, is up at [RealEdit.txt](https://github.com/jasonppy/VoiceCraft/blob/master/RealEdit.txt). The audio files are under different licenses: for LibriTTS I'll just upload them later; for GigaSpeech and Spotify I'll need...

Total duration of training dataset

About 20k hours

MFA not compatible with hugging face space?

Thanks! MFA is not strictly required; any forced alignment tool will do the job. Newer options include [NeMo](https://github.com/NVIDIA/NeMo/tree/main/tools/nemo_forced_aligner) and [Wav2vec2](https://pytorch.org/audio/stable/tutorials/forced_alignment_tutorial.html).

some Voice editing problem

Thanks! I'm not sure I understand your question. If you meant to ask how to reduce unnatural pauses in the generation, try reducing the `stop_repetition` parameter to 1 or 2,...