Puyuan Peng

Results 97 comments of Puyuan Peng

They are finetuned from the giga830M/giga330M models, which were trained with causal masking. The scripts have not been uploaded to the repo yet.

I developed the model under the environment listed in this GitHub repo. I think higher versions work most of the time. For example, this setup also works: https://huggingface.co/spaces/pyp1/VoiceCraft_gradio/blob/main/README.md

Thanks! Really helpful contribution! 1. Can't edit the last index of the input utterance: yes, in edit mode the model doesn't support that. However, editing a span that contains the...

I see. For this example, original: "But when I had approached so near", new: "But had I approached so near" (substitute when->had, delete the "had" in the original). The reason...

Regarding the issue of spans being too close: approach 1: set a threshold, say 2 words, and if the gap between two spans is less than or equal to 2...
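The comment above is cut off before it says what happens to spans that fall under the threshold. As a rough sketch only, assuming the intended action is to merge such spans into one (an assumption, not something the comment confirms), the idea could look like this. The function name `merge_close_spans`, the `max_gap` parameter, and the inclusive `(start, end)` word-index convention are all hypothetical and are not taken from the VoiceCraft code.

```python
def merge_close_spans(spans, max_gap=2):
    """Merge word-index spans separated by at most `max_gap` words.

    Assumptions (hypothetical, for illustration only):
    - each span is a (start, end) pair of word indices, end inclusive;
    - "gap" means the number of untouched words between two spans;
    - close spans are merged, which is a guess at the truncated comment.
    """
    merged = []
    for start, end in sorted(spans):
        if merged:
            prev_start, prev_end = merged[-1]
            gap = start - prev_end - 1  # words strictly between the spans
            if gap <= max_gap:
                # Close enough (or overlapping): extend the previous span.
                merged[-1] = (prev_start, max(prev_end, end))
                continue
        merged.append((start, end))
    return merged


if __name__ == "__main__":
    # Two words (indices 4 and 5) separate these spans, so they are merged.
    print(merge_close_spans([(2, 3), (6, 7)]))
    # Three words separate these spans, so they stay separate.
    print(merge_close_spans([(2, 3), (7, 8)]))
```

Merging trades a slightly longer regenerated region for not having two edit boundaries right next to each other; whether that is the right trade-off would need to be checked against the model's actual behavior.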

If you want to do large-scale testing, https://github.com/jasonppy/VoiceCraft/blob/master/RealEdit.txt contains 310 speech editing examples, and there are 40 two-span edit examples. To interpret the example: ``` ah, but we'll talk...

The only drawback is that the forced alignment might not be perfect, and a larger margin leaves room for such mistakes. A large margin also ensures modification of the...

Thanks! I'm not sure about this. Partial finetuning or LoRA sounds good to me, but I think one needs to actually run experiments to get an answer.

I'm really not sure, haha. Keep me updated.

What's your huggingface_hub version (0.22.2 works for me)?
```bash
python -c "import huggingface_hub; print(huggingface_hub.__version__)"
```